relax-ng-comment message

Subject: Re: [relax-ng-comment] Validation of a document during contruction

From: Rick Jelliffe <ricko@topologi.com>
To: relax-ng-comment@lists.oasis-open.org
Date: Tue, 23 Apr 2002 19:00:24 +1000

 From: "Peter Wilson" <PWILSON@GORGE.NET>
 
> The problem seems to be that validity is a global property of a Relax NG
> pattern. It is very hard to point to a single attribute or element and
> say: If  X then Y else Z. Perhaps a certain value for a required
> element, plus the presence of an optional attribute, but only when an
> ancestor of a particular element you can enter a Z element here.
> 
> Question to this group: How best to proceed?

In May, at XML Europe 2002, I am giving a talk, "When well-formedness is
too much and validity is not enough", which will deal with various aspects
of validation and schemas, especially with regard to editors.  (My company
is making one, so I have been thinking a lot about it.)

There are various kinds of validity, such as "subsequence
valid" (valid against a content model up to a point), "weakly valid"
(parents can be inserted to make the document valid: there is a patent
on a technique for this), "minimally valid" (all required elements are
present, and some optional elements are there), "order-valid"
(the children conform to some ordering derived from a content model),
"feasible" (siblings could be added which would then satisfy the content
model), "child-valid" (the element is allowed as a child, but order is
not considered), "occurrence-valid" (the number of occurrences of
some element is allowed, but position has not been checked),
"path-valid" (some XPath is satisfied), and so on.  (Not to mention
"lax validation", etc.)

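To make a few of these notions concrete, here is a minimal sketch in
Python. It assumes a deliberately simplified content model (a flat,
ordered sequence of particles with minimum and maximum counts), which is
far weaker than what RELAX NG can express; the names MODEL, child_valid,
occurrence_valid, order_valid and feasible are all hypothetical.

```python
from collections import Counter

# Hypothetical content model: an ordered sequence of (name, min, max)
# particles. max=None means "unbounded".
MODEL = [("title", 1, 1), ("para", 0, None), ("table", 0, 2)]

def child_valid(children, model):
    """Every child is allowed somewhere in the model; order is ignored."""
    allowed = {name for name, _, _ in model}
    return all(child in allowed for child in children)

def occurrence_valid(children, model):
    """Each particle occurs an allowed number of times; position is ignored."""
    counts = Counter(children)
    for name, lo, hi in model:
        n = counts[name]
        if n < lo or (hi is not None and n > hi):
            return False
    return True

def order_valid(children, model):
    """Known children appear in the order the model lists them."""
    position = {name: i for i, (name, _, _) in enumerate(model)}
    indices = [position[c] for c in children if c in position]
    return indices == sorted(indices)

def feasible(children, model):
    """Siblings could still be added to satisfy the model.

    For this simple sequence model that means: nothing foreign, nothing
    out of order, and no particle already over its maximum.
    """
    counts = Counter(children)
    within_max = all(hi is None or counts[name] <= hi for name, _, hi in model)
    return child_valid(children, model) and order_valid(children, model) and within_max

# A draft in which the required title has not been written yet.
draft = ["para", "table"]
```

For this draft, child_valid, order_valid and feasible all hold, while
occurrence_valid does not, because the required title is missing. An
editor could accept such a state as "feasible" while the user is still
typing, and only insist on full validity at save time.
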
A particular constraint expressed in a schema language can be
implemented partially or fully by expressing it as each kind of 
validity. In an editor, the trick is to figure out which kinds of
validity the user is best served by at each stage.  

So it all comes down to having a model of how people actually
edit. As far as I can see, most XML editors (there are scores)
have been written along these lines: "XML is a tree, I have tree
widgets, therefore I can make an editor".  The resulting editors
do not seem particularly useful to me.  A better approach
is to figure out *how* people edit, and supply tools for that.

For example, grammar validators often stop at the first error,
so using that kind of validator means accepting that the user
wishes to work through their document in document order.
Adam Smith would roll in his grave. The division of labour
(first I will finish the tables, then I will do divisions, then I will
do metadata) is how anyone rational deals with complex tasks.
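
The difference can be sketched as follows. This is only an illustration
under assumed names: the (path, content) pairs, the check function and
the helpers first_error, all_errors and errors_for_task are hypothetical
and do not correspond to any particular validator.

```python
def first_error(items, check):
    """Stop at the first problem, as many grammar validators do."""
    for path, item in items:
        problem = check(item)
        if problem:
            return [(path, problem)]
    return []

def all_errors(items, check):
    """Keep going and collect every problem in document order."""
    errors = []
    for path, item in items:
        problem = check(item)
        if problem:
            errors.append((path, problem))
    return errors

def errors_for_task(errors, prefix):
    """Show only the problems for the part the user is working on."""
    return [(path, problem) for path, problem in errors if path.startswith(prefix)]

# Toy document: (path, text content) pairs; empty text counts as an error.
items = [
    ("/doc/meta", ""),
    ("/doc/table[1]", ""),
    ("/doc/table[2]", "filled in"),
    ("/doc/div", ""),
]

def check(text):
    return None if text else "empty"

table_errors = errors_for_task(all_errors(items, check), "/doc/table")
```

With first_error the user is pushed to fix the metadata first, whereas
table_errors contains only the one unfinished table, which suits someone
who has decided to finish the tables before anything else.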

So one important aspect of adopting schema languages for editors
is knowing what kinds of error localization you can do. Namespaces
are pretty helpful with this, because they represent a nice boundary:
your xhtml: document may be invalid, but you should be able
to work through the mathml: fragments independently, for example.
Within particular schemas (without global inclusions/exclusions),
any element that has only a single (global or local) definition
can be validated as a branch, independently of the validity of its
parent and siblings.
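
As a rough sketch of the namespace boundary idea, the following Python
uses the standard xml.etree.ElementTree module to pick out the topmost
elements in one namespace and hand each to a checker on its own. The
helper names (namespace_of, islands, validate_islands) and the check
callable are hypothetical; a real editor would plug in a schema
validator for that namespace instead of the toy check shown.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def namespace_of(elem):
    """Return the namespace URI from an ElementTree '{uri}local' tag."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def islands(root, ns):
    """Yield, in document order, the topmost elements in namespace ns.

    Descendants of a yielded element are not visited: each island is
    treated as one branch to validate on its own.
    """
    stack = [root]
    while stack:
        elem = stack.pop()
        if namespace_of(elem) == ns:
            yield elem
        else:
            stack.extend(reversed(list(elem)))

def validate_islands(root, ns, check):
    """Run check (hypothetical: element -> list of messages) per island."""
    return [(elem.tag, check(elem)) for elem in islands(root, ns)]

# Toy check standing in for a real MathML validator.
def toy_check(elem):
    return [] if len(elem) > 0 else ["empty math element"]

doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:m="http://www.w3.org/1998/Math/MathML">'
    "<body><p>x<m:math><m:mi>x</m:mi></m:math></p><m:math/></body></html>"
)
root = ET.fromstring(doc)
results = validate_islands(root, MATHML_NS, toy_check)
```

Here the surrounding xhtml: document is never checked at all, so its
validity or invalidity has no effect on what is reported for the two
mathml: fragments.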

Cheers
Rick Jelliffe

References:
- [relax-ng-comment] Validation of a document during contruction
  - From: Peter Wilson <PWILSON@GORGE.NET>