OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

relax-ng-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [Elist Home]


Subject: Re: [relax-ng-comment] Full to simple syntax converter?



> I'm working on a RELAX NG validator for Python.  Writing the
> validation engine took only an afternoon, thanks to the excellent
> description of Jing's algorithm; writing a parser for the simple
> syntax also only took a few hours.  Implementing the full
> syntax... er, it's been a couple of weeks now and still counting...

It sounds like I should write a description of the algorithm for converting 
the full to simple syntax.  It's not that hard. Can you say which bits of 
the conversion are giving you trouble?

I guess the hardest bit is handling grammar/define/ref. Let me give some 
hints here:

1. The result of the initial parse of the full syntax is a Pattern object 
representing the schema that differs from the simple syntax pattern only in 
that it contains an additional kind of Pattern, namely a RefPattern.  A 
RefPattern is a Pattern containing a reference to another Pattern.

During parsing maintain a current Grammar object.  A Grammar object contains

- a reference to the start pattern
- a map from NCNames to RefPatterns
- a reference to a parent grammar

When you see a <grammar> start-tag, create a new Grammar object that 
becomes the new current Grammar.  Initially the map is empty, the start 
pattern is null, and the parent grammar is the old current grammar.

When you see a <define> element, look up the name in the current grammar's 
map. If it doesn't exist, add a new RefPattern referring to the body of the 
definition.  If it does exist, then check with the RefPattern's pattern 
reference is null; it is is null, then change it to refer to the body of 
the definition; if it's not null, then we have a duplicate definition.

Handle <start> similarly to <define>.

When you see a <ref> element, lookup the name in the current grammar's map. 
If it doesn't exist, add a new RefPattern with a null pattern reference to 
the map. In either case, return that RefPattern as the result of parsing 
the <ref> element.

When you see a <parentRef> do the same as <ref> except use the parent 
grammar's map.

At the </grammar> end-tag, check that the start of the current grammar is 
non-null. Also walk the map to check that all RefPatterns have non-null 
pattern references. Return the start pattern as the result of parsing the 
<grammar> element.

The above doesn't handle include/combine.  There's a message in the 
archives from Kohsuke giving an algorithm for this.

2.  The next task is to check that there is no illegal recursion. For this 
it is convenient to add a checkRecursionDepth field to the PatternRef 
object. This should be initialized to -1.  Define a checkRecursion(int 
depth) method on Patterns. By default this simply calls checkRecursion with 
the same argument on all subpatterns.  For an ElementPattern it calls 
checkRecusion(depth + 1). For a PatternRef pattern it does this:

  checkRecursion(int depth) {
    if (checkRecursionDepth == -1) {
      checkRecursionDepth = depth;
      referencedPattern.checkRecursion(depth);
      checkRecursionDepth = -2;
    }
    else if (depth == checkRecursionDepth)
      error("illegal recursion");
  }

To check for illegal recursions, call checkRecursion(0) on the pattern for 
the schema as a whole.

Does this help?

3. After determining that there is no illegal recursion, the next stage is 
to eliminate RefPatterns.  Do this with an expand() method on Patterns. It 
is convenient to add a boolean expanded field to ElementPatterns.  This is 
initialized to false. For an ElementPattern the code for expand is

Pattern expand() {
  if (!expanded) {
    expanded = true;
    content = expand(content);
  }
  return this;
}

For a RefPattern it would be simply

Pattern expand() {
  return referencedPattern.expand();
}

For a Text/Data etc it would be simply

Pattern expand() {
  return this;
}

For a Choice it would be

Pattern expand() {
  return makeChoice(operand1.expand(), operand2.expand());
}

> Is there any software that converts a schema in the full syntax to the
> simple syntax?

There's some code in Jing (PatternDumper) that almost does this, but it 
doesn't handle data and value properly.  Fixing this is on my TODO list.

James




[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [Elist Home]


Powered by eList eXpress LLC