Subject: Re: OpenDocument Draft 3: Relax-NG Schema Validation Errors
Darien,

we have used jing (with option -i) as the validator, and did not
experience any errors. We also noticed the issue MSV has with the "$"
characters in regular expressions.

Best regards

Michael

Darien Kindlund wrote:
> Hello,
>
> I'm in the process of looking at how to properly validate an
> OpenDocument XML file (written by OpenOffice), using the
> office-schema-1.0-cd-3.rng syntax as a basis. It seems I'm running into
> some schema validation errors (depending on the Relax-NG validator used)
> and would appreciate any comments/suggestions. I'm not enrolled in the
> corresponding mailing list; if you could CC me directly with any
> responses, I would appreciate it.
>
> Here are the process steps I followed:
>
> - Downloaded the OpenDocument schema from URL:
>   http://www.oasis-open.org/committees/download.php/11680/office-schema-1.0-cd-3.rng
>
> - Downloaded the Sun Multi-Schema XML Validator (that supports Relax-NG
>   schema validation)
>   http://www.sun.com/software/xml/developers/multischema/
>   as it was referenced as a valid Relax-NG validator on this page:
>   http://relaxng.org/#validators
>
> - Created a simple OpenDocument using OpenOffice.org 2.0 Beta; extracted
>   and aggregated the XML contents into a single XML file (Testing.xml).
>
> - Ran the following command:
>   java -jar msv.jar -warning -strict office-schema-1.0-cd-3.rng Testing.xml
>
> Output is as follows:
>
>   start parsing a grammar.
>
>   cannot set parameter pattern to this datatype: specified pattern is
>   invalid: Unexpected meta character.
>     3838:31@file:///C:/Temp/office-schema-1.0-cd-3.rng
>
>   invalid parameter setting: specified pattern is invalid: Unexpected meta
>   character.
>     3837:25@file:///C:/Temp/office-schema-1.0-cd-3.rng
>
>   cannot set parameter pattern to this datatype: specified pattern is
>   invalid: Unexpected meta character.
>     3843:31@file:///C:/Temp/office-schema-1.0-cd-3.rng
>
>   invalid parameter setting: specified pattern is invalid: Unexpected meta
>   character.
>     3842:25@file:///C:/Temp/office-schema-1.0-cd-3.rng
>
>   failed to load a grammar.
>
> The X:Y numbers correspond to the row:column within the
> office-schema-1.0-cd-3.rng file. In this case, it is referring to the
> "cellAddress" and "cellRangeAddress" element types, as explained on
> pages 189 and 190 in Section 8.3.1 of the corresponding documentation.
>
> Specifically, this isn't a bug with the OpenDocument schema nor with
> MSV; it's a bug with Apache Xerces (v2.6.2) (since MSV leverages
> Xerces). Xerces decides to treat the "$" and "^"
> characters in regular expressions as "metacharacters" (equivalent to
> Perl), whereas the W3C XML Schema datatype specification says they are
> not. Details of this Xerces bug are listed here:
> http://issues.apache.org/jira/browse/XERCESJ-1061
>
> As a workaround, I modified the two regexes within the OpenDocument
> schema, by escaping the "$" that was the culprit ("$" -> "\$"); however,
> the Relax-NG schema validator is still yielding nonsensical error messages.
>
> In fact, after trying 2 other independently-developed RNG validators,
> I'm getting inconsistent validator errors for each (Jing and oXygen). At
> this point, I'm skeptical that the OpenDocument schema has flaws,
> based upon this evidence. The only common denominator I can see is that
> the "schema error" may revolve around the validators' inability to handle
> the following two recursive definitions:
>
> <define name="mathMarkup">
>   <zeroOrMore>
>     <choice>
>       <attribute>
>         <anyName/>
>       </attribute>
>       <text/>
>       <element>
>         <anyName/>
>         <ref name="mathMarkup"/>
>       </element>
>     </choice>
>   </zeroOrMore>
> </define>
>
> -AND-
>
> <define name="anyAttListOrElements">
>   <zeroOrMore>
>     <attribute>
>       <anyName/>
>       <text/>
>     </attribute>
>   </zeroOrMore>
>   <ref name="anyElements"/>
> </define>
> <define name="anyElements">
>   <zeroOrMore>
>     <element>
>       <anyName/>
>       <mixed>
>         <ref name="anyAttListOrElements"/>
>       </mixed>
>     </element>
>   </zeroOrMore>
> </define>
>
> I would appreciate any feedback/suggestions regarding this issue. If you
> happen to know of a better RNG validator for OpenDocument files, I could
> try and replicate the issue with that validator as well. I'm not an
> expert in RNG syntax in general, so if the problem is reproducible and
> there's an evident error within the schema, I would appreciate any
> explanations/clarifications.
>
> Regards,
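
For reference, the "$" workaround described in the quoted message (rewriting each "$" as "\$" in the affected patterns) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name escape_dollars and the sample pattern are hypothetical and are not copied from office-schema-1.0-cd-3.rng, and the rewrite targets the Xerces behaviour reported above rather than changing what the schema means.

```python
def escape_dollars(pattern: str) -> str:
    """Return `pattern` with every unescaped "$" rewritten as "\\$".

    Hypothetical helper illustrating the workaround from the message;
    existing backslash escapes are copied through untouched so that an
    already-escaped "\\$" is not escaped a second time.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Existing escape sequence: keep the backslash and the
            # character that follows it exactly as they are.
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "$":
            # Bare "$": escape it so the validator reads it as a literal.
            out.append("\\$")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Illustrative pattern only (an assumption, not the schema's text).
    sample = r"\.$?[A-Z]+$?[0-9]+"
    print(escape_dollars(sample))
```

Applying the function a second time leaves the result unchanged, so it should be safe to run over a pattern that has already been patched by hand.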