OASIS Mailing List Archives

office-comment message


Subject: Re: OpenDocument Draft 3: Relax-NG Schema Validation Errors


Darien,

We used Jing (with option -i) as the validator and did not encounter 
any errors. We also noticed the issue MSV has with the "$" characters in 
regular expressions.
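
For anyone reproducing this, here is a minimal Python sketch of how such a Jing command line could be assembled. The helper name jing_command and the jar location "jing.jar" are assumptions for illustration; the schema and document names are the ones used in the quoted message. The -i option tells Jing to skip ID/IDREF/IDREFS checking.

```python
import shlex


def jing_command(schema, document, jar="jing.jar", check_ids=False):
    """Build an argument list for running Jing on one document.

    'jar' is an assumed path to the Jing jar file; adjust it locally.
    """
    cmd = ["java", "-jar", jar]
    if not check_ids:
        # -i: disable checking of ID/IDREF/IDREFS
        cmd.append("-i")
    cmd += [schema, document]
    return cmd


if __name__ == "__main__":
    args = jing_command("office-schema-1.0-cd-3.rng", "Testing.xml")
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(args))
```

The list form can be passed directly to subprocess.run if the validator should be launched from a script.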

Best regards

Michael

Darien Kindlund wrote:

> Hello,
> 
> I'm in the process of looking at how to properly validate an 
> OpenDocument XML file (written by OpenOffice), using the 
> office-schema-1.0-cd-3.rng syntax as a basis. It seems I'm running into 
> some schema validation errors (depending on the relax-ng validator used) 
> and would appreciate any comments/suggestions. I'm not enrolled in the 
> corresponding mailing list; if you could CC me directly with any 
> responses, I would appreciate it.
> 
> 
> Here are the process steps I followed:
> 
> - Downloaded OpenDocument schema from URL:
> http://www.oasis-open.org/committees/download.php/11680/office-schema-1.0-cd-3.rng 
> 
> 
> - Downloaded the Sun Multi-Schema XML Validator (that supports Relax-NG 
> schema validation)
> http://www.sun.com/software/xml/developers/multischema/
> as it was referenced as a valid Relax-NG validator on this page:
> http://relaxng.org/#validators
> 
> - Created a simple OpenDocument using OpenOffice.org 2.0 Beta; extracted 
> and aggregated the XML contents into a single XML file (Testing.xml).
> 
> - Ran the following command:
> java -jar msv.jar -warning -strict office-schema-1.0-cd-3.rng Testing.xml
> 
> Output is as follows:
> start parsing a grammar.
> 
> cannot set parameter pattern to this datatype: specified pattern is 
> invalid: Unexpected meta character.
>   3838:31@file:///C:/Temp/office-schema-1.0-cd-3.rng
> 
> invalid parameter setting: specified pattern is invalid: Unexpected meta 
> character.
>   3837:25@file:///C:/Temp/office-schema-1.0-cd-3.rng
> 
> cannot set parameter pattern to this datatype: specified pattern is 
> invalid: Unexpected meta character.
>   3843:31@file:///C:/Temp/office-schema-1.0-cd-3.rng
> 
> invalid parameter setting: specified pattern is invalid: Unexpected meta 
> character.
>   3842:25@file:///C:/Temp/office-schema-1.0-cd-3.rng
> 
> failed to load a grammar.
> 
> The X:Y numbers correspond to the row:column within the 
> office-schema-1.0-cd-3.rng file.  In this case, they refer to the 
> "cellAddress" and "cellRangeAddress" element types, as explained on 
> pages 189 and 190 in Section 8.3.1 of the corresponding documentation.
> 
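
The row:column locations in the output above can be pulled out mechanically. This is a rough Python sketch that assumes each location sits on its own line in the form "ROW:COLUMN@URI", as in the MSV output quoted here; the names Location and parse_locations are hypothetical.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    line: int
    column: int
    uri: str


# Matches lines such as "  3838:31@file:///C:/Temp/office-schema-1.0-cd-3.rng"
_LOCATION = re.compile(r"^\s*(\d+):(\d+)@(\S+)\s*$")


def parse_locations(output):
    """Return every ROW:COLUMN@URI location found in validator output."""
    found = []
    for raw in output.splitlines():
        match = _LOCATION.match(raw)
        if match:
            found.append(
                Location(int(match.group(1)), int(match.group(2)), match.group(3))
            )
    return found


SAMPLE = """cannot set parameter pattern to this datatype: specified pattern is
invalid: Unexpected meta character.
  3838:31@file:///C:/Temp/office-schema-1.0-cd-3.rng
"""

if __name__ == "__main__":
    for loc in parse_locations(SAMPLE):
        print(loc.line, loc.column, loc.uri)
```
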
> To be clear, this isn't a bug in the OpenDocument schema or in MSV; 
> it's a bug in Apache Xerces (v2.6.2), which MSV uses. Xerces treats 
> the "$" and "^" characters in regular expressions as metacharacters 
> (as Perl does), whereas the W3C XML Schema datatype specification says 
> they are not.  Details of this Xerces bug are listed here:
> http://issues.apache.org/jira/browse/XERCESJ-1061
> 
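
To illustrate the difference: XML Schema patterns are implicitly anchored and read "$" and "^" outside a character class as ordinary characters, while Perl-style engines read them as anchors. The following is a minimal Python sketch, not taken from MSV or Xerces, of translating a pattern so that a Perl-style engine matches it the way the XML Schema rules intend. The function name xsd_to_perl_style is hypothetical, the sample pattern is a simplified stand-in rather than the one in the schema, and character class subtraction is not handled.

```python
import re


def xsd_to_perl_style(pattern):
    """Escape '$' and '^' where XML Schema treats them as literals.

    Inside a character class both syntaxes already agree, so only
    occurrences outside [...] are escaped. Existing backslash escapes
    are copied through unchanged.
    """
    out = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        elif ch in "$^" and not in_class:
            out.append("\\")
        out.append(ch)
        i += 1
    return "".join(out)


# Hypothetical pattern: a dot, then an optionally "$"-prefixed column
# and an optionally "$"-prefixed row.
SAMPLE = r"\.$?[A-Z]+$?[0-9]+"

if __name__ == "__main__":
    translated = xsd_to_perl_style(SAMPLE)
    print(translated)
    # fullmatch mirrors the implicit anchoring of XML Schema patterns.
    print(bool(re.fullmatch(translated, ".$A$1")))
```

Using re.fullmatch rather than adding explicit anchors keeps the translated pattern close to the original text.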
> As a workaround, I modified the two regexes within the OpenDocument 
> schema by escaping the "$" that was the culprit ("$" -> "\$"); however, 
> the Relax-NG schema validator is still yielding nonsensical error messages.
> 
> In fact, after trying two other independently developed RNG validators 
> (Jing and oXygen), I'm getting inconsistent validation errors from each. 
> At this point, I'm reluctant to conclude from this evidence that the 
> OpenDocument schema itself is flawed. The only common denominator I can 
> see is that the "schema error" may revolve around the validators' 
> inability to handle the following two recursive definitions:
> 
> <define name="mathMarkup">
>     <zeroOrMore>
>         <choice>
>             <attribute>
>                 <anyName/>
>             </attribute>
>             <text/>
>             <element>
>                 <anyName/>
>                 <ref name="mathMarkup"/>
>             </element>
>         </choice>
>     </zeroOrMore>
> </define>
> 
> -AND-
> 
> <define name="anyAttListOrElements">
>     <zeroOrMore>
>         <attribute>
>             <anyName/>
>             <text/>
>         </attribute>
>     </zeroOrMore>
>     <ref name="anyElements"/>
> </define>
> <define name="anyElements">
>     <zeroOrMore>
>         <element>
>             <anyName/>
>             <mixed>
>                 <ref name="anyAttListOrElements"/>
>             </mixed>
>         </element>
>     </zeroOrMore>
> </define>
> 
> I would appreciate any feedback/suggestions regarding this issue. If you 
> happen to know of a better RNG validator for OpenDocument files, I could 
> try and replicate the issue with that validator as well. I'm not an 
> expert in RNG syntax in general, so if the problem is reproducible and 
> there's an evident error within the schema, I would appreciate any 
> explanations/clarifications.
> 
> Regards,
