Subject: [relax-ng-comment] InstanceToSchema 1.0
Hi,

I am pleased to announce InstanceToSchema 1.0 [1].

InstanceToSchema generates a RELAX NG schema from XML instances. It is a command-line tool written in Java. It requires J2SE 1.3 or 1.4 and a JAXP-compliant SAX parser to run.

InstanceToSchema is developed within the xmloperator project [2] and shares its BSD-style license, but it is packaged separately and can be used independently of the XML editor.

The software is based on pattern categories. A pattern category represents a set of RELAX NG patterns. For each element name, the tool builds a pattern category that is compatible with all the input XML instances and is as precise as possible. The following pattern category types are implemented:

* An EmptyPatternCategory represents content with no child element. There may be only attributes and/or text.
* An (OptionalRepeatable)ElementPatternCategory represents content with one element, or with several elements that all have the same name. There may also be attributes and/or text.
* A GroupPatternCategory represents ordered content, or a choice between ordered contents. There may also be attributes and/or text.
* An InterleavePatternCategory represents unordered content. Some element names may appear several times, others may not. There may also be attributes and/or text.

All these pattern categories treat elements and attributes as independent. The tool framework does not require this, however: new pattern categories could correlate elements and attributes. The tool also does not infer datatypes.

The tool is suitable for processing large documents. It uses a single SAX parsing pass. The memory required depends on the number of element names and the complexity of the patterns, not on the document size.

The set of pairs (element name, pattern category) is translated to a RELAX NG simple syntax data model (the same one used by the XML editor), which is then converted to a more readable full syntax and written out with indentation.

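As a rough illustration of the one-pass, per-element-name idea described above, here is a minimal sketch in present-day Java. It is not InstanceToSchema's code and does not reflect its API: the class ContentSketch, the Kind enum, and the summarize method are hypothetical names, and the four kinds are only a coarse stand-in for the real pattern categories. It ignores attributes and text, and it assumes that content is "ordered" as long as no two distinct child names are ever seen in both orders.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: one summary per element name, filled in a single SAX pass.
class ContentSketch extends DefaultHandler {
    enum Kind { EMPTY, ELEMENT, GROUP, INTERLEAVE }

    /** Observations for one element name, merged over all of its occurrences. */
    static final class Summary {
        final Set<String> childNames = new HashSet<>();
        final Set<String> orderedPairs = new HashSet<>(); // "a b": a was seen before b
        boolean conflictingOrder = false;

        Kind kind() {
            if (childNames.isEmpty()) return Kind.EMPTY;       // no child element
            if (childNames.size() == 1) return Kind.ELEMENT;   // a single child name
            return conflictingOrder ? Kind.INTERLEAVE : Kind.GROUP;
        }
    }

    /** One currently open element: its name and the child names seen so far. */
    private static final class Frame {
        final String name;
        final Set<String> seen = new LinkedHashSet<>();
        Frame(String name) { this.name = name; }
    }

    private final Map<String, Summary> summaries = new TreeMap<>();
    private final Deque<Frame> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        summaries.computeIfAbsent(qName, k -> new Summary());
        Frame parentFrame = open.peek();
        if (parentFrame != null) {
            Summary parent = summaries.get(parentFrame.name);
            for (String earlier : parentFrame.seen) {
                if (earlier.equals(qName)) continue;
                // The reverse order was already recorded, so the content is not ordered.
                if (parent.orderedPairs.contains(qName + " " + earlier)) {
                    parent.conflictingOrder = true;
                }
                parent.orderedPairs.add(earlier + " " + qName);
            }
            parentFrame.seen.add(qName);
            parent.childNames.add(qName);
        }
        open.push(new Frame(qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    /** Parses the XML text once and returns a coarse kind for each element name. */
    static Map<String, Kind> summarize(String xml) {
        ContentSketch handler = new ContentSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
        Map<String, Kind> result = new TreeMap<>();
        handler.summaries.forEach((name, s) -> result.put(name, s.kind()));
        return result;
    }
}
```

In this sketch the state kept between SAX events is one Summary per element name plus a stack as deep as the current nesting, so memory grows with the number of names rather than with the document size. For example, summarize("<doc><pair><k/><v/></pair><pair><v/><k/></pair></doc>") classifies pair as INTERLEAVE, because k and v occur in both orders.
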
A typical use case is obtaining a description of the structure of one or several (combined) XML files. In my view, such a schema is not suitable for validating documents or for guiding their editing.

I hope that this tool will be useful, or will prompt some developer to do better. I would welcome any comments.

Regards,

Didier Demany
didier.demany@xmloperator.net
The xmloperator project

[1] http://www.xmloperator.net/i2s/
[2] http://www.xmloperator.net/