OASIS Mailing List Archives

relax-ng-comment message

Subject: [relax-ng-comment] InstanceToSchema 1.0


Hi,

I am pleased to announce InstanceToSchema 1.0 [1].

InstanceToSchema is a RELAX NG schema generator from XML instances.

It is a command-line tool written in Java. It requires J2SE 1.3 or 1.4 and a
JAXP-compliant SAX parser to run.

InstanceToSchema is developed within the xmloperator project [2] and shares
its BSD-style license, but it is packaged separately and can be used
independently of the XML editor.

The software is based on pattern categories. A pattern category represents
a set of RELAX NG patterns. The tool's job is to build, for each element
name, a pattern category that is compatible with all the input XML
instances and is as precise as possible.

The following pattern category types are implemented:

 * An EmptyPatternCategory represents content with no child elements. There
may be only attributes and/or text.

 * An (OptionalRepeatable)ElementPatternCategory represents content with
one element, or several elements that all have the same name. There may
also be attributes and/or text.

 * A GroupPatternCategory represents ordered content or a choice between
ordered contents. There may also be attributes and/or text.

 * An InterleavePatternCategory represents unordered content. Some element
names may appear several times, others may not. There may also be
attributes and/or text.
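
To make the four categories concrete, here is a minimal sketch in modern
Java (not the J2SE 1.3/1.4 dialect the tool targets). It is not the tool's
actual algorithm: the class CategorySketch, the Category enum and the
classify method are hypothetical names, and the rule used to tell a group
from an interleave (no two names ever observed in both relative orders) is
a simplifying assumption. Attributes and text are ignored.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CategorySketch {
    // Hypothetical labels mirroring the four category types in the list above.
    enum Category { EMPTY, ELEMENT, GROUP, INTERLEAVE }

    // Each inner list holds the child element names seen, in document order,
    // for one occurrence of the element being classified.
    static Category classify(List<List<String>> observations) {
        Set<String> names = new HashSet<>();
        // "a\u0000b" records that a was seen before b in some occurrence.
        // NUL cannot occur in an XML name, so it is a safe separator.
        Set<String> seenBefore = new HashSet<>();
        for (List<String> children : observations) {
            names.addAll(children);
            for (int i = 0; i < children.size(); i++) {
                for (int j = i + 1; j < children.size(); j++) {
                    String a = children.get(i);
                    String b = children.get(j);
                    if (!a.equals(b)) {
                        seenBefore.add(a + '\u0000' + b);
                    }
                }
            }
        }
        if (names.isEmpty()) {
            return Category.EMPTY;      // no child element was ever seen
        }
        if (names.size() == 1) {
            return Category.ELEMENT;    // one name, possibly optional or repeated
        }
        // Assumption: if any two names occur in both orders, call it unordered.
        for (String pair : seenBefore) {
            int sep = pair.indexOf('\u0000');
            String reversed = pair.substring(sep + 1) + '\u0000' + pair.substring(0, sep);
            if (seenBefore.contains(reversed)) {
                return Category.INTERLEAVE;
            }
        }
        return Category.GROUP;          // every pair keeps a consistent order
    }
}
```

With this rule, an element seen once with children (title, para, para) and
once with only (title) would be labelled as a group, while one seen with
(a, b) and later with (b, a) would be labelled as an interleave.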

All these pattern categories treat elements and attributes as
independent. However, the tool framework does not require that: new pattern
categories could correlate elements and attributes. Another thing the tool
does not do is infer datatypes.

The tool is suitable for processing large documents. It makes a single SAX
parsing pass. The memory required depends on the number of distinct element
names and the complexity of the patterns, not on the document size.
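
As an illustration of why a single SAX pass keeps memory bounded, here is a
minimal sketch in modern Java using the standard JAXP and SAX APIs. It is
not the tool's code: SinglePassSummary and its fields are hypothetical
names, and it only records which child and attribute names were seen under
each element name instead of building real pattern categories.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SinglePassSummary extends DefaultHandler {
    // One entry per distinct element name, however often the element occurs.
    final Map<String, Set<String>> childNames = new TreeMap<>();
    final Map<String, Set<String>> attributeNames = new TreeMap<>();
    // Names of the currently open elements; grows with nesting depth only.
    private final Deque<String> openElements = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        childNames.computeIfAbsent(qName, k -> new TreeSet<>());
        Set<String> attrs = attributeNames.computeIfAbsent(qName, k -> new TreeSet<>());
        for (int i = 0; i < atts.getLength(); i++) {
            attrs.add(atts.getQName(i));
        }
        String parent = openElements.peek();
        if (parent != null) {
            // The parent was started earlier, so its entry already exists.
            childNames.get(parent).add(qName);
        }
        openElements.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.pop();
    }

    // Parses the given XML text once and returns the accumulated summary.
    static SinglePassSummary summarize(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            SinglePassSummary handler = new SinglePassSummary();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }
}
```

Nothing from the document itself is retained after each event, so the maps
stay the same size whether an element name occurs once or a million times.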

The set of (element name, pattern category) pairs is translated to a RELAX
NG simple syntax data model (the same one used by the XML editor), which is
then converted to a more readable full syntax and written out with
indentation.
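
The last step, writing a pattern out with indentation, could look roughly
like the following sketch in modern Java. It is only an illustration: the
Node class and the render method are hypothetical and far simpler than the
data model the tool and the XML editor share. A node carries a RELAX NG
element name such as element, zeroOrMore or ref, an optional name
attribute, and child nodes.

```java
import java.util.Arrays;
import java.util.List;

class RngWriterSketch {
    // Hypothetical minimal pattern node, not the tool's real data model.
    static final class Node {
        final String tag;          // RELAX NG element name, e.g. "element"
        final String name;         // value of the name attribute, or null
        final List<Node> children;

        Node(String tag, String name, Node... children) {
            this.tag = tag;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Returns the node serialized as XML, two spaces per nesting level.
    static String render(Node node) {
        StringBuilder out = new StringBuilder();
        write(node, 0, out);
        return out.toString();
    }

    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    private static void write(Node node, int depth, StringBuilder out) {
        indent(depth, out);
        out.append('<').append(node.tag);
        if (node.name != null) {
            // XML names contain no characters that need escaping here.
            out.append(" name=\"").append(node.name).append('"');
        }
        if (node.children.isEmpty()) {
            out.append("/>\n");
            return;
        }
        out.append(">\n");
        for (Node child : node.children) {
            write(child, depth + 1, out);
        }
        indent(depth, out);
        out.append("</").append(node.tag).append(">\n");
    }
}
```

For example, a node tree for an element named list that contains
zeroOrMore around a ref to item would be rendered as three nested,
indented tags with the ref written as an empty element.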

A typical use case is to obtain a description of the structure of one
or several (combined) XML files. In my view, such a schema is
not suitable for validating a document or for guiding its editing.

I hope that this tool will be useful, or will prompt some developer to do
better. I would welcome any comments.

Regards,

Didier Demany
didier.demany@xmloperator.net
The xmloperator project

[1] http://www.xmloperator.net/i2s/

[2] http://www.xmloperator.net/


