
xslt-conformance message



Subject: comparison update - XML


An update on the comparison methodology for XSLT output produced with
<xsl:output method="xml"/>.
For XML:
Our comparison methodology for any output method consists of 2 steps:
a) transforming the output into a simple "infosetized" document that
contains only the information about the original output that is relevant
for comparison;
b) serializing the "infosetized" document in a consistent "canonical" way.
The serialization method must satisfy the following requirement:
    the same harness must serialize equal infosetized documents into equal
sequences of bytes.
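
The two steps above can be sketched roughly as follows. This is only a
minimal illustration in Python, not the attached infosetgen.xsl: the
function names (infosetize, canonical_bytes, same_output) are made up for
this sketch, comments and namespace declarations are left out for brevity,
and sorting attributes by (namespace URI, local name) is an assumption
added here so that attribute order in the source does not affect the bytes.

```python
import xml.etree.ElementTree as ET

def _split(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

def _add_text(parent, value):
    # Emit a <text> child only when there is character data to record.
    if value:
        ET.SubElement(parent, "text").text = value

def _convert(src, parent):
    uri, local = _split(src.tag)
    el = ET.SubElement(parent, "element", {"name": local, "namespace-uri": uri})
    # Assumption of this sketch: attributes are emitted in sorted order.
    for key in sorted(src.attrib, key=_split):
        a_uri, a_local = _split(key)
        attr = ET.SubElement(el, "attribute",
                             {"name": a_local, "namespace-uri": a_uri})
        attr.text = src.attrib[key]
    _add_text(el, src.text)
    for child in src:
        _convert(child, el)
        _add_text(el, child.tail)

def infosetize(xml_text):
    """Step (a): keep only the information relevant for comparison."""
    root = ET.Element("xml")
    _convert(ET.fromstring(xml_text), root)
    return root

def canonical_bytes(infoset):
    """Step (b): one fixed serialization, always UTF-8."""
    return ET.tostring(infoset, encoding="utf-8")

def same_output(actual, expected):
    return canonical_bytes(infosetize(actual)) == canonical_bytes(infosetize(expected))
```

With this sketch, two outputs that differ only in namespace prefixes,
attribute order, or use of CDATA sections compare as equal, while a
different namespace URI makes them differ.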

For XML output we use a subset of the XML Infoset [1] to represent the
XML document structure in a simpler format, omitting details that are not
important for XSLT output comparison. We build a simpler XML document
that describes the infoset and uses only element and attribute
constructs.

Example. For the following XML:

<!-- Simple XML Document -->
<msg:message doc:date="19990421"
xmlns:doc="http://doc.example.org/namespaces/doc"
       xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>

we produce:
<xml encoding="US-ASCII">
<comment>Simple XML Document</comment>
<element name="message" namespace-uri="http://message.example.org/">
	<namespace>http://www.w3.org/XML/1998/namespace</namespace>
	<namespace>http://doc.example.org/namespaces/doc</namespace>
	<namespace>http://message.example.org/</namespace>
	<attribute name="date"
namespace-uri="http://doc.example.org/namespaces/doc">19990421</attribute>
	<text>Phone home!</text>
</element>
</xml>
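
For illustration, a document of the shape shown in the example can be
built with a short Python sketch like the one below. It is not the
attached infosetgen.xsl. The function name infoset_document is
hypothetical, the original encoding is simply passed in by the caller,
trimming whitespace around comment text is an assumption made to match
the example, and listing the xml namespace first followed by the declared
namespaces in the order the parser reports them is likewise an assumption.

```python
from xml.dom import minidom, Node

XMLNS_NS = "http://www.w3.org/2000/xmlns/"      # namespace of xmlns:* attributes
XML_NS = "http://www.w3.org/XML/1998/namespace"  # always in scope

def _add(parent, out, name, text=None):
    node = out.createElement(name)
    if text is not None:
        node.appendChild(out.createTextNode(text))
    parent.appendChild(node)
    return node

def _copy(node, parent, out, inherited):
    if node.nodeType == Node.COMMENT_NODE:
        # Assumption: surrounding whitespace is trimmed, as in the example.
        _add(parent, out, "comment", node.data.strip())
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        # CDATA and plain text are treated alike; adjacent runs are merged.
        last = parent.lastChild
        if last is not None and last.tagName == "text":
            last.firstChild.data += node.data
        else:
            _add(parent, out, "text", node.data)
    elif node.nodeType == Node.ELEMENT_NODE:
        attrs = [node.attributes.item(i) for i in range(node.attributes.length)]
        scope = list(inherited)
        for a in attrs:
            if a.namespaceURI == XMLNS_NS and a.value not in scope:
                scope.append(a.value)
        el = _add(parent, out, "element")
        el.setAttribute("name", node.localName)
        el.setAttribute("namespace-uri", node.namespaceURI or "")
        for uri in scope:
            _add(el, out, "namespace", uri)
        for a in attrs:
            if a.namespaceURI != XMLNS_NS:  # skip the declarations themselves
                at = _add(el, out, "attribute", a.value)
                at.setAttribute("name", a.localName)
                at.setAttribute("namespace-uri", a.namespaceURI or "")
        for child in node.childNodes:
            _copy(child, el, out, scope)

def infoset_document(xml_text, encoding):
    src = minidom.parseString(xml_text)
    out = minidom.Document()
    root = out.createElement("xml")
    root.setAttribute("encoding", encoding)  # original encoding kept on the root
    out.appendChild(root)
    for child in src.childNodes:
        _copy(child, root, out, [XML_NS])
    return out
```

Calling infoset_document(source, "US-ASCII").documentElement.toprettyxml()
on the sample message prints a tree with the same elements as the one
above; namespace prefixes such as msg: and doc: do not appear in it.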

The infoset is always serialized in UTF-8; the original encoding
information is kept on the root node.

Note: We omit some infoset items that are not relevant for XSLT output
comparison, such as namespace prefixes, entities, etc. We merge all
adjacent text infoset items into one. We don't distinguish between
CDATA sections and text nodes.

Note: the current format of the infoset representation does not conform
to the schema for serialized infosets [2]. I am fixing this right now.

Note: XSLT output can be an XML entity that is not a well-formed XML
document. The current harness (part of the prototype draft I sent
earlier [3]) accepts only well-formed XML documents. I am fixing this.
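
One possible way for a harness to accept such output, sketched here in
Python purely as an assumption and not as the fix planned for the
prototype, is to wrap the output in a synthetic root element before
parsing. The wrapper name "entity-wrapper" and the function parse_entity
are hypothetical, and a leading text declaration is stripped because it
may not appear inside an element.

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML/text declaration at the very start of the output.
_DECL = re.compile(r"^<\?xml\s[^?]*\?>")

def parse_entity(fragment):
    """Parse output that may have several top-level elements or top-level text."""
    body = _DECL.sub("", fragment, count=1)
    # Hypothetical wrapper element; it is not part of the original output.
    return ET.fromstring("<entity-wrapper>" + body + "</entity-wrapper>")
```

For example, parse_entity("<a/>text<b/>") succeeds, whereas passing the
same string directly to ET.fromstring raises a ParseError.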

Attached is the current version of infosetgen.xsl.

[1] http://www.w3.org/TR/xml-infoset/
[2] http://www.w3.org/2001/05/serialized-infoset-schema.html
[3] http://lists.oasis-open.org/archives/xslt-conformance/200107/msg00044.html

 <<infosetgen.xsl>> 





