xslt-conformance message

Subject: comparison current update - TEXT

From: Kirill Gavrylyuk <kirillg@microsoft.com>
To: xslt-conformance@lists.oasis-open.org
Date: Tue, 21 Aug 2001 09:29:42 -0700

An update on the comparison technique for the TEXT output method.

We handle the TEXT method much as we do the XML output method. We
produce an XML document (an "infoset" of the text) that contains only the
information about the original text that is relevant for comparison. For
text, this means we drop the differences between platform-dependent
representations of line breaks.
Example:

original text document:

This is the text 
we want to 
compare

This is the corresponding "text infoset":

<text encoding="US-ASCII">
<line>This is the text </line>
<line>we want to </line>
<line>compare</line>
</text>
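
The conversion above could be scripted roughly as follows. This is only a
sketch in Python, not an agreed implementation: the function names
text_to_infoset and serialize are made up for illustration, and it assumes
that a line break at the very end of the file terminates the last line
rather than starting an empty one. No whitespace is written between the
<line> elements, so that nothing but the original text ends up in the
character data.

```python
import xml.etree.ElementTree as ET


def text_to_infoset(raw: bytes, encoding: str = "US-ASCII") -> ET.Element:
    """Build a <text> element with one <line> child per line of the input.

    Hypothetical helper: the name and signature are illustrative only.
    """
    text = raw.decode(encoding)
    # Fold platform-specific line breaks (CRLF, lone CR) into LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = text.split("\n")
    # Assumption: a trailing line break ends the last line; it does not
    # introduce an extra empty <line/>.
    if lines and lines[-1] == "":
        lines.pop()
    # Keep the original encoding on the root element for reference.
    root = ET.Element("text", {"encoding": encoding})
    for content in lines:
        ET.SubElement(root, "line").text = content
    return root


def serialize(root: ET.Element) -> bytes:
    """Serialize the infoset as UTF-8, whatever the source encoding was."""
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    unix = b"This is the text \nwe want to \ncompare\n"
    windows = b"This is the text \r\nwe want to \r\ncompare\r\n"
    # Both inputs yield the same infoset, so this prints True.
    print(serialize(text_to_infoset(unix)) == serialize(text_to_infoset(windows)))
```

Input containing characters that are not legal in XML (most control
characters, for instance) would need extra handling that this sketch
leaves out.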

The text infoset is always serialized in UTF-8, with the original encoding
recorded on the root element.
Note: For the first version, I propose that we ignore the encoding info when
doing the actual comparison. We can still include it for reference.
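
One possible way to ignore the encoding info during comparison is sketched
below in Python. It uses xml.etree.ElementTree.canonicalize (C14N 2.0,
available from Python 3.8) purely as a stand-in for whatever "canonical"
serialization we end up using; the function name infosets_equal is
hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def infosets_equal(expected_xml: str, actual_xml: str) -> bool:
    """Compare two text infosets, ignoring the 'encoding' attribute.

    Hypothetical helper; C14N 2.0 is an assumed choice of canonical form.
    """
    def canon(xml_text: str) -> str:
        # exclude_attrs drops the named attributes before serialization,
        # so the original encoding does not influence the result.
        return canonicalize(xml_text, exclude_attrs={"encoding"})

    return canon(expected_xml) == canon(actual_xml)


if __name__ == "__main__":
    a = ('<text encoding="US-ASCII"><line>This is the text </line>'
         '<line>we want to </line><line>compare</line></text>')
    b = a.replace("US-ASCII", "UTF-16")
    print(infosets_equal(a, b))
```

The encoding attribute stays in the stored infoset, so it remains available
for reference even though the comparison does not look at it.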
 
Note: A drawback of this approach is that there is no standard way to
create such an infoset, though it is simple and achievable in any scripting
environment. The reason to produce XML as the "text infoset" is so that we
can reuse the same consistent "canonical" serialization method we use in
the XML comparison.