Subject: [xtm-wg] XTM Whitespace Handling
Quoting (just as an example) the documentation from org.apache.xml.serialize:

> [...]
> For elements that are not specified as whitespace preserving,
> the serializer will potentially break long text lines at space
> boundaries, indent lines, and serialize elements on separate
> lines. Line terminators will be regarded as spaces, and
> spaces at beginning of line will be stripped.

This particular Java class might typically be used to serialize a parsed XTM document back into XML syntax. It's only one example; almost every XML application I've seen has some functionality similar to this.

The reason I bring this up is that I'm concerned that we've underspecified whitespace handling. It's an area that seems to bite everybody. We have PCDATA in two places -- <baseNameString> and <resourceData> -- and we've declared it all significant in the prose of the spec (I believe), yet not done anything in the DTD to recommend to XML applications that it be preserved. The XML feature that does this is the "xml:space" attribute, defaulted on an element to "preserve". I've had to modify local copies of the XTM DTD in order to keep the Apache serializer from altering the whitespace in my documents.

The big question really is: is whitespace significant in XTM documents? Do the base names

    "Niagara Falls"
    " Niagara Falls"
    "Niagara Falls"
    "Niagara Falls "
    "Niagara<tab>Falls"

all match? Even if we don't actually alter anything in the XTM Specification, we ought to at least give application developers a clue as to how we think this should be handled. I seem to remember us discussing this at one point, but can't remember the outcome. And I'm certain a public airing of this issue would benefit other XTM developers.

My recommendation (which I'm certainly open to discussing) would be to add

    xml:space (default|preserve) 'preserve'

to those elements for which we explicitly state that whitespace *is* significant.

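
To make the base-name question concrete, here is a minimal sketch in Python (not taken from the XTM Specification or from any XTM processor) that compares two possible matching policies: exact string comparison, which treats all whitespace as significant, and comparison after collapsing whitespace. The function names `normalize` and `distinct_names` are hypothetical, and the double-space variant in the list is an assumption, since the spacing of the examples as archived cannot be fully recovered.

```python
import re

def normalize(name: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", name).strip()

# Illustrative variants of one base name. The double-space entry is an
# assumption made for this sketch.
variants = [
    "Niagara Falls",
    " Niagara Falls",    # leading space
    "Niagara  Falls",    # two interior spaces (assumed)
    "Niagara Falls ",    # trailing space
    "Niagara\tFalls",    # tab instead of space
]

def distinct_names(names, preserve):
    """Return the set of names a processor would treat as different.

    preserve=True  -> whitespace is significant (exact comparison)
    preserve=False -> whitespace is normalized before comparison
    """
    return {n if preserve else normalize(n) for n in names}

print(len(distinct_names(variants, preserve=True)))   # 5 distinct names
print(len(distinct_names(variants, preserve=False)))  # 1 distinct name
```

Under the "preserve" policy the five strings are five different base names; under normalization they collapse to one. Which of the two outcomes is intended is exactly what the question above asks the specification to settle.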
XML parsers will pass all whitespace, but after that, XML applications can do what they want. Since we might expect that XTM documents go through various processing stages, shouldn't we do something about this?

And if I'm totally wrong about this, could somebody give *me* the clue?

Murray

...........................................................................
Murray Altheim <mailto:altheim@eng.sun.com>
XML Technology Center
Sun Microsystems, Inc., MS MPK17-102, 1601 Willow Rd., Menlo Park, CA 94025

    In the evening
    The rice leaves in the garden
    Rustle in the autumn wind
    That blows through my reed hut.
        -- Minamoto no Tsunenobu