OASIS Mailing List Archives

 



entity-resolution message



Subject: Re: NID request for publicid (draft-urn-publicid-03.txt)


At 14:05 2001 05 04 -0400, John Cowan wrote:
>Editorial corrections to what Norman Walsh wrote:
>>    In addition to the character set restriction, public identifiers
>>    must be normalized by removing all leading and trailing whitespace
>>    (the characters #x20, #xD, and #xA, in this context) and replacing
>>    all remaining sequences of two or more whitespace characters with a
>>    single space (#x20).
>
>Oopsie, this isn't quite what XML 1.0 says: it would leave a single
>CR, LF, or TAB alone rather than changing it to a SPACE.

Really?  I don't think so.

SGML normalizes public ids by turning all whitespace into a space.

XML 1.0 says:

  [Definition: In addition to a system identifier, an external
  identifier may include a public identifier.] An XML
  processor attempting to retrieve the entity's content may use
  the public identifier to try to generate an alternative
  URI reference. If the processor is unable to do so, it must
  use the URI reference specified in the system literal.
  Before a match is attempted, all strings of white space in
  the public identifier must be normalized to single space
  characters (#x20), and leading and trailing white space must be removed.

What makes you think XML would leave a single CR, LF, or TAB alone?
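
For comparison, here is a rough sketch of the two wordings quoted in this message, read literally. The function names and the sample string are made up for illustration, and treating TAB as whitespace alongside #x20, #xD, and #xA is an assumption taken from the mention of TAB above, not from either quoted text.

```python
import re

# Whitespace set: #x20, #xD, #xA as listed in the quoted draft text,
# plus TAB (#x9) as an assumption for this sketch.
WS = " \r\n\t"


def normalize_xml10(public_id: str) -> str:
    """XML 1.0 wording: every string of white space (one or more
    characters) becomes a single #x20; leading/trailing space is removed."""
    collapsed = re.sub(f"[{WS}]+", " ", public_id)
    return collapsed.strip(" ")


def normalize_draft_wording(public_id: str) -> str:
    """Literal reading of the quoted draft wording: trim the ends, then
    replace only sequences of two or more whitespace characters."""
    trimmed = public_id.strip(WS)
    return re.sub(f"[{WS}]{{2,}}", " ", trimmed)


# Hypothetical input: a public identifier with stray whitespace in it.
sample = "  -//OASIS//DTD\nDocBook  XML V4.1.2//EN\t "

print(repr(normalize_xml10(sample)))
print(repr(normalize_draft_wording(sample)))
```

Under the XML 1.0 wording the lone LF in the sample becomes a space. Under the literal "two or more" reading it is left as an LF, and only the doubled space is collapsed.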

paul


