OASIS Mailing List Archives

relax-ng message

Subject: Should empty elements allow whitespace?

Should a pattern

  <element name="foo"><empty/></element>

match an element

 <foo>  </foo>

?  At the moment, the TREX implementation and the TREX specification
are inconsistent on this point.  The specification does not allow it:
whitespace is ignored only in the rule for the "element" element, and
only around the matched element, whose content c1 is passed to the
pattern unchanged:

M[[<t:element> name-class pattern </t:element>]](a, c, e) if and only if 

+ a is {}, and
+ c consists of zero or more whitespace characters, followed by an
element <n,a1,c1>, followed by zero or more whitespace characters, and
+ C[[name-class]](n), and 
+ M[[pattern]](a1, c1, e) 

On the other hand, the implementation does allow this.
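
To make the specification's behaviour concrete, here is a minimal Python
sketch of the rule quoted above.  The representation is an assumption
made for illustration only: content is a list whose items are either
text strings or (name, attributes, content) tuples, and the name-class
is reduced to a plain name comparison.  The match_empty function also
assumes that <empty/> matches only an empty attribute set and empty
content; none of these names come from the TREX specification.

```python
WHITESPACE = " \t\r\n"  # the four XML 1.0 whitespace characters


def is_whitespace(item):
    """True if item is a text run containing only whitespace."""
    return isinstance(item, str) and item.strip(WHITESPACE) == ""


def match_empty(attrs, content):
    # Assumed rule for <empty/>: no attributes and no content at all.
    return attrs == {} and content == []


def match_element(name, inner, attrs, content):
    """Sketch of M[[<t:element> name-class pattern </t:element>]](a, c, e)."""
    # a is {}
    if attrs != {}:
        return False
    # c is whitespace, one element <n,a1,c1>, whitespace
    rest = [item for item in content if not is_whitespace(item)]
    if len(rest) != 1 or not isinstance(rest[0], tuple):
        return False
    n, a1, c1 = rest[0]
    # C[[name-class]](n), simplified here to a name comparison,
    # and M[[pattern]](a1, c1, e) with c1 passed through unchanged
    return n == name and inner(a1, c1)


foo_empty = [("foo", {}, [])]       # <foo></foo>
foo_spaces = [("foo", {}, ["  "])]  # <foo>  </foo>
```

Under these assumptions, match_element("foo", match_empty, {}, foo_empty)
succeeds, while the same call with foo_spaces fails, because the two
spaces inside <foo> reach match_empty untouched.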

SGML with the WWW amendment doesn't allow anything in an empty element (not
whitespace, not entity references, not comments).  The XML 1.0 spec
isn't very clear on this, but in the past the policy has been to
maintain compatibility with SGML with respect to validity.

Being compatible with SGML/XML in this area would imply that TREX would
not allow PIs, comments or entity references in an empty element.
I believe that would be fundamentally wrong: as a matter of principle,
validation behaviour should not be affected by comments or entity
structure.  (It would also make it impossible to implement using just
the standard SAX 2 interfaces, since these don't report internal entity
references.)  Since we can't be compatible here, I think we should just
do whatever we consider to be the right thing.

My feeling is that the general rule should be that if you have two
adjacent tags (start-tags or end-tags), then you should be able to add
whitespace between them without affecting validity.  Therefore my
preference would be to allow whitespace within empty elements.

I think this can be implemented in the spec by changing the last bullet
for "element" above to:
+ M[[pattern]](a1, strip(c1), e)

where strip(s) is defined to be

- the empty sequence, if s consists of nothing but whitespace
characters, and
- s, otherwise
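
The strip(s) definition above could be sketched in Python as follows.
This is only an illustration, assuming content is represented as a list
whose items are either text strings or element tuples; that
representation is hypothetical and is not part of the TREX
specification.

```python
WHITESPACE = " \t\r\n"  # the four XML 1.0 whitespace characters


def strip(s):
    """Return the empty sequence if s is whitespace-only, else s unchanged."""
    whitespace_only = all(
        isinstance(item, str) and item.strip(WHITESPACE) == ""
        for item in s
    )
    return [] if whitespace_only else s
```

With this sketch, the content of <foo>  </foo> is reduced to the empty
sequence before it reaches the pattern, so <empty/> can match it.
Content that holds any non-whitespace text or any child element is
returned as-is, including its whitespace.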

