OASIS Mailing List Archives

relax-ng message



Subject: Re: [relaxng-user] repeated data


On Thu, 2003-11-27 at 07:51, David Tolpin wrote:
> > David Tolpin scripsit:
> > 
> > > element foo { "a"+ }
> > > 
> > > is an invalid pattern. However, trang accepts it and converts into
> > > corresponding XML Syntax.
> > 
> > It is indeed invalid.  Trang is not guaranteed to catch errors in input, and
> > when the output is xsd, it may generate erroneous output as well.
> > It's a best-effort program.
> 
> Thank you, I realize now that this is because the grouping restrictions are
> specified for the simple form, and trang does not translate to the simple form.

The groupability restrictions are specified in terms of the simple form,
but I think you can implement them without actually doing the
translation, so long as you have all the definitions available.  Trang
ought to do this, but it's not high on my priority list since people
can easily use jing to check if they want.  Also, trang can sometimes
translate between RNG and RNC even when the schema is not correct,
e.g. because there are some undefined references.  This allows you to
translate a part of a schema separately, which is sometimes quite
useful.

> Interestingly, nXML does not check restrictions either,

nXML doesn't yet check any of the Section 7 restrictions. This is a
documented limitation.

> and successfully validates
> against this grammar (that is, it allows sequences of a, but not other strings).

I don't think so.

> Why is this restriction imposed at all?

The semantics of

  element foo { "a"+ }

are not very clear.  As the spec is currently written, "a" matches a
text node whose value is "a". An element

  <foo>aaa</foo>

has one child text node with value "aaa", not three text nodes each with
value "a".  Thus

  element foo { "a"+ }

would be exactly equivalent to

  element foo { "a" }
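
The text-node point can be checked with any DOM-style parser. What follows is
a minimal sketch, assuming Python's standard xml.dom.minidom as a stand-in for
the data model. It is not a RELAX NG validator, and the helper name
text_children is made up for illustration.

```python
from xml.dom.minidom import parseString

def text_children(xml):
    """Return the values of the text nodes that are direct children of the root."""
    root = parseString(xml).documentElement
    return [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]

# One text node with value "aaa", not three nodes with value "a".
print(text_children("<foo>aaa</foo>"))  # prints ['aaa']
```

Since there is only ever one text node here, repeating "a" has nothing extra to
match against.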

What the restriction stops you expressing is something like:

  element foo { "a"+ & element bar { empty }* }

which would match things like:

  <foo>a<bar/>a<bar/>a</foo>

Apart from being something that nobody in their right mind should want
to do, I think this would make it hard to avoid the pernicious mixed
content problems familiar from SGML (i.e. problems where white space
that you would expect not to affect validation does affect validation).
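
To make the white space concern concrete, here is a small sketch that only
lists the text nodes involved. It assumes Python's standard xml.dom.minidom
purely as a way to inspect the tree; it does no RELAX NG validation, and the
helper name and the indented sample document are hypothetical.

```python
from xml.dom.minidom import parseString

def text_children(xml):
    """Return the values of the text nodes that are direct children of the root."""
    root = parseString(xml).documentElement
    return [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]

compact = "<foo>a<bar/>a<bar/>a</foo>"
indented = "<foo>\n  a<bar/>\n  a<bar/>\n</foo>"

# Three text nodes, each with value "a".
print(text_children(compact))
# Indentation changes the node values and adds a white-space-only node.
print(text_children(indented))
```

Once text is interleaved with elements, indentation alone changes the text
nodes that a pattern such as "a"+ would have to be matched against.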

James