Subject: Re: [relax-ng] Proposal for a regex module, draft 0.1
Looks good. If we were to provide a built-in regex capability for RELAX NG, I think there are compelling advantages to doing it in this sort of way using the existing element set:

- ease of learning/use
- ease of implementation; I haven't verified this yet, but I strongly suspect that something like this would be relatively little extra work for a RELAX NG implementation, because a lot of the existing parsing/validation machinery for elements/attributes could be easily reused

However, doing it as an annotation loses one thing that I think is very important. If <regex> was a standard RELAX NG element (like <list>) then you could compose complex regexes using define/ref. This would be a massive win. Complex regexes in XML Schema are incredibly hard to understand, debug and maintain because of the lack of the define/ref mechanism. This is something we should seriously consider for RELAX NG 2.0.

----- Original Message -----
From: "John Cowan" <cowan@mercury.ccil.org>
To: <relax-ng@lists.oasis-open.org>
Sent: Friday, March 01, 2002 11:50 AM
Subject: [relax-ng] Proposal for a regex module, draft 0.1

> I ran with James's idea in the telcon today about expressing regexes
> via the existing element set. I propose the following schema
> for them. Note that this is an annotation rather than a change in RELAX NG.
>
> default namespace = "http://relaxng.org/ns/structure/1.0"
> namespace r = "http://relaxng.org/ns/structure/regex/1.0"
>
> start = element r:regex {foreign, regex}
>
> regex =
>   element oneOrMore | zeroOrMore | optional | choice | group
>     {foreign & regex+} |
>   element value {foreignAttribute*, string} |
>   anyChar
>
> anyChar =
>   element r:anyChar {
>     ((attribute from {string}, attribute to {string})? |
>     attribute charType {string})?,
>     foreign & element except {foreign & anyChar+}
>   }
>
> foreign = foreignAttribute*, foreignElement*
>
> where foreignAttribute and foreignElement are as in RELAX NG.
>
> The r:regex element can be a child of data (for regular expressions
> in content/attribute values) or of anyName/nsName (to specify a
> local-name-part via a regular expression).
>
> The r:anyChar element matches any character if there are no
> attributes. It matches any character between "from" and "to"
> (single characters) if these attributes are present. It matches
> any char in a named class such as "Ps" or "IsDevanagari" if the charType
> attribute is present. The except child element allows subtraction;
> for example, "[^A-Z]" is <anyChar><except><anyChar from="A" to="Z"/>
> </except></anyChar>.
>
> Comments?
>
> --
> John Cowan http://www.ccil.org/~cowan cowan@ccil.org
> To say that Bilbo's breath was taken away is no description at all. There
> are no words left to express his staggerment, since Men changed the language
> that they learned of elves in the days when all the world was wonderful.
> --_The Hobbit_
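
To make the semantics described above concrete, here is a rough sketch of how a processor might translate an r:regex element tree into a conventional pattern. It is only an illustration under several assumptions: the function names (to_pattern, any_char) are hypothetical, namespaces are ignored by matching on local names, foreign elements are simply skipped, "except" is rendered as a negative lookahead, and charType is left unsupported because Python's standard re module has no named Unicode classes.

```python
import re
import xml.etree.ElementTree as ET

# Element names the sketch understands; anything else is treated as foreign.
KNOWN = {"regex", "group", "choice", "oneOrMore", "zeroOrMore",
         "optional", "value", "anyChar"}


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def any_char(el):
    """Translate one anyChar element into a single-character pattern."""
    if "charType" in el.attrib:
        # Assumption: named classes such as "Ps" are out of scope here.
        raise NotImplementedError("charType is not supported by this sketch")
    if "from" in el.attrib and "to" in el.attrib:
        base = "[%s-%s]" % (re.escape(el.attrib["from"]),
                            re.escape(el.attrib["to"]))
    else:
        base = r"[\s\S]"  # no attributes: any character at all
    # Collect the anyChar children of every except child (subtraction).
    excluded = [any_char(c)
                for ex in el if local(ex.tag) == "except"
                for c in ex if local(c.tag) == "anyChar"]
    if excluded:
        return "(?!%s)%s" % ("|".join(excluded), base)
    return base


def to_pattern(el):
    """Translate a regex element (and its descendants) into a pattern."""
    name = local(el.tag)
    if name == "value":
        return re.escape(el.text or "")
    if name == "anyChar":
        return any_char(el)
    parts = [to_pattern(c) for c in el if local(c.tag) in KNOWN]
    body = "|".join(parts) if name == "choice" else "".join(parts)
    suffix = {"oneOrMore": "+", "zeroOrMore": "*", "optional": "?"}.get(name, "")
    return "(?:%s)%s" % (body, suffix)


# The "[^A-Z]" example from the proposal, repeated one or more times.
doc = ('<regex><oneOrMore><anyChar><except><anyChar from="A" to="Z"/>'
       '</except></anyChar></oneOrMore></regex>')
not_upper = to_pattern(ET.fromstring(doc))
```

With this sketch, re.fullmatch(not_upper, "abc 123") succeeds while re.fullmatch(not_upper, "abC") does not. A real implementation would more likely match directly against the element tree rather than go through another regex dialect.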