Subject: Re: [xacml] regex in the spec
seth proctor wrote:

> Ah, you're talking specifically about the new version number string.
> Since I added this string, I guess I can comment on it :) In this case,
> I'm not using Perl or POSIX, but the format specifically defined in
> XMLSchema. There is no ambiguity about what form this is in. Patterns in
> XMLSchema must use the XMLSchema regular expression format.

ok, i see what you are using. the problem i have here is that xml schema
uses a *subset* of an externally referenced regular expression definition
(Unicode Regular Expression Guidelines, Level 1) that does not meet the
needs of a general regular expression mechanism (which is why xquery has
provided for additions to the xml schema regex syntax to achieve its
goals).

> Actually, "1." is not valid by the current expression, since the pattern
> says that a version string must end in a digit.

ah, you are right. i stand corrected.

> As for the specifics of
> the pattern, I don't care too much about how we form it. I used the
> current string because it's the most common way to phrase something like
> this in XMLSchema, therefore I believe it will be the most accessible.

for you maybe ;o)

the xml schema spec provides examples and definitions that use the syntax
i proposed (e.g. the sections on Lexical & Canonical Representations and
the concept of patterns themselves). this is not to suggest that we must
do this (the same spec has an example using the more advanced pattern
string notation), but that there is no mention of a preference one way or
the other.

i also maintain that the expanded numeric notation ('[0-9]+' vs '\d')
works on ANY system that supports regular expressions. again, a
preferential position rather than strict adherence to the specification
referenced, but not without merit; an optimum solution should allow for
either.

> The idea here is _not_ just to provide a direct match. You should look
> at the text I added to go with this that explains exactly what this
> pattern is used for. This is a very simple wildcard that only lets you
> form a few patterns. Here we wouldn't want to use a full-featured regexp
> language, since we only want people to say one of a few things:
>
>   1.2.4
>   1.+
>   1.*.4
>   1.2.*
>
> which all match the string "1.2.4.". I know of no existing language we
> could reference that only provides these limited options.

i don't understand this. the syntax you provided for version number:

    (\d+\.)*\d+

allows the following:

    1
    1.2
    1.2.3.4.5.6.7.8
    (num.num.num...)

why not rely simply upon the same regular expression reference that was
used to construct it? what are we protecting people from? if you can only
write version information in the form:

    + +.+ +.+ ...

who cares how verbose the matching expression is (so long as it conforms
to some common general definition)?

> The pattern strings are written in XMLSchema, and they express
> very specific, very (intentionally) limited meaning that I provided
> clear text to explain.

again, i suggest that this imposes restrictions that are not tangibly
beneficial. while you may find the 'intentional' limitation comforting,
it may require that others perform an additional operation to match this
format, even though those systems write and read regular expressions that
are 100% compliant with the xml schema specification you are citing.

> Do you have something in mind that you think would be more
> appropriate here?

two things:

1. i would prefer that we directly reference the regular expression
   semantics that the current version of xml schema *refers* to. this
   would add a reference into our spec:

       Unicode Regular Expression Guidelines
       Mark Davis. Unicode Regular Expression Guidelines, 1988.
       Available at: http://www.unicode.org/unicode/reports/tr18/

   OR reference the xquery regular expression [xml schema] superset so as
   to be consistent with our functions:

       Regular Expression Syntax
       W3C. XQuery 1.0 and XPath 2.0 Functions and Operators
       http://www.w3.org/TR/xpath-functions/#regex-syntax

   either would require a simple statement on the use of regular
   expressions in *general* (i will write it if the idea is acceptable).
   given the high level of dependency upon xquery in our spec, perhaps
   the latter, while limited--and fluid based on its dependence upon xml
   schema, which itself depends upon the reference above--would offer the
   better solution.

2. remove restrictions on the syntax of the matching patterns for version
   comparisons. the definition of the version number provides a
   consistent numeric structure and how an implementer decides to check
   that it matches should only be constrained by the format of the data
   itself (and the use of explicit regular expression semantics).

i suspect that we will be seeing the use of regular expressions more
frequently as the spec matures and i believe that version will likely end
up setting a precedent for how this is done in the future. i would prefer
that we not get in the habit of further subsetting the syntax of the
matching patterns for reasons of flexibility and effort.

> Since no one commented when I made the original suggestion,
> I assumed that people approved.

sorry, review periods are like that :o)

b
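
A minimal sketch in Python of the two points argued above, under stated assumptions. First, it checks that the shorthand and expanded spellings of the version-number pattern accept the same strings. Second, it shows that the limited wildcard forms can be rewritten as ordinary regular expressions. The helper names (`is_version`, `wildcard_to_regex`) are hypothetical. The wildcard semantics used here (`*` stands for exactly one number, `+` for one or more dot-separated numbers) are an assumption, since the thread does not spell them out.

```python
import re

# XML Schema patterns are implicitly anchored, so fullmatch() is used
# throughout. re.ASCII limits \d to 0-9 so the two spellings are comparable.
SHORTHAND = re.compile(r"(\d+\.)*\d+", re.ASCII)
EXPANDED = re.compile(r"([0-9]+\.)*[0-9]+")


def is_version(text, pattern=EXPANDED):
    """Return True when the whole string is a dotted version number."""
    return pattern.fullmatch(text) is not None


def wildcard_to_regex(match_pattern):
    """Hypothetical helper: rewrite a limited wildcard as a plain regex.

    Assumed semantics (not stated in the thread):
      *  one number
      +  one or more dot-separated numbers
    """
    parts = []
    for component in match_pattern.split("."):
        if component == "*":
            parts.append(r"[0-9]+")
        elif component == "+":
            parts.append(r"[0-9]+(\.[0-9]+)*")
        elif re.fullmatch(r"[0-9]+", component):
            parts.append(component)
        else:
            raise ValueError(f"unsupported component: {component!r}")
    # Components are joined by a literal dot.
    return re.compile(r"\.".join(parts))


SAMPLES = ["1", "1.2", "1.2.3.4.5.6.7.8", "1.", ".1", "1..2", ""]
MATCH_PATTERNS = ["1.2.4", "1.+", "1.*.4", "1.2.*"]

for sample in SAMPLES:
    print(repr(sample), is_version(sample, SHORTHAND), is_version(sample, EXPANDED))

for match_pattern in MATCH_PATTERNS:
    regex = wildcard_to_regex(match_pattern)
    print(match_pattern, "->", regex.pattern, bool(regex.fullmatch("1.2.4")))
```

Under these assumptions, each wildcard form reduces to an expression in the same syntax as the version-number pattern itself, which is the basis for suggesting that a general regular expression reference could serve both purposes.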