OASIS Mailing List Archives

 


xacml message

Subject: Re: [xacml] regex in the spec


seth proctor wrote:
> Ah, you're talking specifically about the new version number string. 
> Since I added this string, I guess I can comment on it :) In this case, 
> I'm not using Perl or POSIX, but the format specifically defined in 
> XMLSchema. There is no ambiguity about what form this is in. Patterns in 
> XMLSchema must use the XMLSchema regular expression format.

ok, i see what you are using. the problem i have here is that xml schema uses a 
*subset* of an externally referenced regular expression definition (Unicode 
Regular Expression Guidelines, Level 1) that does not meet the needs of a 
general regular expression mechanism (which is why xquery has provided for 
additions to the xml schema regex syntax to achieve its goals).

> Actually, "1." is not valid by the current expression, since the pattern 
>  says that a version string must end in a digit. 

ah, you are right. i stand corrected.

> As for the specifics of 
> the pattern, I don't care too much about how we form it. I used the 
> current string because it's the most common way to phrase something like 
> this in XMLSchema, therefore I believe it will be the most accessible.

for you maybe ;o) the xml schema spec provides examples and definitions that use 
the syntax i proposed (e.g. the sections on Lexical & Canonical Representations 
and the concept of patterns themselves). this is not to suggest that we must do 
this (the same spec has an example using the more advanced pattern string 
notation), but that there is no mention of a preference one way or the other.

i also maintain that the expanded numeric notation ('[0-9]+' vs '\d+') works on 
ANY system that supports regular expressions. again, a preferential position 
rather than strict adherence to the specification referenced, but not without 
merit; an optimum solution should allow for either.
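
a small sketch of the two notations side by side, using python's `re` module purely as a stand-in engine (an assumption; the spec does not name any implementation). one caveat worth knowing: in xml schema '\d' is defined as the unicode category Nd, so the two forms agree on ascii digits but '\d' also accepts decimal digits from other scripts, and python's `re` behaves the same way for str patterns:

```python
import re

# the two spellings under discussion
ASCII_DIGITS = re.compile(r"[0-9]+")  # expanded notation, ascii digits only
CLASS_DIGITS = re.compile(r"\d+")     # shorthand, any unicode decimal digit

# an ascii number and the same idea written in arabic-indic digits
samples = ["2004", "\u0661\u0662\u0663"]

for s in samples:
    ascii_ok = ASCII_DIGITS.fullmatch(s) is not None
    class_ok = CLASS_DIGITS.fullmatch(s) is not None
    print(repr(s), "[0-9]+:", ascii_ok, r"\d+:", class_ok)
```

for plain ascii version numbers the two are interchangeable, which is the case that matters here.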

> The idea here is _not_ just to provide a direct match. You should look 
> at the text I added to go with this that explains exactly what this 
> pattern is used for. This is a very simple wildcard that only lets you 
> form a few patterns. Here we wouldn't want to use a full-featured regexp 
> language, since we only want people to say one of a few things:
> 
>   1.2.4
>   1.+
>   1.*.4
>   1.2.*

> which all match the string "1.2.4". I know of no existing language we 
> could reference that only provides these limited options.
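
for reference, a rough sketch of how the four wildcard forms quoted above could be translated into ordinary regular expressions. the semantics are assumed, not taken from the spec text: '*' is read as "exactly one numeric component" and '+' as "one or more numeric components". `wildcard_to_regex` is a hypothetical helper name, and python's `re` module is only a convenient stand-in engine:

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a version wildcard such as '1.*.4' into a compiled regex.

    Assumed semantics (not confirmed by the spec):
      '*' matches exactly one numeric component
      '+' matches one or more dot-separated numeric components
    """
    parts = []
    for component in pattern.split("."):
        if component == "*":
            parts.append(r"\d+")
        elif component == "+":
            parts.append(r"\d+(?:\.\d+)*")
        elif re.fullmatch(r"[0-9]+", component):
            parts.append(component)  # a literal number
        else:
            raise ValueError(f"unsupported component: {component!r}")
    # components are joined by a literal dot
    return re.compile(r"\.".join(parts))

# the four patterns from the quoted message, checked against "1.2.4"
for p in ["1.2.4", "1.+", "1.*.4", "1.2.*"]:
    matched = wildcard_to_regex(p).fullmatch("1.2.4") is not None
    print(p, "->", matched)
```

under those assumptions each wildcard is just shorthand for a regular expression that any conforming engine could evaluate directly.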

i don't understand this. the syntax you provided for version number:

  (\d+\.)*\d+

allows the following:

   1
   1.2
   1.2.3.4.5.6.7.8
   (num.num.num...)
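
a quick check of what the version number pattern accepts and rejects, sketched with python's `re` module as a stand-in engine (an assumption; any engine that supports this syntax would do). xml schema patterns are implicitly anchored to the whole value, so `fullmatch` is used to mirror that; `is_valid_version` is a hypothetical helper name:

```python
import re

# the version number pattern from the draft
VERSION = re.compile(r"(\d+\.)*\d+")

def is_valid_version(value: str) -> bool:
    """Return True if the whole string is a dotted sequence of numbers."""
    return VERSION.fullmatch(value) is not None

for candidate in ["1", "1.2", "1.2.3.4.5.6.7.8", "1.", ".1", "1..2"]:
    print(repr(candidate), is_valid_version(candidate))
```

the first three candidates are accepted; "1.", ".1" and "1..2" are rejected because the pattern must end in a digit and every dot must follow a number.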

why not rely simply upon the same regular expression reference that was used to 
construct it? what are we protecting people from? if you can only write version 
information in the form:

+
+.+
+.+
...

who cares how verbose the matching expression is (so long as it conforms to some 
common general definition)?

> The pattern strings are written in XMLSchema, and they express 
> very specific, very (intentionally) limited meaning that I provided 
> clear text to explain.

again, i suggest that this imposes restrictions that are not tangibly 
beneficial. while you may find the 'intentional' limitation comforting, it may 
require that others perform an additional operation to match this format, even 
though those systems write and read regular expressions that are 100% compliant 
with the xml schema specification you are citing.

 > Do you have something in mind that you think would be more
 > appropriate here?

two things:

1. i would prefer that we directly reference the regular expression semantics 
that the current version of xml schema *refers* to. this would add a reference 
into our spec:

Unicode Regular Expression Guidelines
   Mark Davis. Unicode Regular Expression Guidelines, 1988. Available at:
   http://www.unicode.org/unicode/reports/tr18/

OR

reference the xquery regular expression [xml schema] superset so as to be 
consistent with our functions:

Regular Expression Syntax
   W3C. XQuery 1.0 and XPath 2.0 Functions and Operators
   http://www.w3.org/TR/xpath-functions/#regex-syntax

either would require a simple statement on the use of regular expressions in 
*general* (i will write it if the idea is acceptable). given the high level of 
dependency upon xquery in our spec, perhaps the latter, while limited--and fluid 
based on its dependence upon xml schema, which itself depends upon the reference 
above--would offer the better solution.

2. remove restrictions on the syntax of the matching patterns for version 
comparisons. the definition of the version number provides a consistent numeric 
structure and how an implementer decides to check that it matches should only be 
constrained by the format of the data itself (and the use of explicit regular 
expression semantics).

i suspect that we will be seeing the use of regular expressions more frequently 
as the spec matures and i believe that version will likely end up setting a 
precedent for how this is done in the future. i would prefer that we not get in 
the habit of further subsetting the syntax of the matching patterns for reasons 
of flexibility and effort.

 > Since no one commented when I made the original suggestion,
 > I assumed that people approved.

sorry, review periods are like that :o)

b

