xacml message


Subject: RE: [xacml] URI-match function proposal

Bill - You have demonstrated amply that regex can express all the variations
one could possibly want to express in a URI sub-tree match.

The concern I had was that regex allows one to express much more than one
would ever want to express in practice.  For instance, one could insert any
number of wild-card sequences (beginning, middle, end, any number in the
middle, any number in the middle with one at the beginning and/or one at the
end).  With the exception of the first three possibilities, the others are
of questionable practical interest.  Yet, they would be legal expressions,
and someone might use them.  

In most circumstances, there may be no adverse consequence.  But, for my
use-case, one would have to create SQL queries from such expressions that
would retrieve policies whose targets were either more general or more
specific.  It isn't clear to me that this is practical in general.

Perhaps, the approach we should take is to relegate this use-case to another
profile; one that might place restrictions on the regex expression (for
instance, only allowing one wild-card sequence in an expression).  In the
meantime, the core spec can define uri-subtree-match to use regex.

All the best.  Tim.

-----Original Message-----
From: Bill Parducci [mailto:bparducci@gluecode.com] 
Sent: Monday, July 12, 2004 4:37 PM
To: 'xacml'
Subject: Re: [xacml] URI-match function proposal

Tim Moses wrote:

> 1. There seems to be nothing to distinguish your two <ResourceMatch> 
> elements, although they are to be treated entirely differently.  How 
> will the processor know that the first match is for the host and the 
> second one is for the resource?  Shouldn't they have different 
> <ResourceAttributeDesignator> elements?  Also, the first match could 
> include the initial portion of the local path, could it not?

first, let me apologize for the sloppiness of the xml, i just tossed it out
there to prop up the regex. unique <ResourceAttributeDesignator> elements would
do it i guess. i will defer to the xml guratti of the TC for guidance here.

second, the first match is intended to represent only the host because things
like port number notation would cause problems (for optimization if nothing
else, there are a lot of ports out there) and are much more easily handled by
the second (regex) match, as i will attempt to demonstrate below.

> 2. Remember, in my use-case, I need to create multiple queries for 
> target patterns that are "more general" than the PDP's topic.  For 
> example, if the "topic" is "www.example.com/resources" then I need to 
> generate several
> queries:
> one that exactly matches "www.example.com/resources",
> one that exactly matches "www.example.com/",
> one that exactly matches "example.com/" and
> one that exactly matches ".com/".

> Otherwise, I won't find policies whose scopes "include" the topic.  Or 
> were you proposing a different approach?  It seems to me that this 
> would be rather difficult if the full freedom of reg-exp is allowed.
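
The "more general" lookups Tim lists can be generated mechanically from a topic string. What follows is only a minimal sketch in Python, not anything proposed on the list: the function name `more_general_scopes` is hypothetical, and the rule of writing the bare top-level domain with a leading dot is an assumption made solely to reproduce the list in the quoted message.

```python
def more_general_scopes(topic):
    """Return the topic followed by each broader scope, most specific first."""
    host, _, path = topic.partition("/")
    scopes = [topic]
    if path:
        # The host on its own covers every resource beneath it.
        scopes.append(host + "/")
    labels = host.split(".")
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if i == len(labels) - 1:
            # Assumption: the quoted list writes the last scope as ".com/".
            suffix = "." + suffix
        scopes.append(suffix + "/")
    return scopes


print(more_general_scopes("www.example.com/resources"))
# ['www.example.com/resources', 'www.example.com/', 'example.com/', '.com/']
```

Each returned string would then drive one exact-match query, which is the step that becomes hard to automate once arbitrary regex is allowed in policy targets.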

hey, you left out:

one that exactly matches "http://"

...and to the earlier point on the list:

  one that exactly matches "www.example.com:80/resources",
  one that exactly matches "www.example.com:80/",
  one that exactly matches "example.com:80/",
  one that exactly matches ".com:80/" and
  one that exactly matches ":80/".

anyway, let me work this backwards from how it was asked because i think it 
might make more sense.

> In addition, I have to locate policies that apply to a "more specific"
> scope, such as "www.example.com/resources/finance"

this is the easy one: unless you specify "$" (end of input), regex will match
any string that contains the expression. for example: "abcd" matches "abcd",
"abcde" and "abcdef".

however, "abcd$" only matches "abcd" and not "abcde" nor "abcdef", so you can
decide to not include "more specific" policies if that is what you feel is
needed. FWIW the same holds true for "^", but applied to the beginning of the
input: "^abc" matches "abcdef" but not "defabc". (this is why i think we want
to go beyond the use of regular expressions in XMLSchema to include {at least}
these two operators--which is what i believe XPath/XQuery did).
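
The anchoring behaviour described above can be checked directly. This is a small sketch that assumes Python's `re.search` as a stand-in for an unanchored regex engine; the helper name `matches` is hypothetical.

```python
import re


def matches(pattern, text):
    """True if the pattern is found anywhere in text (unanchored search)."""
    return re.search(pattern, text) is not None


# Without "$" the pattern also matches longer ("more specific") strings.
print(matches("abcd", "abcd"), matches("abcd", "abcdef"))      # True True

# With "$" only the exact ending matches.
print(matches("abcd$", "abcd"), matches("abcd$", "abcdef"))    # True False

# "^" does the same job at the beginning of the input.
print(matches("^abc", "abcdef"), matches("^abc", "defabc"))    # True False
```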

so the expression used in the previous note will match all "children" of the
given host/resource, transforming your queries above into regex expressions
(which can be used by something like REGEXP_LIKE in oracle):

(matches http://www.example.com/resources[ANYTHING] or with port 80 variation)

(matches http://www.example.com[ANYTHING] or with port 80 variation)

(matches http://example.com[ANYTHING] or with port 80 variation)

(matches http://[ANYTHING] or with port 80 variation)

(matches [ANYTHING].com or with port 80 variation)

in other words, you are performing the exact same type of lookup that you would
otherwise perform with LIKE conditions in your SQL statements. however, by
splitting the host and resource components in the matching function you have
the ability to optimize how host information is handled (e.g. convert all to
lower/upper/etc.... with the coordination of the policy validation process),
making the queries more compact if that is important.
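
To make the LIKE comparison concrete, here is a hedged sketch using Python's built-in `sqlite3` module rather than oracle. SQLite has no regex engine of its own, so the sketch registers a user function named `REGEXP` (the hook SQLite calls for `X REGEXP Y`). The `policies` table, its columns and the sample rows are all invented for illustration.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's `value REGEXP pattern` operator with Python's re.search."""
    return int(value is not None and re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE policies (id TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO policies VALUES (?, ?)",
    [
        ("p1", "http://www.example.com/resources"),
        ("p2", "http://www.example.com/resources/finance"),
        ("p3", "http://www.example.com:80/resources/hr"),
        ("p4", "http://www.example.com/other"),
        ("p5", "http://www.example.org/resources"),
    ],
)

# Prefix lookup with LIKE: misses the explicit-port variant.
like_ids = [row[0] for row in conn.execute(
    "SELECT id FROM policies WHERE target LIKE ? ORDER BY id",
    ("http://www.example.com/resources%",))]

# Same lookup as one anchored regex with an optional port.
pattern = r"^http://www\.example\.com(:80)?/resources"
regex_ids = [row[0] for row in conn.execute(
    "SELECT id FROM policies WHERE target REGEXP ? ORDER BY id",
    (pattern,))]

print(like_ids)   # ['p1', 'p2']
print(regex_ids)  # ['p1', 'p2', 'p3']
```

Covering the port variant with LIKE would take a second condition, whereas the regex folds it into a single expression.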

using the lower case option would shrink the necessary regex to express the
"ugliest" expression to:  "^http:\/\/www\.example\.com(:80)?\/resources".
if you optimize your host name storage you can do some pretty cool stuff
compactly: "^(http|ftp):\/\/www\.example\.com(:80)?\/" would let you specify
discrete protocols for all resources on a host in a single request. it doesn't
take too much imagination to see you can programmatically chunk the
components of a URL to provide compact expression generation automagically
("ftp|http", "www|web", "foo|forbarindustriesinc", "com|net|biz|info", etc.)
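
As a rough illustration of generating such expressions from URL chunks, the sketch below assembles a prefix pattern from a list of schemes, a host, an optional port and a path. The function `url_prefix_pattern` and its parameters are hypothetical, and it assumes Python's `re` syntax, where "/" needs no backslash, so the output differs cosmetically from the escaped form quoted above.

```python
import re


def url_prefix_pattern(schemes, host, port=None, path=""):
    """Build an anchored regex matching URLs under the given scheme(s)/host/path."""
    scheme_part = "(" + "|".join(re.escape(s) for s in schemes) + ")"
    # The port is optional in the match, mirroring the "(:80)?" idiom.
    port_part = "(:%d)?" % port if port is not None else ""
    # Lower-casing the host assumes stored targets are normalised the same way.
    return "^" + scheme_part + "://" + re.escape(host.lower()) + port_part + re.escape(path)


pattern = url_prefix_pattern(["http", "ftp"], "WWW.Example.com", port=80, path="/")
print(pattern)

for url in ("http://www.example.com/index.html",
            "ftp://www.example.com:80/pub/file",
            "https://www.example.com/"):
    print(url, bool(re.search(pattern, url)))
```

The first two URLs match and the third does not, since "https" is not one of the listed schemes.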

regex can be ugly, verbose and cryptic, but it is also incredibly powerful, well
defined, works with horribly messy things like unicode while lending itself to
programmatic manipulation. it is also supported by every programming language i
know of (ok, assembler probably doesn't have regex expressions) when using the
POSIX family of expressions. regex is your friend :-D

am i on the right track to answering your question, or am i out in the weeds?

