
xacml message



Subject: AW: [xacml] Re: XACML's limitations in the access control for XML documents use case - AW: AW: [xacml] CD-1 issue #11: strictness of xpath definition


Hi Erik, Rich, all

below you find a summary of the discussion on "xpath vs. reg-exp-match for
resource-id evaluations" so far (from my point of view) and some further
comments. I hope that helps to clarify where we are.

Baseline:
An individual decision request's resource-id attribute is an XPath
expression that points to exactly one node in the xml resource (i.e. a
multiple and hierarchically organised resource)

Open Question:
How to define the part of a rule that matches this resource-id attribute.

Possible approaches:
Approach 1: 
Use the XPath-node-XXX functions
(e.g. xpath-node-equal(resource-id, /objects/book))

Approach 2: 
Use the reg-exp-string-match function
(e.g. reg-exp-string-match(resource-id, /objects\[\d+\]/book\[\d+\]))
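To make Approach 2 concrete, here is a minimal Python sketch. It assumes that resource-id values are positional path strings such as /objects[1]/book[2], and it uses Python's re.fullmatch as a stand-in for the XACML regular expression match function (whose exact anchoring rules may differ). The helper name rule_matches is hypothetical.

```python
import re

# Rule-side pattern from the example above: any <book> directly under <objects>.
BOOK_RULE = r"/objects\[\d+\]/book\[\d+\]"

def rule_matches(resource_id: str, pattern: str = BOOK_RULE) -> bool:
    """Return True if the whole resource-id string matches the rule pattern."""
    return re.fullmatch(pattern, resource_id) is not None

print(rule_matches("/objects[1]/book[2]"))           # a book node
print(rule_matches("/objects[1]/book[2]/title[1]"))  # a title node below a book
```

Only the two strings are needed for the decision; the xml resource itself is never touched.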

Evaluation:
- both approaches can offer the same functionality

I hope so far we all agree. 

Now let's come to the pros and cons of each approach:

Let's start with Approach 2 (i.e. the reg-exp-string-match function based
option):

A reason why the reg-exp-string-match approach could be better:

It is very simple and, most importantly, it can be evaluated very fast and
on the two arguments alone: the string representing one node and a
corresponding regular expression. There is no need to go through the xml
resource itself during evaluation.
In contrast, if you use the XPath approach, the two XPath arguments of the
xpath node-match functions have to be evaluated against the xml resource
(e.g. a DOM representation in memory). This disadvantage becomes even more
serious if you take into account that your multiple resource (e.g. an xml
doc) consists of n individual resources/nodes, and thus you have to evaluate
n xpath node-match functions. In short, the processing overhead of the XPath
approach is incurred n times.
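For comparison, a rough Python sketch of what a node-match evaluation involves. It assumes the xml resource is held in memory as an ElementTree and approximates xpath-node-equal as "both paths select at least one common node". The helpers select and node_equal are hypothetical, and ElementTree supports only a small XPath subset, so this is an illustration rather than a real XACML function.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<objects>"
    "<book><title>xxx</title></book>"
    "<book><title>yyy</title></book>"
    "</objects>"
)

def select(root, path):
    """Evaluate a simple absolute path such as /objects/book[2] against root."""
    prefix = "/" + root.tag
    if path == prefix:
        return [root]
    if not path.startswith(prefix + "/"):
        return []
    # ElementTree only accepts relative paths on an element.
    return root.findall("." + path[len(prefix):])

def node_equal(root, resource_id, rule_path):
    """True if the two paths select at least one common node (by identity)."""
    rule_nodes = select(root, rule_path)
    return any(a is b for a in select(root, resource_id) for b in rule_nodes)

print(node_equal(DOC, "/objects/book[2]", "/objects/book"))
print(node_equal(DOC, "/objects/book[2]", "/objects/book/title"))
```

Both arguments are evaluated against the document on every call, which is the per-node cost described above.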

On the other hand, the disadvantage of the reg-exp-string-match approach is
that it doesn't deal explicitly with the namespace definitions behind the
prefixes. Thus, in the following situation something unintended could
occur:

- Two xml schemas, that describe the xml resource, use the same
elementNodeName for the complexTypes (e.g. <book>).
- Further the schema authors use exactly the same namespace prefixes but the
namespaces bound to them are different (e.g. <foo:book> where foo is defined
xmlns:foo="http://www.AAA.org"; vs. xmlns:foo="http://www.BBB.org";)

In this situation a rule defined for a <foo:book> element (more
precisely: for <http://www.AAA.org:book> elements) will also, and wrongly,
match <http://www.BBB.org:book> elements.
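The collision can be reproduced with a small Python sketch. The two documents, the prefix-based resource-id string and the rule pattern are made up for illustration; ElementTree is used only to show that the parsed element names really differ.

```python
import re
import xml.etree.ElementTree as ET

DOC_A = '<foo:objects xmlns:foo="http://www.AAA.org"><foo:book/></foo:objects>'
DOC_B = '<foo:objects xmlns:foo="http://www.BBB.org"><foo:book/></foo:objects>'

# Prefix-based resource-id of the first book: the same string for both documents.
RESOURCE_ID = "/foo:objects[1]/foo:book[1]"
# Rule pattern that was written with the AAA namespace in mind.
RULE = r"/foo:objects\[\d+\]/foo:book\[\d+\]"

def book_tag(xml_text):
    """Return the namespace-qualified tag of the first child element."""
    return ET.fromstring(xml_text)[0].tag

print(book_tag(DOC_A))  # {http://www.AAA.org}book
print(book_tag(DOC_B))  # {http://www.BBB.org}book
print(re.fullmatch(RULE, RESOURCE_ID) is not None)  # True for either document
```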

My conclusion for approach 2:

Pros:
- easy and very fast; if the situation that causes the erroneous behaviour
can be excluded, this approach is a very good way to go.

If the situation responsible for the error cannot be excluded (which I
consider rather unlikely), the following "fixes/solutions" could be applied:

a) the resource-id match part of rules/policies must be defined in the
following way: reg-exp-string-match(resource-id,
/objects/http://www.AAA.org:book\[\d+\])
Thus the PDP/PEP has to explicitly replace the prefixes with the namespaces
bound to them when generating the resource-id attributes for the
individual decision requests.

b) you use a special reg-exp-On-XPath-strings function that is namespace
aware and substitutes the prefixes accordingly before doing the usual
regular expression matching.

c) if you want to follow neither a) nor b), then the XPath-node-match
function based approach is your last alternative.
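A minimal Python sketch of the prefix substitution behind a) and b). It assumes that the prefix-to-namespace bindings are available as a dictionary and writes expanded names in the {namespace}name form rather than the namespace:name form used in a); the helpers expand_prefixes and book_rule are hypothetical.

```python
import re

def expand_prefixes(path, nsmap):
    """Replace each 'prefix:' at the start of a path step with '{namespace-uri}'."""
    return re.sub(
        r"(?<=/)([A-Za-z_][\w.-]*):",
        lambda m: "{" + nsmap[m.group(1)] + "}",
        path,
    )

def book_rule(namespace):
    """Build a pattern that matches books in one specific namespace only."""
    ns = re.escape("{" + namespace + "}")
    return "/" + ns + r"objects\[\d+\]/" + ns + r"book\[\d+\]"

RULE_AAA = book_rule("http://www.AAA.org")
id_a = expand_prefixes("/foo:objects[1]/foo:book[1]", {"foo": "http://www.AAA.org"})
id_b = expand_prefixes("/foo:objects[1]/foo:book[1]", {"foo": "http://www.BBB.org"})

print(id_a)
print(re.fullmatch(RULE_AAA, id_a) is not None)  # AAA book: matches
print(re.fullmatch(RULE_AAA, id_b) is not None)  # BBB book: no longer matches
```

With the namespace spelled out in both the resource-id and the rule, identical prefixes bound to different namespaces no longer collide.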

"Cons":
- to ensure interoperability, a normal form plus standardised guidelines on
how to deal with the namespace problem have to be specified.


Now let's analyse approach 1 (the XPath based approach):

Advantages:
- the namespace problem is avoided
- no "normal-form" for the individual resource-id attribute values has to be
defined

Disadvantages:
- a PDP implementation could face performance problems. 


Below are some comments on the reasons you gave in your last mail for why an
XPath approach could be better.
> 
> 1. XPath contains many functions and other capabilities, which might not
> be as easily available in the URI based approach.

I don't see why this is an issue in the resource-id use case, which is the
heart of the multiple resource profile. From my point of view the only
problem here is how to define the part of a rule that matches the
resource-id attribute. Maybe an example would help me to understand your
point.


> 
> 2. The TC would avoid the effort to define the URI approach. We would
> need to improve the xpath approach instead, but I suspect that the
> effort is smaller since we can reuse so much from xpath, compared with a
> wholly new URI based approach.

What do you mean by a wholly new URI based approach? I think we just have to
define what the resource-id values have to look like and agree on a solution
for the namespace problem (see a) and b) above for first ideas).
Further, do you already have a rough idea of how to improve the xpath
approach?

> 
> 3. It is likely that an XML resource is already available in XML form,
> so an xpath implementation can be applied to it directly, while the URI
> approach requires a transformation, which could degrade performance.

The transformation you mention amounts to going through the xml tree and
generating the "URIs" (i.e. xpath expressions where each matches exactly one
node). Once the requests are generated, finding the matching rules is
reduced to the evaluation of reg-exp-string-match function calls.
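Such a transformation could be sketched as follows in Python, assuming an in-memory ElementTree and positional paths of the form /objects[1]/book[2] as the normal form. The function name node_paths is hypothetical, and attributes, text nodes and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def node_paths(root):
    """Yield one positional path per element, in document order."""
    def walk(elem, path):
        yield path
        seen = Counter()  # counts siblings per tag to build the [n] index
        for child in elem:
            seen[child.tag] += 1
            yield from walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")
    yield from walk(root, f"/{root.tag}[1]")

doc = ET.fromstring(
    "<objects>"
    "<book><title>xxx</title></book>"
    "<book><title>yyy</title></book>"
    "</objects>"
)
for p in node_paths(doc):
    print(p)
```

Each generated string identifies exactly one node and can serve as the resource-id of one individual decision request.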

What happens in the XPath case when you derive the individual decision
requests and try to find the matching rules?
According to the multiple resource profile of XACML 2.0, the resource-id
attributes of the individual requests have to identify exactly one node.
Thus you will have to generate the "URIs" (i.e. xpath expressions that match
exactly one node) too. Additionally, when finding the matching rules, all the
(I would argue more expensive) node-match function calls have to be made.


As I have not spent much thought on how an xpath based implementation of
the multiple resource profile could be optimized, it would be very helpful
if somebody more knowledgeable in this area could comment on this and
provide some details and insights.

> 
> Note that it is not true that the whole XML has to be repeated for each
> resource since multiple <Attributes> elements are not required with the
> xpath approach, and with XACML 3.0 it is possible to reuse the same
> <Content> document for all the multiple queries.
>

Here, I am not sure what you mean. Are you saying that it is not
necessary to explicitly generate valid individual decision requests, each of
which has to contain the <Content> element with the resource under it?


Conclusion:
- we should try to collect as many pros and cons as possible from different
points of view
- given the arguments mentioned so far, I would propose to provide both
options and alter the multiple resource profile accordingly
- some implementers' insights into the xpath based implementation of the
multiple resource profile would be very helpful for comparing the related
performance issues.


Best regards Jan

PS: I like Paul's wiki proposal, as I can see that a more topic
oriented documentation might be helpful. If it is set up, I am happy to help
integrate some of the current discussion threads.

@ Rich: I will reply to your related mails in a separate mail, as this one
has already become pretty long.



> -----Original Message-----
> From: Erik Rissanen [mailto:erik@axiomatics.com]
> Sent: Monday, 28 September 2009 09:47
> To: XACML TC
> Subject: [xacml] Re: XACML's limitations in the access control for XML
> documents use case - AW: AW: [xacml] CD-1 issue #11: strictness of xpath
> definition
> 
> Hi Rich,
> 
> Some of the reasons why an XPath approach could be better are:
> 
> 1. XPath contains many functions and other capabilities, which might not
> be as easily available in the URI based approach.
> 
> 2. The TC would avoid the effort to define the URI approach. We would
> need to improve the xpath approach instead, but I suspect that the
> effort is smaller since we can reuse so much from xpath, compared with a
> wholly new URI based approach.
> 
> 3. It is likely that an XML resource is already available in XML form,
> so an xpath implementation can be applied to it directly, while the URI
> approach requires a transformation, which could degrade performance.
> 
> Note that it is not true that the whole XML has to be repeated for each
> resource since multiple <Attributes> elements are not required with the
> xpath approach, and with XACML 3.0 it is possible to reuse the same
> <Content> document for all the multiple queries.
> 
> Best regards,
> Erik
> 
> Rich.Levinson wrote:
> >
> > Hi Jan, et al,
> >
> > I have had a busy week and not been able to respond until now,
> > however, looking over all the subsequent emails to the one to which
> > this is a response (
> > http://lists.oasis-open.org/archives/xacml/200909/msg00081.html
> > ), it appears to me that there are still unresolved issues, and from
> > my perspective, there are some assertions made, with which I disagree,
> > about AttributeDesignators, which I thought my suggested URI scheme
> > would address, but apparently it either needs further explanation or I
> > am missing something that I have not yet understood. In any event I
> > would very much like to determine whether these assertions are true or
> > false in order that the TC be of a single mind when comparing the
> > capabilities of AttributeSelectors and AttributeDesignators.
> >
> >    The assertion with which I disagree is that the AttributeDesignators
> >    cannot do what the AttributeSelectors can do because the
> >    AttributeDesignators lose the hierarchical structure. My response is
> >    that if you don't throw away the hierarchical structure when
> >    creating your AttributeDesignators then this perceived problem does
> >    not exist.
> >
> > If I am wrong about this, I will accept that, however, I do not
> > believe that my approach to the AttributeDesignators has been
> > considered on its merits yet, and I will try to be totally explicit in
> > this email, and I will show how I think Jan's proposed solution can be
> > completely done using only AttributeDesignators and regexp string
> > matching.
> >
> > Having been thru some lengthy discussions earlier this year on the
> > hierarchical profile, I became quite sensitive to the node naming
> > issue, and one of the results of those earlier discussions was that if
> > hierarchical URIs are used to name nodes, that these names contain
> > within them the navigation necessary to locate the node, so that using
> > these names outside of an XML document does not lose the structural
> > relationships.
> >
> > Using James Clark's universal name syntax ("{namespace}elementname"
> > http://www.jclark.com/xml/xmlns.htm) combined with a transform to
> > replace the xml document with a list of name/value pairs (there are
> > several XML to JSON xslt transformers available free, which I expect
> > could readily be adapted to produce name/value pairs in the format
> > below), where each element and attribute is identified by its full path
> > expressed as universal element names. For example, assuming the
> > document you gave as an example had a namespace = "foo":
> >
> > <objects xmlns="foo">
> >  <book>
> >    <title>xxx</title>
> >    <author>Bob</author>
> >    <id>100</id>
> >    <price>30</price>
> >    <book-content>.....</book-content>
> >  </book>
> >  <book>
> >    <title>yyy</title>
> >    <author>Alice</author>
> >    <id>200</id>
> >    <price>80</price>
> >    <book-content>...</book-content>
> >  </book>
> > </objects>
> >
> > The above document would first be transformed to the following set of
> > name value pairs (ignoring whitespace):
> > /{foo}objects = ""
> > /{foo}objects/{foo}book[1] = ""
> > /{foo}objects/{foo}book[1]/{foo}title = "xxx"
> > /{foo}objects/{foo}book[1]/{foo}author = "Bob"
> > /{foo}objects/{foo}book[1]/{foo}id = "100"
> > /{foo}objects/{foo}book[1]/{foo}price = "30"
> > /{foo}objects/{foo}book[1]/{foo}content = "..."
> > /{foo}objects/{foo}book[2] = ""
> > /{foo}objects/{foo}book[2]/{foo}title = "yyy"
> > /{foo}objects/{foo}book[2]/{foo}author = "Alice"
> > /{foo}objects/{foo}book[2]/{foo}id = "200"
> > /{foo}objects/{foo}book[2]/{foo}price = "80"
> > /{foo}objects/{foo}book[2]/{foo}content = "..."
> >
> > The next step is to define resources, which for this use case would be
> > done based on multiple resource profile, where we would have 2
> > resources, using Erik's shorthand:
> > <Resource>resource-id=/{foo}objects/{foo}book[1]</Resource>
> > <Resource>resource-id=/{foo}objects/{foo}book[2]</Resource>
> >
> > The next step is to create xacml attributes for these resources using
> > the full universal names as AttributeIds (again w some shorthand),
> > resulting in the following 2 requests:
> > (Note: since AttributeId requires anyURI datatype, the following
> > percent-encoding must be applied to the AttributeId values:
> >
> >    * { -> %7B
> >    * } -> %7D
> >    * [ -> %5B
> >    * ] -> %5D )
> >
> >
> > <Request>
> > <Subject>subject-id="Bob"</Subject>
> > <Resource>
> > <Attribute>AttributeId="resource-id"
> > value="/{foo}objects/{foo}book[1]"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B1%5D/%7Bfoo%7Dtitle"
> > value = "xxx"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B1%5D/%7Bfoo%7Dauthor"
> > value = "Bob"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B1%5D/%7Bfoo%7Did"
> > value = "100"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B1%5D/%7Bfoo%7Dprice"
> > value = "30"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B1%5D/%7Bfoo%7Dcontent"
> > value = "..."</Attribute>
> > </Resource>
> > </Request>
> >
> > <Request>
> > <Subject>subject-id="Bob"</Subject>
> > <Resource>
> > <Attribute>AttributeId="resource-id"
> > value="/{foo}objects/{foo}book[2]"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B2%5D/%7Bfoo%7Dtitle"
> > value = "yyy"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B2%5D/%7Bfoo%7Dauthor"
> > value = "Alice"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B2%5D/%7Bfoo%7Did"
> > value = "200"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B2%5D/%7Bfoo%7Dprice"
> > value = "80"</Attribute>
> > <Attribute>AttributeId="/%7Bfoo%7Dobjects/%7Bfoo%7Dbook%5B2%5D/%7Bfoo%7Dcontent"
> > value = "..."</Attribute>
> > </Resource>
> > </Request>
> >
> > All the above processing to create the requests is done in the
> > ContextHandler, then the requests are submitted one at a time to the
> > PDP.
> > Now the rule that gets applied to each of these requests is the
> > following:
> >
> > <Rule effect=Deny>
> >  Target:
> >    reg-exp-string-match(resource-id, /{foo}objects/{foo}book\[\d+\])
> >  Condition:
> >    AttributeDesignator(AttributeId =
> > function:string-concatenate(resource-id, /%7Bfoo%7Dprice)) > 50 and
> >    AttributeDesignator(AttributeId =
> > function:string-concatenate(resource-id, /%7Bfoo%7Dauthor)) =
> > AttributeDesignator(subject-id)
> > </Rule>
> >
> > Unless I am mistaken, all the logic and structure is retained and it
> > has been done purely w AttributeDesignators and regexp.
> >
> > Assuming the above is correct, then the points I made about the
> > advantages over XPath (for an enterprise looking to use only URIs to
> > identify attributes):
> >
> >   1. The XML document does not need to be passed in with the request.
> >      There is no node collection, only string operations.
> >   2. For very large XML documents, say a catalog of 10,000 books, each
> >      book is processed individually independent of the other books, as
> >      compared to the XPath case, where one might expect the whole
> >      document has to get parsed for each of the 10,000 individual
> >      requests.
> >   3. There is no paradigm shifting, or what I believe was referred to
> >      in the discussion as "shifting semantics between XPath and XACML"
> >      in terms of representing the policies.
> >
> > Again, assuming the above is correct, I am not assuming this will be
> > desirable for everyone, however there may very well be organizations
> > for whom the advantages of this approach are decisive.
> >
> > A couple other points are that
> >
> >    * the "unsightliness" of the AttributeIds and the
> >      AttributeDesignators can be "covered" up by policy tools that
> >      facilitate defining policies based on XML Schemas, and can keep
> >      all the encoding details transparent to the policy designers.
> >    * the issue about basing policies on the structure of XML documents
> >      is a legitimate concern, however, if structure of documents
> >      change, then a legitimate case could probably be made that the
> >      namespace associated with that structure should also change, which
> >      would mean the policy tools would need to be able to facilitate
> >      upgrading of policies to new namespaces based on new revs of the
> >      schemas.
> >
> > Comments and suggestions welcome.
> >
> >    Thanks,
> >    Rich
> >
> 
> 


