xacml message

Subject: Issue 11 and running XPath in reverse

From: Erik Rissanen <erik@axiomatics.com>
To: XACML TC <xacml@lists.oasis-open.org>
Date: Mon, 26 Oct 2009 10:48:25 +0100

All,

I have done some searching and thinking about issue 11 from the public
review issues list.

In particular, I have been looking into the possibility of evaluating
xpath in "reverse". See my earlier email here for what I mean by that:

http://lists.oasis-open.org/archives/xacml/200910/msg00092.html

Most XPath APIs I can find do not implement any method like this. dom4j
does have something similar, with this method:

http://www.oschina.net/uploads/doc/dom4j-1.6.1/org/dom4j/Node.html#matches(java.lang.String)

but when I looked at the source code for dom4j, it is implemented as an
enumeration of all matching nodes, so it is not of any help.

I looked at the XPath 1.0 spec myself, and by a casual look it appears
to me it would be possible to run it in reverse, starting from the last
location step, checking the predicate of that, and the moving along the
axis in the reverse direction. Repeat this until the expression is
consumed and check whether you got to the context node. For simple
expressions this should be efficient, while expressions which uses an
axis like "descendants" for instance, could require lots of searching in
the document, and become inefficient.

I haven't looked into any of the details, so there might be expressions
which cannot be reversed.

Besides running xpath in reverse, another possibility is to cache xpath
evaluation results. This would not help in the case of a single request,
but would make a big difference in the case of multiple requests (which
I think has been the main concern). Consider this process:

1. The policy contains an xpath expression and uses an xpath matching
function.

2. The PDP receives the multiple request with an xpath and expands this
into N resources.

3. The first resource is evaluated against the policy. To do this the xpath
matching function will evaluate two xpath expressions. The individual
resource id, which will select a single node, and the xpath expression
from the policy, which will typicall evaluate to multiple nodes, let's
say there are R of them. It will then check if the individual node is in
among the R nodes. The PDP saves the result of this xpath expression for
the duration of the multiple requests.

4. The remaining resources are evaluated against the policy. The xpath
expression in the policy does not need to be evaluated since the result
is saved from the first evaluation. The single resource xpath is
evaluated and the PDP checks whether the node is among the R nodes in
the saved result. This can be done very efficiently using a hash table.

With this procedure multiple requests can be evaluated very efficiently.
Given this, I propose that we do not invent new schemes or use regexp
matching instead of the XACML xpath functions. The XPath functions have
many benefits in that they are namespace aware, easy to use (avoiding
"matching on a matching language"), the XPath spec is out there
available for us to reference without any need for work on our part and
XPath implementations are readily available.

There is also a possibility for innovative XACML products go beyond the
off-the-shelf XPath implementations and run some of the expressions in
reverse if the PDP think it would improve performance.

Best regards,
Erik

Follow-Ups:
- Re: [xacml] Issue 11 and running XPath in reverse
  - From: "Rich.Levinson" <rich.levinson@oracle.com>