OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

xacml message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Issue #11 again


All,

Jan is correct that there is a potential performance problem with XPath 
vs regexp. As he has demonstrated, it doesn't have to do with the 
performance of a regexp implementation vs XPath implementation. It has 
to do with the manner in which the expressions are used.

With the xpath based approach this is what happens (at least in a na´ve 
implementation):

1. The policy contains an xpath expression and uses an xpath matching 
function.

2. The PDP receives the multiple request with an xpath and expands this 
into N resources.

3. Each resource is evaluated against the policy. To do this the xpath 
matching function will evaluate two xpath expressions. The individual 
resource id, which will select a single node, and the xpath expression 
from the policy, which will typicall evaluate to multiple nodes, let's 
say there are R of them. It will then check if the individual node is in 
among the R nodes.


With a regexp matching function this will happen (I use the term "xpath 
expression equivalent" here since it is still an open question which 
language we will use here):

1. The policy contains an "xpath expresssion equivalent" and uses a 
regexp matching function

2. The PDP receives the multiple request with an "xpath equivalent" and 
expands this into N resources.

3. Each resource is evaluated against the policy. To do this the regpexp 
matching function will work directly on the expression of the individual 
resource id.


Notice the difference in step three. In the second case there is no 
requirement that the actual R nodes are enumerated.

I don't know what would be possible to do in the first case. A na´ve 
implementation would evaluate the xpath from the policy into a result 
set of R nodes, and then check whether the individual resource-id node 
is among these nodes. That would require iterating over up to R nodes, 
and is likely less efficient than the regexp approach.

Now, maybe it would be possible to run the xpath evaluation "backwards"? 
That is, given a node in an XML tree, can we ask whether a given xpath 
expression would select that node, without having to evaluate the full 
set of nodes? I don't know. If it would be possible, then an 
implementation of the xpath approach can have similar performance as the 
regexp approach.

I think we should investigate this. If it is not possible, I agree that 
some other approach should be available. In that case I think the most 
fruitful approach is a defined form of xpath expressions for the 
individual resources. I think we can work around the namespace issue 
since similar notation as to the Clarke notation can be provided with 
XPath predicates. Or one can use a notation like "*[2]/*[1]", etc, which 
also does not have any namespace issues.

I would prefer a strict xpath to an entirely new scheme. With an xpath 
based approach people can use either xpath or regexp functions in their 
policies, with the same underlaying "expansion" scheme for the multiple 
request.

Best regards,
Erik



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]