[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Issue #11 again
All, Jan is correct that there is a potential performance problem with XPath vs regexp. As he has demonstrated, it doesn't have to do with the performance of a regexp implementation vs XPath implementation. It has to do with the manner in which the expressions are used. With the xpath based approach this is what happens (at least in a naïve implementation): 1. The policy contains an xpath expression and uses an xpath matching function. 2. The PDP receives the multiple request with an xpath and expands this into N resources. 3. Each resource is evaluated against the policy. To do this the xpath matching function will evaluate two xpath expressions. The individual resource id, which will select a single node, and the xpath expression from the policy, which will typicall evaluate to multiple nodes, let's say there are R of them. It will then check if the individual node is in among the R nodes. With a regexp matching function this will happen (I use the term "xpath expression equivalent" here since it is still an open question which language we will use here): 1. The policy contains an "xpath expresssion equivalent" and uses a regexp matching function 2. The PDP receives the multiple request with an "xpath equivalent" and expands this into N resources. 3. Each resource is evaluated against the policy. To do this the regpexp matching function will work directly on the expression of the individual resource id. Notice the difference in step three. In the second case there is no requirement that the actual R nodes are enumerated. I don't know what would be possible to do in the first case. A naïve implementation would evaluate the xpath from the policy into a result set of R nodes, and then check whether the individual resource-id node is among these nodes. That would require iterating over up to R nodes, and is likely less efficient than the regexp approach. Now, maybe it would be possible to run the xpath evaluation "backwards"? That is, given a node in an XML tree, can we ask whether a given xpath expression would select that node, without having to evaluate the full set of nodes? I don't know. If it would be possible, then an implementation of the xpath approach can have similar performance as the regexp approach. I think we should investigate this. If it is not possible, I agree that some other approach should be available. In that case I think the most fruitful approach is a defined form of xpath expressions for the individual resources. I think we can work around the namespace issue since similar notation as to the Clarke notation can be provided with XPath predicates. Or one can use a notation like "*[2]/*[1]", etc, which also does not have any namespace issues. I would prefer a strict xpath to an entirely new scheme. With an xpath based approach people can use either xpath or regexp functions in their policies, with the same underlaying "expansion" scheme for the multiple request. Best regards, Erik
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]