OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

relax-ng message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [Elist Home]

Subject: Re: Explosion

I know you want to restrict <oneOrMore> patterns that contain attributes and
there may be good arguments for this, but the argument you are advancing
here is not a good argument.

1. Memoization is fundamental to JTREX.  JTREX does extensive memoization
already for elements and character data.  It would have terrible performance
if it didn't.  The reason that JTREX's performance with attributes is worse
than its performance with elements is that it does memoization for elements
(in most cases) but doesn't do any for attributes.

2. The exponential behaviour you have observed with attributes arises even
without oneOrMore.  You can get it with a large group just as easily.

3.  In the cases you suggested there was in fact no explosion in the number
of states (distinct patterns); the increase in the number of states was in
fact linear in the number of attributes. The cases that are really
problematic are those that do cause an explosion in the number of states and
hence cannot be solved by memoization.

----- Original Message -----
From: "Murata Makoto" <mura034@attglobal.net>
To: <trex@lists.oasis-open.org>
Sent: Monday, May 07, 2001 8:29 PM
Subject: Re: Explosion

> James Clark wrote:
> > Incidentally, this has nothing to do with attribute wildcards.   You
> > equally well cause blow-up in the current JTREX implementation with a
> > sufficiently large choice of attributes.
> In my understanding, JTREX explodes when:
> 1) <oneOrMore> directly or indiretly contain <group> or <interleave>
> patterns.
> 2) at least one of them directly or indirectly contains multiple
> subpatterns with which many attributes of a single element match.  That
> subpattern may be an attribute wildcard or large choice of attributes.
> For each of the many attributes of the element, we can consume one of the
> multiple subpatterns.  If we have many attributes, combinatory explosion
> will usually happen.
> > This is just a limitation of JTREX.  It doesn't memoize the construction
> > derivatives for attributes.  Just to prove to myself that this really
> > the problem, I implemented a simple caching scheme for attribute
> > derivitives; with this fix, it could handle your cases with any
> > increase in computation time.
> By memorizing derivatives, different branches will merge and thus
> combinatory explosion can be avoided.
> Cool, but too cool.  I do not see any advantages in allowing
> patterns I described above.
> Cheers,
> Makoto

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [Elist Home]

Powered by eList eXpress LLC