OASIS Mailing List Archives

 



xdi message



Subject: Re: Full Graph Model ABNF Test


Great progress, but let me ruin your Sunday too :)

Another problem (and this seems to be true of pretty much all the ABNFs we have had so far) is that the following:
=markus$!(+name) 
could be parsed in two different ways.

It could be parsed as the two subsegments =markus, and $!(+name).
But it could also be parsed as the three subsegments =markus, $, and !(+name).
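To make the ambiguity concrete, here is a minimal Python sketch that enumerates every way to split the string into subsegments. The four subsegment shapes are hypothetical simplifications chosen only for this example; they are not the rules from the actual ABNF.

```python
import re

# Hypothetical, heavily simplified subsegment shapes (NOT the real XDI ABNF).
SUBSEGMENT_SHAPES = [
    re.compile(r"=[a-z]+"),        # e.g. =markus
    re.compile(r"\$!\([^()]*\)"),  # e.g. $!(+name) read as ONE subsegment
    re.compile(r"\$"),             # a bare $
    re.compile(r"!\([^()]*\)"),    # e.g. !(+name)
]

def segmentations(text, start=0):
    """Yield every way to split text[start:] into the toy subsegments above."""
    if start == len(text):
        yield []
        return
    for shape in SUBSEGMENT_SHAPES:
        m = shape.match(text, start)
        if m:
            for rest in segmentations(text, m.end()):
                yield [m.group()] + rest

parses = list(segmentations("=markus$!(+name)"))
for parse in parses:
    print(parse)
```

With these toy shapes the enumerator finds exactly the two readings described above, so any grammar that admits both a bare $ and $!(...) needs some rule to prefer one of them.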

Like I said, this is also true for the old segment-based ABNF, but there it hasn't really caused any trouble.

Now, however, with your latest Full Graph Model ABNF, the APG parser seems to take the three-subsegment approach, so when I feed it
=markus$!(+name) 
it parses that as three non-attr-singletons, rather than as one non-attr-singleton and one attribute-singleton.

So, unfortunately, the parser can't find a solution to satisfy literal-context, and parsing fails again.

Right now I'm experimenting with writing a completely non-ABNF-based parser, completely "manual", along the lines of what Joseph suggested at some point.

Markus

On Sun, Feb 24, 2013 at 3:18 AM, Drummond Reed <drummond@connect.me> wrote:
Markus, you ruined my Saturday afternoon ;-)  See inline.


On Sat, Feb 23, 2013 at 9:44 AM, Markus Sabadello <markus.sabadello@xdi.org> wrote:
So I tried to use the APG parser generator with the proposed Full Graph Model ABNF on a few sample XDI statements.

For example, I tried to run it on this:
=markus$!(+name)/!/(data:,MarkusSabadello)

... and I ran into 2 problems.

First problem:

According to the ABNF, $!(+name) is a valid entity-singleton.
This doesn't have a big effect on the overall parsing process, but it's not correct.

I can see why the parser thinks that, because it does actually satisfy the rules for an entity-singleton the way they are written.

Fixing it required changing the rule named specific. I just did that on the ABNF page, and at the same time added another feature that we had forgotten to call out in the ABNF: an explicit rule for the semantics of ordering within collections, i.e., the $*DIGIT pattern. The changed rules are:

specific             = "$" [ xdi-literal ]
member               = entity-member / attribute-member / order-ref
entity-member        = "$" member-xref
attribute-member     = "$" attribute-instance
order-ref            = "$" order
attribute-instance   = "!" member-xref
member-xref          = "(" immutable ")"
order                = "*" 1*DIGIT
 
With these changes I now think the parser has no choice with $!(+name) but to determine it is an attribute singleton. Plus we get a clean semantic way to specify ordering within collections.


Second problem:

The parser doesn't actually recognize the whole string as a valid XDI statement.

Why? Let's look at the ABNF rule that has to be satisfied:
literal = [ context ] literal-context "/!/" data-xref
What happens is that the parser thinks the context part of this is matched by =markus$!(+name), which leaves nothing to match the literal-context, so parsing fails.

Instead, what should happen is that the context part is matched by only =markus, and that the literal-context part is matched by $!(+name).

But for some reason the parser isn't able to figure that out.

I know this has something to do with the parsing algorithm, with left recursion, and with backtracking, but I'd have to read up on these concepts to fully understand the problem.

I don't think this one is left recursion -- as Joseph says, it's just a standard "greedy algorithm": the parser matches as much of the input as it can against the [ context ] rule.

What this means is that for the rules around collections/members and literal-contexts/literals, we can't take the shortcut of just prefixing the last context type (e.g., literal-context in your example above) with an optional [ context ].
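The greedy behaviour can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SUBSEGMENT and ATTR_SINGLETON are hypothetical toy patterns, and the two matchers are hand-written stand-ins, not how APG is implemented. The first matcher never gives back what the optional [ context ] prefix has consumed; the second retries shorter prefixes.

```python
import re

# Hypothetical toy patterns (NOT the real XDI ABNF).
SUBSEGMENT = re.compile(r"=[a-z]+|\$!\([^()]*\)")  # anything a context may contain
ATTR_SINGLETON = re.compile(r"\$!\([^()]*\)")      # stand-in for literal-context

def context_ends(text):
    """All positions where a greedy run of context subsegments could stop."""
    ends, pos = [0], 0
    while True:
        m = SUBSEGMENT.match(text, pos)
        if not m:
            return ends
        pos = m.end()
        ends.append(pos)

def rest_matches(text, pos):
    """Match literal-context followed by "/!/" starting at pos."""
    m = ATTR_SINGLETON.match(text, pos)
    return bool(m) and text.startswith("/!/", m.end())

def match_greedy(text):
    # [ context ] keeps everything it consumed; no retry with a shorter prefix.
    return rest_matches(text, context_ends(text)[-1])

def match_backtracking(text):
    # Try the longest context first, then progressively shorter ones.
    return any(rest_matches(text, end) for end in reversed(context_ends(text)))

statement = "=markus$!(+name)/!/(data:,MarkusSabadello)"
print(match_greedy(statement))        # False: context swallowed $!(+name)
print(match_backtracking(statement))  # True: context stops after =markus
```

In the greedy version the optional prefix consumes =markus$!(+name), so nothing is left for the literal-context, which is the failure described above.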

Instead we have to actually specify the pattern that must be matched. This is harder than specifying the starting context type in a sequence, but it comes down to this pattern:
terminal-pattern     = *( terminal-context-type 1*non-terminal-context-type ) 1*terminal-context-type
This pattern forces the sequence to end in a specific context type, but allows it to be preceded by other, different context types. So here are the revised rules for literal and literal-context:
literal                 = literal-context "/!/" data-xref
literal-context         = attr-singleton-context / attr-member-context
attr-singleton-context  = [ peer ] *( [ attribute-singleton ] 1*non-attr-singleton ) 1*attribute-singleton
non-attr-singleton      = entity-singleton / ( collection [ member ] )
attr-member-context     = [ peer ] *( [ collection attribute-member ] 1*non-attr-member ) 1*( collection attribute-member )
non-attr-member         = singleton / ( collection [ entity-member ] )
Now, if you test =markus$!(+name) against this, =markus will match the entity-singleton rule within non-attr-singleton, and $!(+name) will match the 1*attribute-singleton rule at the end of attr-singleton-context. So it will be recognized as a literal-context.
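As a rough check of the shape of attr-singleton-context, the rule can be mirrored as a regular expression over subsegment types rather than characters. In this sketch "A" stands for attribute-singleton and "N" for non-attr-singleton; the classify helper is a hypothetical shortcut (anything starting with $! counts as "A"), and the optional [ peer ] prefix is left out.

```python
import re

def classify(subsegment):
    """Hypothetical classifier: "A" = attribute-singleton, "N" = anything else."""
    return "A" if subsegment.startswith("$!") else "N"

# *( [ attribute-singleton ] 1*non-attr-singleton ) 1*attribute-singleton
ATTR_SINGLETON_CONTEXT = re.compile(r"(?:A?N+)*A+")

def is_attr_singleton_context(subsegments):
    """Check that the type sequence ends in one or more attribute-singletons."""
    types = "".join(classify(s) for s in subsegments)
    return ATTR_SINGLETON_CONTEXT.fullmatch(types) is not None

print(is_attr_singleton_context(["=markus", "$!(+name)"]))       # two-subsegment reading
print(is_attr_singleton_context(["=markus", "$", "!(+name)"]))   # three-subsegment reading
```

The two-subsegment reading gives the type string "NA" and is accepted; the three-subsegment reading gives "NNN" and is rejected, so the result depends on how the string is first split into subsegments.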

I have updated all of this on the ABNF page of the wiki. Please do continue testing it - let's nail down any other bugs in it ASAP.

=Drummond  

 

I also found this, very interesting:

Markus







