Re: Full Graph Model ABNF Test

On Sun, Feb 24, 2013 at 3:18 AM, Drummond Reed <drummond@connect.me> wrote:

Markus, you ruined my Saturday afternoon ;-) See inline.
On Sat, Feb 23, 2013 at 9:44 AM, Markus Sabadello <markus.sabadello@xdi.org> wrote:

So I tried to use the APG parser generator with the proposed Full Graph Model ABNF on a few sample XDI statements.

For example, I tried to run it on this:
=markus$!(+name)/!/(data:,MarkusSabadello)

... and I ran into 2 problems.

First problem:

According to the ABNF, $!(+name) is a valid entity-singleton.
This doesn't have a big effect on the overall parsing process, but it's not correct.

I can see why the parser thinks that, because it does actually satisfy the rules for an entity singleton the way they are written.

Fixing it required changing the "specific" rule. I just did that on the ABNF page, and at the same time added a feature that we had forgotten to call out in the ABNF: an explicit rule for the semantics of ordering within collections, i.e., the $*DIGIT pattern. The changed rules are:
specific             = "$" [ xdi-literal ]
member               = entity-member / attribute-member / order-ref
entity-member        = "$" member-xref
attribute-member     = "$" attribute-instance
order-ref            = "$" order
attribute-instance   = "!" member-xref
member-xref          = "(" immutable ")"
order                = "*" 1*DIGIT
With these changes, I think the parser now has no choice but to treat $!(+name) as an attribute singleton. Plus we get a clean semantic way to specify ordering within collections.
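
As a quick sanity check on the new order-ref rule, here is a minimal Python sketch that translates order-ref = "$" order and order = "*" 1*DIGIT into a regular expression. The function name is hypothetical and this is only an illustration of the pattern, not part of the ABNF or of any XDI library.

```python
import re

# order-ref = "$" order ; order = "*" 1*DIGIT
# ABNF DIGIT is 0-9 only, so use [0-9] rather than \d (which also matches other Unicode digits).
ORDER_REF = re.compile(r"\$\*[0-9]+")

def is_order_ref(s):
    """Return True if s is exactly one order-ref such as $*1 or $*12 (hypothetical helper)."""
    return ORDER_REF.fullmatch(s) is not None
```

Under this reading, $*1 and $*12 are order-refs, while $* (no digit) and $1 (no asterisk) are not.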
Second problem:

The parser doesn't actually recognize the whole string as a valid XDI statement.

Why? Let's look at the ABNF rule that has to be satisfied:
literal = [ context ] literal-context "/!/" data-xref
What happens is that the parser matches the context part against all of =markus$!(+name), which leaves nothing to match literal-context, so parsing fails.

Instead, what should happen is that the context part is matched by only =markus, and that the literal-context part is matched by $!(+name).

But for some reason the parser isn't able to figure that out.

I know this has something to do with the parsing algorithm, with left recursion, and with backtracking, but I'd have to read up on these concepts to fully understand the problem.
I don't think this one is left recursion. As Joseph says, it's just the standard "greedy algorithm": the parser will match everything it can to the [ context ] rule, so nothing is left over for literal-context.
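
To make the greedy behaviour concrete, here is a toy sketch in Python. It is not the XDI ABNF and not a description of how APG works internally: the tokenizer, the function names, and the reduction of context to "one or more subsegments" are all assumptions made for illustration. It only shows why consuming everything for [ context ] leaves nothing for literal-context, and that trying shorter splits would succeed.

```python
import re

# Hypothetical, heavily simplified tokenizer (an assumption, not the real grammar):
# a subsegment is either an attribute singleton "$!(...)" or a sigil followed by letters.
SUBSEG = re.compile(r"\$!\([^)]*\)|[=@+][a-z]+")

def tokens(s):
    return SUBSEG.findall(s)

def is_attr_singleton(tok):
    return tok.startswith("$!(")

def greedy_parse(toks):
    """[ context ] literal-context, greedy and without backtracking."""
    i = 0
    # context = 1*subsegment: every token qualifies, so the loop consumes them all
    while i < len(toks):
        i += 1
    # literal-context needs at least one attribute singleton, but nothing is left
    return i < len(toks) and is_attr_singleton(toks[i])

def backtracking_parse(toks):
    """Same rule, but retry with progressively shorter matches for context."""
    for split in range(len(toks), -1, -1):
        rest = toks[split:]
        if rest and all(is_attr_singleton(t) for t in rest):
            return True
    return False

sample = tokens("=markus$!(+name)")
greedy_result = greedy_parse(sample)
backtracking_result = backtracking_parse(sample)
```

For the sample above, the tokens are =markus and $!(+name); greedy_result is False and backtracking_result is True.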

What this means is that for the rules around collections/members and literal-contexts/literals, we can't take the shortcut of just prefixing the last context type (e.g., literal-context in your example above) with an optional [ context ].

Instead we have to actually specify the pattern that must be matched. This is harder than specifying the starting context type in a sequence, but it comes down to this pattern:
terminal-pattern     = *( terminal-context-type 1*non-terminal-context-type ) 1*terminal-context-type
This pattern forces the sequence to end in a specific context type, but allows it to be preceded by other, different context types. So here are the revised rules for literal and literal-context:
literal                 = literal-context "/!/" data-xref
literal-context         = attr-singleton-context / attr-member-context
attr-singleton-context  = [ peer ] *( [ attribute-singleton ] 1*non-attr-singleton ) 1*attribute-singleton
non-attr-singleton      = entity-singleton / ( collection [ member ] )
attr-member-context     = [ peer ] *( [ collection attribute-member ] 1*non-attr-member ) 1*( collection attribute-member )
non-attr-member         = singleton / ( collection [ entity-member ] )
Now, if you test =markus$!(+name) against this, =markus will match the entity-singleton rule within non-attr-singleton, and $!(+name) will match the 1*attribute-singleton rule at the end of attr-singleton-context. So it will be recognized as a literal-context.
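
The walk-through above can be sketched in Python as a hand-rolled matcher for *( [ attribute-singleton ] 1*non-attr-singleton ) 1*attribute-singleton. This is a simplified stand-in under stated assumptions: it ignores [ peer ] and collections, treats any token starting with "$!(" as an attribute-singleton and everything else as a non-attr-singleton, and takes the context already split into tokens. The function names are hypothetical.

```python
def classify(tok):
    # 'A' = attribute-singleton, 'N' = stand-in for non-attr-singleton (assumption)
    return "A" if tok.startswith("$!(") else "N"

def is_attr_singleton_context(toks):
    """Match *( [A] 1*N ) 1*A greedily; a failed repetition is discarded as a whole."""
    classes = [classify(t) for t in toks]
    n = len(classes)
    i = 0
    while True:
        j = i
        if j < n and classes[j] == "A":      # [ attribute-singleton ]
            j += 1
        k = j
        while k < n and classes[k] == "N":   # 1*non-attr-singleton
            k += 1
        if k == j:                           # no N found: drop this repetition
            break
        i = k
    start = i
    while i < n and classes[i] == "A":       # 1*attribute-singleton
        i += 1
    # Need at least one trailing attribute-singleton and no leftover tokens
    return i > start and i == n

markus_ok = is_attr_singleton_context(["=markus", "$!(+name)"])
```

Here markus_ok is True, while a context consisting of =markus alone is rejected because it does not end in an attribute singleton. No backtracking across repetitions is needed, which is the point of forcing the terminal type to the end of the pattern.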
I have updated all of this on the ABNF page of the wiki. Please do continue testing it - let's nail down any other bugs in it ASAP.

=Drummond

I also found this, very interesting:
http://en.wikipedia.org/wiki/Comparison_of_parser_generators

Markus
