OASIS Mailing List Archives

 




Subject: Re: [xdi] Re: Further ABNF adjustment


So I think this means that the only change to the last ABNF I posted on the discussion page is to remove the colon from xdi-pchar.

And we would add
- A note that in an IRI, parens must be percent encoded. I think when implementing this, internally I would "cheat" by simply removing the parens from the IRI sub-delims rule.
- The xdi-text and xdi-json rules. But I think they should be separate and go on an informative page rather than the core ABNF proposal.

Markus

On Wed, Feb 6, 2013 at 8:18 AM, Drummond Reed <drummond@connect.me> wrote:
Inline.

On Tue, Feb 5, 2013 at 2:41 PM, Markus Sabadello <markus.sabadello@gmail.com> wrote:
I think I agree we don't need a leading delimiter for IRIs.

I'm now leaning this way too, because distinguishing between a bare literal and an IRI within a cross-reference just comes down to the presence of the colon in the string (which is why colons should be reserved and not allowed in xdi-pchar).
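An editorial sketch of the colon test described above, not code from the thread. It assumes that colons are excluded from literals and that an IRI always begins with an RFC 3986 scheme name followed by ":". The function name and the three category labels are hypothetical.

```python
import re

# RFC 3986 scheme syntax: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":".
_SCHEME_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")

def classify_xref_body(body: str) -> str:
    """Guess what the text between the parens of a cross-reference is."""
    if _SCHEME_PREFIX.match(body):
        return "iri"        # starts with "scheme:"
    if body and body[0] in "=@+$*!(":
        return "address"    # starts with a context symbol or a nested xref
    return "literal"        # no scheme and no leading delimiter

print(classify_xref_body("http://xdi.org/user/markus"))
print(classify_xref_body("=drummond/+friend"))
print(classify_xref_body("abc"))
```

Anchoring the test on the scheme prefix, rather than on a colon anywhere in the string, keeps a nested cross-reference that contains an IRI from causing the enclosing address to be mistaken for an IRI.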
 

I'm now totally against any "bare literals", except in an xref.

Agreed.
 

I also agree the xdi-text and xdi-json could be useful, but I don't see how this is relevant to the other ABNF questions at hand.

They are not relevant to any other ABNF questions at hand - only to the final ABNF when it comes to the practical real-world question of how XDI addresses/JSON strings should be embedded in running text.
 

I think in IRIs, parens should be percent-encoded to avoid confusion with XDI xrefs.

I agree. Let's simply require percent-encoding of parens in IRIs and be done with it -- all parens in XDI addresses will be XDI cross-references.
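A minimal sketch of what that requirement could look like on the producing side, assuming the IRI is otherwise already valid. The helper names are hypothetical and only the two paren characters are touched.

```python
def encode_parens(iri: str) -> str:
    """Percent-encode literal parentheses so they cannot be read as xref delimiters."""
    return iri.replace("(", "%28").replace(")", "%29")

def wrap_as_xref(iri: str) -> str:
    """Wrap an IRI in parentheses to form a cross-reference."""
    return "(" + encode_parens(iri) + ")"

print(wrap_as_xref("http://example.com/a(b)"))
```

With this in place, every unencoded paren left in an XDI address belongs to a cross-reference, which is the property the rule is meant to guarantee.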

=Drummond 
 



On Tue, Feb 5, 2013 at 12:30 AM, Drummond Reed <drummond@connect.me> wrote:
Joseph, this is really cool!

It also highlights the important decision we have to make about bare literals. Seeing it put this way makes it even clearer how streamlined the ABNF can be if we allow bare literals.

Over the weekend I also had an idea about how to deal with the bare literal problem in the first segment of an XDI address when it exists in the wild (which may be an edge case, but still one we need to deal with).

The idea is for the xdi-text rule that I propose in https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion (which is basically to enclose XDI addresses or XDI JSON documents that appear in running text inside square brackets to make them easy to recognize and parse) to use an additional forward slash to prefix the first segment of an XDI address. So instead of an XDI text block that contains an XDI statement consisting of all bare literals looking like:

  [abc/def/xyz]

...it would look like this:

  [/abc/def/xyz]

The reason I particularly like this is that now an XDI text block in any running text or markup document would be recognizable using just two rules:
  1. An embedded XDI address would always start with [/
  2. An embedded XDI JSON document would always start with [{

Examples of the first rule:

  [/=drummond]
  [/=drummond/+friend/(http://xdi.org/user/markus)]

Example of the second rule:

  [{"=drummond/+friend":["(:http://xdi.org/user/markus)"]}]

If this approach can solve the problem of bare literals being allowed at the start of a first segment, then the question is: should we allow bare literals at the beginning of any segment in order to support this very streamlined ABNF parsing?
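The two recognition rules might be applied to running text roughly as follows. This is an editorial sketch with assumptions beyond the proposal: an embedded address is taken to run to the next "]" (so it may not contain one), and an embedded JSON document must be a well-formed object immediately followed by "]". The function name is hypothetical.

```python
import json

def find_xdi_blocks(text: str):
    """Yield (kind, payload) for each [/...] address block or [{...}] JSON block."""
    decoder = json.JSONDecoder()
    i = 0
    while True:
        i = text.find("[", i)
        if i == -1 or i + 1 >= len(text):
            return
        marker = text[i + 1]
        if marker == "/":
            end = text.find("]", i + 2)
            if end == -1:
                return  # unterminated block; nothing more can match
            yield ("address", text[i + 2:end])  # drop the "[/" prefix
            i = end + 1
        elif marker == "{":
            try:
                # Parse one JSON value starting at the "{".
                obj, end = decoder.raw_decode(text, i + 1)
            except json.JSONDecodeError:
                i += 1
                continue
            if text[end:end + 1] == "]":
                yield ("json", obj)
                i = end + 1
            else:
                i += 1
        else:
            i += 1  # an ordinary "[" in the surrounding text

sample = 'See [/=drummond/+friend/(http://xdi.org/user/markus)] and [see note].'
for kind, payload in find_xdi_blocks(sample):
    print(kind, payload)
```

The JSON branch leans on a real JSON parser because a JSON document can itself contain "]", so scanning for the next closing bracket would cut it short.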

The second question is: should we stay with the current approach of just delimiting an IRI inside a cross-reference by looking for the colon following the scheme name (which is required by IRI syntax), or should we require a leading delimiter? My gut is the same as Joseph's here, which is that it is okay to parse for the colon following the IRI scheme name. Even though this is a little bit slower than just looking for a leading colon, it is simpler because it only requires "wrapping" the URI in parentheses.

How do others feel about these two questions?

=Drummond 


On Mon, Feb 4, 2013 at 11:01 AM, Joseph Boyle <planetwork@josephboyle.net> wrote:
Now not up to date, but for reference, I made these the other day with http://railroad.my28msec.com/rr/ui :







The EBNF input was: 
address ::= subseg+ ('/' subseg+ ('/' subseg+)?)?
subseg  ::= [=@+$] [*!] (xref | literal)
xref    ::= '(' (IRI | address) ')'
literal ::= (iunreserved | pct-encoded | [&;,':])+


On Feb 3, 2013, at 6:58 PM, Joseph Boyle <planetwork@josephboyle.net> wrote:

Drummond,

You are correct, excluding a specific trivial case can actually force more complexity in rules. The old ABNF had some examples of this. This is one reason why allowing a bare literal as a segment seems more natural to me.

The xref rule with added initial colon might need more grouping brackets:  "(" [ [ ":" IRI ] / address ] ")"

I actually think allowing simply (http:// … ) with its own noninitial colon as an IRI xref would only add a little (finite) complexity to parsing, as opposed to some of the exponentially growing parse trees we may have been hitting in the past, and would look good for XDI's first-class support of IRIs - I posted a comment on this to https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion earlier today.

Also noted the tel: and sms: schemes can have matched parentheses in their bodies, so if we allow these we may have to allow matched parentheses in IRIs, and do parenthetical depth counting as we parse IRIs, unless we require clients to escape and unescape all the internal parens. If we're scanning for parens, checking for the internal colon after the scheme is not much additional work.
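A small sketch of the depth counting mentioned here, assuming parens inside an IRI are always balanced and that the caller already knows where the opening paren sits. The function name is hypothetical. It only locates the matching ")" and does not validate what lies between.

```python
def find_matching_paren(s: str, start: int) -> int:
    """Return the index of the ')' that closes the '(' at s[start], or -1 if none."""
    if start >= len(s) or s[start] != "(":
        raise ValueError("start must point at '('")
    depth = 0
    for i in range(start, len(s)):
        ch = s[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1  # ran out of input before the parens balanced

addr = "+friend/(tel:+1(555)0100)/x"
open_idx = addr.index("(")
close_idx = find_matching_paren(addr, open_idx)
print(addr[open_idx + 1:close_idx])
```

If internal parens must instead be percent-encoded, the counter is unnecessary for IRIs and the first ")" after the scheme closes the cross-reference, though nested address cross-references would still need it.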

Joseph

On Feb 3, 2013, at 2:33 PM, Drummond Reed <drummond@connect.me> wrote:

Joseph,

First, thanks very much for this analysis of the ABNF. I hadn't appreciated it in detail until I studied it after Friday's telecon. Condensing it down to four lines is a FANTASTIC way of seeing the essence of the ABNF.

Based on our discussion on Friday's call, and if you follow the recommendations I posted to https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion (namely, not allowing colons in literals, and using colons to prefix IRIs within cross-references), here's a revised version of your four-line ABNF if bare literals are allowed to begin segments:

OPTION #1: IF BARE LITERALS ARE ALLOWED

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ] ;
subseg  = [ "=" / "@" / "+" / "$" ] [ "*" / "!" ] [ xref / literal ] ;
xref    = "(" [ ":" IRI / address ] ")";
literal = 1*[ iunreserved / pct-encoded ] ;

If bare literals are NOT allowed, as in the proposal we discussed on Friday, then I could only condense the ABNF into six rules:

OPTION #2: IF BARE LITERALS ARE NOT ALLOWED

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ]
subseg  = global / local / xref
global  = ( "=" / "@" / "+" / "$" ) [ "*" / "!" ] [ xref / literal ]
local   = ( "*" / "!" ) [ xref / literal ]
xref    = "(" [ ":" IRI / address / literal ] ")"
literal = 1*[ iunreserved / pct-encoded ]

Two questions:
  1. Am I missing something - do you see a way to compact it further?
  2. Will there be any real difference in efficiency of parsing between these two (given that Option #2 is actually narrower than Option #1 because it excludes bare literals)?
Thanks,

=Drummond  



On Fri, Feb 1, 2013 at 9:34 AM, Joseph Boyle <planetwork@josephboyle.net> wrote:
Markus, thanks for the recognition, glad to be able to help out.

Drummond, do we need to exclude bare literals as segments at the syntax level? It seems to me they may be semantically trivial, but are syntactically consistent.

Just experimenting with finding a minimal set of verification rules (for clarity, omitting naming all the productions we want as parsing results) if bare literals are allowed, the grammar can be as short as:

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ] ;
subseg  = [ "=" / "@" / "+" / "$" ] [ "*" / "!" ] [ xref / literal ] ;
xref    = "(" [ IRI / address ] ")";
literal = 1*[ iunreserved / pct-encoded / "&" / ";" / "," / "'" / ":" ] ;
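To make the effect of the character set in the literal rule concrete, here is an editorial sketch that compares the rule as written above with a variant that drops the extra characters, including the colon, as later messages in the thread propose. It assumes the brackets in the rule are meant as grouping, and it covers only the ASCII part of iunreserved (the ucschar ranges of RFC 3987 are left out). The pattern and function names are hypothetical.

```python
import re

# ASCII unreserved characters plus "&" ";" "," "'" ":" or a percent-encoded octet.
LITERAL_WITH_EXTRAS = re.compile(r"(?:[A-Za-z0-9\-._~&;,':]|%[0-9A-Fa-f]{2})+")

# Same, but without the extra characters, so a colon can only signal an IRI.
LITERAL_STRICT = re.compile(r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2})+")

def is_literal(s: str, allow_extras: bool) -> bool:
    """True if the whole string matches the chosen literal rule."""
    pattern = LITERAL_WITH_EXTRAS if allow_extras else LITERAL_STRICT
    return pattern.fullmatch(s) is not None

for candidate in ["markus", "a:b", "abc%20def", "a/b"]:
    print(candidate, is_literal(candidate, True), is_literal(candidate, False))
```

Under the stricter rule a string such as "a:b" is never a literal, which is what lets a parser treat a colon inside a cross-reference as evidence of an IRI.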


On Jan 31, 2013, at 11:30 PM, Drummond Reed <drummond@connect.me> wrote:

Markus, thanks, this is great work. I have reviewed this and am in agreement with the changes. 

The support for a literal as a standalone value at the start of an XDI segment has always been somewhat theoretical, i.e., we originally did it that way to not rule it out (because the preceding slash could be a delimiter). But that does not work for the first segment of an XDI address.

So I agree that it's cleaner to just require all XDI segments to start with delimited subsegments. 

I'll add this to the agenda for tomorrow's telecon.

=Drummond 


On Thu, Jan 31, 2013 at 5:51 PM, Markus Sabadello <markus.sabadello@xdi.org> wrote:
Hello XDI TC,

Based on implementation experience and some discussions, I added another slightly changed version of the XDI ABNF to the discussion page of the relevant proposal:

Here's the summary from the page:
1. Some of the changes here are motivated by the insight that the purpose of an ABNF is not only to validate a string against a set of rules, but also to semantically understand the various components of that string.
2. The "xdi-inner-graph" rule is introduced, in order to have an explicit rule for this fundamental XDI construct. This change doesn't affect what is valid XDI and what is not.
3. The "xdi-context" rule is introduced, for the same reason.
4. The "xdi-segment" rule is changed to no longer permit a literal at the beginning. A segment that does not start with a context symbol, and is not a cross-reference, does not appear to be useful, and it might be ambiguous with regard to other rules.
5. The "xref-literal" rule is introduced, in order to still allow literals in cross-references.

I tested this ABNF in the XDI2 library, and it appears to work fine.

In fact, I have recently added to XDI2 support for a new parser library (APG), in addition to the one I had been using before (aParse).
After evaluating them both, my conclusion is that they are both able to handle the XDI ABNF, that they produce the same results, and that APG is about twice as fast as aParse.
So APG will now be standard in XDI2, but aParse is optionally also still supported.

I have spent quite some time thinking about Joseph Boyle's ideas about optimizing the parsing process in smart ways, for example by simply "skipping" from an opening "(" to a closing ")" in order to avoid having to descend deep into the IRI rules. This sounds quite good to me, I just haven't found a way to actually implement that yet in a way that still ensures robustness and correctness of the parsing process. I think it was also Joseph who early on suggested that XRI parsing might be one of the most resource-intensive tasks of an XDI server, and I think that is very right. So while switching to a faster parsing library is a great step, we'll keep looking for further optimizations.

You can use the following tool to experiment with the most recent ABNF proposal I mentioned above:

Markus


--
You received this message because you are subscribed to the Google Groups "XDI2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xdi2+unsubscribe@googlegroups.com.
To post to this group, send email to xdi2@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 





