RE: [External] Re: [xdi] Embedding XDI in documents

How would the proposed embedding syntax, e.g. [@example], work with wiki syntax, which writes links as [http://www.example.org link name] (rendered as "link name") [from Wikipedia], or for that matter with footnote references such as [1]?

Kind regards,

Bill Barnhill

Booz Allen Hamilton - Belcamp,MD

barnhill_william@bah.com

Cell: 1-443-924-0824

Desk: 1-443-861-9102

From: xdi@lists.oasis-open.org [xdi@lists.oasis-open.org] on behalf of Markus Sabadello [markus.sabadello@gmail.com]
Sent: Wednesday, February 06, 2013 6:45 AM
To: Drummond Reed
Cc: Joseph Boyle; xdi2@googlegroups.com; OASIS - XDI TC
Subject: [External] Re: [xdi] Embedding XDI in documents

I'm just not sure if ABNF rules are needed for this.

Why can't we just describe it informally, i.e. "if you want to embed XDI in plain text, do this, if you want to embed it in JSON, do that"?

Markus

On Wed, Feb 6, 2013 at 2:00 AM, Drummond Reed <drummond.reed@xdi.org> wrote:

Joseph, I agree with both your points, but my proposal about the xdi-text production was that the "other language" I'm talking about is plain text. There's a ton of it out there, including email, word processing documents, browser text (like where I'm typing right now), etc. Since plain text entry is supported in so many places, many text editors include "smart" text processing such as looking for strings that start with "http:" or "https:" and turning them into URLs. That's the functionality I'm thinking about when proposing the xdi-text rules for using [/ ] to delimit XDI addresses and [{ }] to delimit XDI JSON in running text.

As far as embedding XDI in XDI, I think that's already covered in XDI grammar. To the extent that you'd want to embed XDI within a literal (e.g., using XDI as a "markup language"), then my suggestion above about embedding XDI in running text would still apply.

=Drummond
On Tue, Feb 5, 2013 at 4:41 PM, Joseph Boyle <planetwork@josephboyle.net> wrote:
To embed XDI in another language, the host language's grammar specifies what constructs are allowed, and we have to obey those, unless it's feasible for us to seek changes in the other language.

To embed XDI in XDI we can specify a syntax - but let's consider the actual use cases for embedding XDI in XDI as strings instead of parseable XDI statements.
On Feb 5, 2013, at 2:41 PM, Markus Sabadello <markus.sabadello@gmail.com> wrote:
I think I agree we don't need a leading delimiter for IRIs.

I'm now totally against any "bare literals", except in an xref.

I also agree the xdi-text and xdi-json could be useful, but I don't see how this is relevant to the other ABNF questions at hand.

I think in IRIs, parens should be percent-encoded to avoid confusion with XDI xrefs.
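A minimal sketch of what percent-encoding the parentheses could look like, in Python. The function names escape_parens and unescape_parens are hypothetical, and the sketch assumes that only "(" and ")" need escaping; it does not attempt full IRI normalization.

```python
def escape_parens(iri: str) -> str:
    """Percent-encode literal parentheses so they cannot be read as xref delimiters."""
    return iri.replace("(", "%28").replace(")", "%29")


def unescape_parens(iri: str) -> str:
    """Reverse escape_parens; every other percent-escape is left untouched."""
    return iri.replace("%28", "(").replace("%29", ")")
```

One caveat: if the original IRI already contained %28 or %29, unescaping would turn those into literal parentheses as well, so the round trip is only exact for IRIs that did not use those escapes to begin with.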
Markus
On Tue, Feb 5, 2013 at 12:30 AM, Drummond Reed <drummond@connect.me> wrote:
Joseph, this is really cool!

It also highlights the important decision we have to make about bare literals. Seeing it put this way makes it even clearer how streamlined the ABNF can be if we allow bare literals.

Over the weekend I also had an idea about how to deal with the bare literal problem in the first segment of an XDI address when it exists in the wild (which may be an edge case, but still one we need to deal with).

The idea is for the xdi-text rule that I propose in https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion (which is basically to enclose XDI addresses or XDI JSON documents that appear in running text inside square brackets to make them easy to recognize and parse) to use an additional forward slash to prefix the first segment of an XDI address. So instead of an XDI text block that contains an XDI statement consisting of all bare literals looking like:

[abc/def/xyz]

...it would look like this:

[/abc/def/xyz]

The reason I particularly like this is that now an XDI text block in any running text or markup document would be recognizable using just two rules:
An embedded XDI address would always start with [/
An embedded XDI JSON document would always start with [{

Examples of the first rule:
[/=drummond]
[/=drummond/+friend/(http://xdi.org/user/markus)]
Example of the second rule:
[{"=drummond/+friend":["(:http://xdi.org/user/markus)"]}]
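As a rough illustration of the two recognition rules above, here is a minimal Python sketch of a scanner for running text. The name find_xdi_blocks is hypothetical. It assumes an embedded address contains no whitespace and no "]", and it relies on the standard json module to find where an embedded JSON document ends, since the JSON itself may contain "]".

```python
import json
import re

# "[/" followed by a run of characters with no whitespace and no "]", then "]".
ADDRESS_BLOCK = re.compile(r"\[/([^\]\s]+)\]")
_decoder = json.JSONDecoder()


def find_xdi_blocks(text):
    """Return (kind, value) pairs for embedded XDI blocks, in order of appearance."""
    blocks = []
    i = 0
    while True:
        i = text.find("[", i)
        if i == -1:
            break
        if text.startswith("[/", i):
            m = ADDRESS_BLOCK.match(text, i)
            if m:
                blocks.append(("address", m.group(1)))
                i = m.end()
                continue
        elif text.startswith("[{", i):
            try:
                # Parse one JSON value starting at the "{"; end is the index after it.
                obj, end = _decoder.raw_decode(text, i + 1)
            except ValueError:
                pass
            else:
                if text.startswith("]", end):
                    blocks.append(("json", obj))
                    i = end + 1
                    continue
        i += 1  # not an XDI block; keep scanning after this "["
    return blocks
```

Under these assumptions, bracketed text such as [http://www.example.org link name] or [1] is skipped because it starts with neither "[/" nor "[{". A bracketed relative path with no whitespace, such as [/wiki/Page], would still be picked up as an address, so the rules do not remove every possible collision.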
If this approach can solve the problem of bare literals being allowed at the start of a first segment, then the question is: should we allow bare literals at the beginning of any segment in order to support this very streamlined ABNF parsing?

The second question is: should we stay with the current approach of just delimiting an IRI inside a cross-reference by looking for the colon following the scheme name (which is required by IRI syntax), or should we require a leading delimiter? My gut is the same as Joseph's here, which is that it is okay to parse for the colon following the IRI scheme name. Even though this is a little bit slower than just looking for a leading colon, it is simpler because it only requires "wrapping" the URI in parentheses.
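To make the "parse for the colon following the scheme name" option concrete, here is a small Python sketch. The name classify_xref_body is hypothetical. The scheme pattern follows the generic URI/IRI scheme syntax (a letter followed by letters, digits, "+", "-" or "."). The sketch assumes literals cannot contain colons; if they could, a literal such as abc:def would be misclassified as an IRI.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def classify_xref_body(body: str) -> str:
    """Return "iri" if the body starts with a scheme and a colon, else "address"."""
    return "iri" if SCHEME_PREFIX.match(body) else "address"
```

Because XDI context symbols such as "=", "@", "+" and "$" cannot start a scheme name, an address like =drummond/+friend never matches the scheme pattern. The cost of this check is a short scan to the first colon rather than a look at a single leading character.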

How do others feel about these two questions?

=Drummond
On Mon, Feb 4, 2013 at 11:01 AM, Joseph Boyle <planetwork@josephboyle.net> wrote:

Now not up to date, but for reference, I made these the other day with http://railroad.my28msec.com/rr/ui :

<address.png>

<subseg.png>

<xref.png>

<literal.png>

The EBNF input was:

address ::= subseg+ ('/' subseg+ ('/' subseg+)?)?

subseg ::= [=@+$] [*!] (xref | literal)

xref ::= '(' (IRI | address) ')'

literal ::= (iunreserved | pct-encoded | [&;,':])+

On Feb 3, 2013, at 6:58 PM, Joseph Boyle <planetwork@josephboyle.net> wrote:

Drummond,

You are correct, excluding a specific trivial case can actually force more complexity in rules. The old ABNF had some examples of this. This is one reason why allowing a bare literal as a segment seems more natural to me.

The xref rule with added initial colon might need more grouping brackets: "(" [ [ ":" IRI ] / address ] ")"

I actually think allowing simply (http:// … ) with its own noninitial colon as an IRI xref would only add a little (finite) complexity to parsing, as opposed to some of the exponentially growing parse trees we may have been hitting in the past, and would look good for XDI's first-class support of IRIs - I posted a comment on this to https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion earlier today.

Also noted the tel: and sms: schemes can have matched parentheses in their bodies, so if we allow these we may have to allow matched parentheses in IRIs, and do parenthetical depth counting as we parse IRIs, unless we require clients to escape and unescape all the internal parens. If we're scanning for parens, checking for the internal colon after the scheme is not much additional work.
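The parenthetical depth counting mentioned above can be sketched in a few lines of Python. The name find_matching_paren is hypothetical, and the sketch assumes parentheses inside the IRI are always balanced and never percent-encoded.

```python
def find_matching_paren(text: str, open_index: int) -> int:
    """Return the index of the ")" that closes the "(" at open_index, or -1 if none does."""
    if text[open_index] != "(":
        raise ValueError("open_index must point at '('")
    depth = 0
    for i in range(open_index, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1
```

A caller would take the text between the two indices as the xref body and could then check it for a scheme colon in the same pass, which is why the extra check adds little work once the scan for parentheses is already being done.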

Joseph

On Feb 3, 2013, at 2:33 PM, Drummond Reed <drummond@connect.me> wrote:

Joseph,

First, thanks very much for this analysis of the ABNF. I hadn't appreciated it in detail until I studied it after Friday's telecon. Condensing it down to four lines is a FANTASTIC way of seeing the essence of the ABNF.

Based on our discussion on Friday's call, and if you follow the recommendations I posted to https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion (namely, not allowing colons in literals, and using colons to prefix IRIs within cross-references), here's a revised version of your four-line ABNF if bare literals are allowed to begin segments:

OPTION #1: IF BARE LITERALS ARE ALLOWED

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ] ;

subseg = [ "=" / "@" / "+" / "$" ] [ "*" / "!" ] [ xref / literal ] ;

xref = "(" [ ":" IRI / address ] ")";

literal = 1*[ iunreserved / pct-encoded ] ;

If bare literals are NOT allowed, as in the proposal we discussed on Friday, then I could only condense the ABNF into six rules:

OPTION #2: IF BARE LITERALS ARE NOT ALLOWED

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ]

subseg = global / local / xref

global = ( "=" / "@" / "+" / "$" ) [ "*" / "!" ] [ xref / literal ]

local = ( "*" / "!" ) [ xref / literal ]

xref = "(" [ ":" IRI / address / literal ] ")"

literal = 1*[ iunreserved / pct-encoded ]
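For experimenting with Option #2, here is a minimal recursive-descent recognizer in Python. It is a sketch under several assumptions, not a reference parser: all function names are hypothetical; square brackets are read as "optional" (so an empty xref "()" is accepted), except in the literal rule, which is read as one or more characters; iunreserved is approximated by the ASCII unreserved characters only; and the IRI rule is reduced to "a scheme, then everything up to the closing parenthesis of the xref".

```python
import re

GLOBAL = "=@+$"
LOCAL = "*!"
# ASCII-only stand-in for iunreserved / pct-encoded (assumption: no ucschar support).
LITERAL = re.compile(r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})+")
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def parse_literal(s, i):
    m = LITERAL.match(s, i)
    return m.end() if m else None


def parse_iri(s, i):
    # Simplification: check only the scheme, then skip to the ")" that closes the xref.
    if not SCHEME.match(s, i):
        return None
    depth = 0
    for j in range(i, len(s)):
        if s[j] == "(":
            depth += 1
        elif s[j] == ")":
            if depth == 0:
                return j
            depth -= 1
    return None


def parse_xref(s, i):
    if not s.startswith("(", i):
        return None
    body = i + 1
    candidates = []
    if s.startswith(":", body):
        candidates.append(parse_iri(s, body + 1))
    # Alternatives in order: address, literal, empty body.
    candidates += [parse_address(s, body), parse_literal(s, body), body]
    for end in candidates:
        if end is not None and s.startswith(")", end):
            return end + 1
    return None


def parse_subseg(s, i):
    if i >= len(s):
        return None
    if s[i] in GLOBAL:
        i += 1
        if i < len(s) and s[i] in LOCAL:
            i += 1
    elif s[i] in LOCAL:
        i += 1
    else:
        return parse_xref(s, i)  # with no context symbol, a subseg must be an xref
    for parser in (parse_xref, parse_literal):  # optional xref or literal
        end = parser(s, i)
        if end is not None:
            return end
    return i


def parse_segment(s, i):
    end = parse_subseg(s, i)
    if end is None:
        return None
    while True:
        nxt = parse_subseg(s, end)
        if nxt is None:
            return end
        end = nxt


def parse_address(s, i):
    end = parse_segment(s, i)
    if end is None:
        return None
    for _ in range(2):  # at most two further "/"-separated segments
        if not s.startswith("/", end):
            break
        nxt = parse_segment(s, end + 1)
        if nxt is None:
            break
        end = nxt
    return end


def is_valid_address(s):
    return parse_address(s, 0) == len(s)
```

Under this reading, an address made only of bare literals such as abc/def/xyz is rejected, while the same literal inside a cross-reference, as in (abc), is accepted. An IRI cross-reference needs the leading colon, so (:http://xdi.org/user/markus) passes and (http://xdi.org/user/markus) does not.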

Two questions:

Am I missing something - do you see a way to compact it further?

Will there be any real difference in efficiency of parsing between these two (given that Option #2 is actually narrower than Option #1 because it excludes bare literals)?

Thanks,

=Drummond

On Fri, Feb 1, 2013 at 9:34 AM, Joseph Boyle <planetwork@josephboyle.net> wrote:

Markus, thanks for the recognition, glad to be able to help out.

Drummond, do we need to exclude bare literals as segments at the syntax level? It seems to me they may be semantically trivial, but are syntactically consistent.

Just experimenting with finding a minimal set of verification rules (for clarity, omitting naming all the productions we want as parsing results) if bare literals are allowed, the grammar can be as short as:

address = 1*subseg [ "/" 1*subseg [ "/" 1*subseg ] ] ;

subseg = [ "=" / "@" / "+" / "$" ] [ "*" / "!" ] [ xref / literal ] ;

xref = "(" [ IRI / address ] ")";

literal = 1*[ iunreserved / pct-encoded / "&" / ";" / "," / "'" / ":" ] ;

On Jan 31, 2013, at 11:30 PM, Drummond Reed <drummond@connect.me> wrote:

Markus, thanks, this is great work. I have reviewed this and am in agreement with the changes.

The support for a literal as a standalone value at the start of an XDI segment has always been somewhat theoretical, i.e., we originally did it that way to not rule it out (because the preceding slash could be a delimiter). But that does not work for the first segment of an XDI address.

So I agree that it's cleaner to just require all XDI segments to start with delimited subsegments.

I'll add this to the agenda for tomorrow's telecon.

=Drummond

On Thu, Jan 31, 2013 at 5:51 PM, Markus Sabadello <markus.sabadello@xdi.org> wrote:

Hello XDI TC,

Based on implementation experience and some discussions, I added another slightly changed version of the XDI ABNF to the discussion page of the relevant proposal:

https://wiki.oasis-open.org/xdi/XdiAbnf/Discussion

Here's the summary from the page:

1. Some of the changes here are motivated by the insight that the purpose of an ABNF is not only to validate a string against a set of rules, but also to semantically understand the various components of that string.

2. The "xdi-inner-graph" rule is introduced, in order to have an explicit rule for this fundamental XDI construct. This change doesn't affect what is valid XDI and what is not.

3. The "xdi-context" rule is introduced, for the same reason.

4. The "xdi-segment" rule is changed to no longer permit a literal at the beginning. A segment that does not start with a context symbol, and is not a cross-reference, does not appear to be useful, and it might be ambiguous with regard to other rules.

5. The "xref-literal" rule is introduced, in order to still allow literals in cross-references.

I tested this ABNF in the XDI2 library, and it appears to work fine.

In fact, I have recently added to XDI2 support for a new parser library (APG), in addition to the one I had been using before (aParse).

After evaluating them both, my conclusion is that they are both able to handle the XDI ABNF, that they produce the same results, and that APG is about twice as fast as aParse.

So APG will now be standard in XDI2, but aParse is optionally also still supported.

I have spent quite some time thinking about Joseph Boyle's ideas about optimizing the parsing process in smart ways, for example by simply "skipping" from an opening "(" to a closing ")" in order to avoid having to descend deep into the IRI rules. This sounds quite good to me, I just haven't found a way to actually implement that yet in a way that still ensures robustness and correctness of the parsing process. I think it was also Joseph who early on suggested that XRI parsing might be one of the most resource-intensive tasks of an XDI server, and I think that is very right. So while switching to a faster parsing library is a great step, we'll keep looking for further optimizations.

You can use the following tool to experiment with the most recent ABNF proposal I mentioned above:

http://xdi2.projectdanube.org/XRIParser

Markus

--
You received this message because you are subscribed to the Google Groups "XDI2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xdi2+unsubscribe@googlegroups.com.
To post to this group, send email to xdi2@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.