Subject: XRI abstract syntax (was RE:Framing the 2.1 ABNF discussion -- the separation of abstract vs concrete syntaxes)
>Steve wrote:
>
>Up through XRI Syntax 2.0, the concrete syntax--reflected in the 2.0
>ABNF--has had a high degree of fidelity with the abstract syntax. For
>example, the concrete syntax has explicit delimiters between the
>subsegments, and cross references are delimited by surrounding parens. (Note
>that it did not have full fidelity, for example, because the first
>subsegment delimiters could be omitted between a left-most GCS character and
>the next subsegment. That is a good thing.)

Actually, Steve, I have come to believe it is a very bad thing. This seemingly small oversight has come back to bite us in the butt. See below for the reasons why.

>I do not believe that anyone is proposing that 2.1 be a change the
>*abstract* XRI syntax. However, because the TC has failed to frame the
>discussion as such, it is difficult for me to tell.

Yes, we are proposing a change to the abstract syntax, to fix exactly the problem above. See below.

>XRI *does* have an abstract syntax. It is (essentially and informally): "an
>XRI is made up of a sequence of segments, where each segment has zero or
>more subsegments. Each subsegment can be either a terminal or a cross
>reference." (Fragments, queries, persistence, and other minutiae have been
>omitted in this description.)

Unfortunately, when it comes to something as precise as ABNF for global identifiers, the devil is in the details. My personal belief is that it is in those minutiae that we glossed over a key issue in the 2.0 ABNF (the optional star), and my work on the 2.1 ABNF has given me progressively greater insight as to why that mistake is a much bigger deal than it looked at the time. In fact it is exactly what you pointed out: ***because it means we got the abstract syntax wrong***!!

The quickest way to explain this is to show what I believe the abstract syntax for XRI should be. Let's start with your explanation:

"An XRI is made up of a sequence of segments, where each segment has zero or more subsegments."

As beautifully simple as that sounds, I believe it has one key problem that will become clear in the explanation that follows. But first let's look at the second half of your definition of the abstract syntax:

"Each subsegment can be either a terminal or a cross reference."

This is where the real problem starts. As much as I'd LIKE to agree with that, take a look at the following SIX different definitions of a subsegment (xri-subseg) in the 2.0 ABNF and tell me whether it's true or false:

    #1: xri-subseg       = ( "*" / "!" ) (xref / *xri-pchar)
    #2: xri-subseg-nc    = ( "*" / "!" ) (xref / *xri-pchar-nc)
    #3: xri-subseg-od    = [ "*" / "!" ] (xref / *xri-pchar)
    #4: xri-subseg-od-nz = [ "*" / "!" ] (xref / 1*xri-pchar)
    #5: xri-subseg-od-nx = [ "*" / "!" ] 1*xri-pchar-nc
    #6: xri-subseg-pt-nz = "!" (xref / 1*xri-pchar)

Of these, two (#3 and #4) allow a subsegment to consist of a bare cross-reference ("xref" in the ABNF -- a paren-delimited string with no preceding delimiter). However, #1, #2, and #6 REQUIRE an xref to be preceded by a delimiter. And one definition -- #5 -- doesn't even ALLOW an xref to be part of a subsegment. Oops ;-)

The one place where this highly convoluted set of rules (for which I am just as culpable as anyone else on the 2.0 ABNF team ;-) falls down the most is the issue of the optional star. A conversation with Les and Wil and Trung on Monday helped me understand why.

Since we WANTED the abstract syntax for XRI to be as simple as, "An XRI is a series of zero-or-more segments, and each segment is a series of zero-or-more subsegments", but at the same time we NEEDED special rules about the first subsegment in a segment (and even MORE special rules for the first subsegment of the authority segment), we approached the ABNF exactly that way: we made the abstract structure really simple and then wrote a whole bunch of special rules to *force* the concrete ABNF into the patterns we needed.
And some of those special rules turned out to be so hairy that we decided not to even try to enforce them in the ABNF (because it would have gotten way too ugly), in particular the star-is-optional-after-a-GCS-character-or-a-slash-but-not-anywhere-else rule.

So when we started work on the 2.1 Syntax, our first resolution was to get rid of the optional star character in the ABNF (this was before the proposal-formerly-known-as-compact-syntax even existed). As I looked at how to do that, I had to work out exactly what you suggested -- the REAL abstract syntax of XRI -- and the resulting ABNF ended up reflecting it precisely.

First, let me state it in prose, as a series of bullet points:

* An XRI is made up of a sequence of one or more segments.
* A segment may contain zero or more subsegments preceded by an optional value.
* A subsegment may be either a global reference or a local reference.
* A global reference may contain an optional value and zero or more local references.
* A local reference may contain only an optional value.
* A value may be either one or more characters or an encapsulated reference.

Now, here's the same thing in ABNF (these rules are from http://wiki.oasis-open.org/xri/XriCd02/XriAbnf2dot1, only reordered to match the abstract logic above):

    xri-hier-part    = xri-authority xri-path-abempty
    xri-authority    = 1*global-ref
    xri-path-abempty = *( "/" xri-segment )
    xri-segment      = [ ref-value ] *xri-subseg
    xri-subseg       = global-ref / local-ref
    global-ref       = gcs-char [ ref-value ] *local-ref
    local-ref        = lcs-char [ ref-value ]
    gcs-char         = "=" / "@" / "+" / "$"
    lcs-char         = "*" / "!"
    ref-value        = encap-ref / 1*xri-pchar
    encap-ref        = "(" encap-ref-value ")"
    encap-ref-value  = xri-reference / iri

IMHO, the key difference in this abstract syntax vs. the 2.0 abstract syntax is that it explicitly contains the notion of a *reference value* ("ref-value" in the ABNF). A ref-value is NOT a subsegment. A subsegment is ALWAYS a delimiter followed by an optional ref-value.
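
To make the delimiter/ref-value split concrete, here is a minimal Python sketch of the authority rules (global-ref, local-ref, ref-value) from the ABNF above. It is an illustration only, not part of the proposed ABNF or of any XRI code base. The function names `read_ref_value` and `parse_authority` are hypothetical, and two simplifying assumptions are made: any character that is not a delimiter, a paren, or a slash is treated as an xri-pchar, and the contents of an encap-ref are only checked for balanced parens, not validated as an xri-reference or IRI.

```python
GCS_CHARS = set("=@+$")   # gcs-char
LCS_CHARS = set("*!")     # lcs-char
# Assumption: every character outside this set counts as an xri-pchar.
STOP_CHARS = GCS_CHARS | LCS_CHARS | set("()/")


def read_ref_value(text, pos):
    """Read an optional ref-value at pos; return (value, new_pos).

    The value is "" when no ref-value is present.
    """
    if pos < len(text) and text[pos] == "(":
        # encap-ref: consume up to the matching close paren.
        depth = 0
        for end in range(pos, len(text)):
            if text[end] == "(":
                depth += 1
            elif text[end] == ")":
                depth -= 1
                if depth == 0:
                    return text[pos:end + 1], end + 1
        raise ValueError("unbalanced parenthesis in encap-ref")
    # Otherwise: zero or more plain characters up to the next delimiter.
    end = pos
    while end < len(text) and text[end] not in STOP_CHARS:
        end += 1
    return text[pos:end], end


def parse_authority(text):
    """Parse xri-authority = 1*global-ref into a list of global-refs."""
    global_refs = []
    pos = 0
    while pos < len(text):
        delim = text[pos]
        if delim in GCS_CHARS:
            value, pos = read_ref_value(text, pos + 1)
            global_refs.append({"gcs": delim, "value": value, "local_refs": []})
        elif delim in LCS_CHARS:
            if not global_refs:
                raise ValueError("authority must start with a gcs-char")
            value, pos = read_ref_value(text, pos + 1)
            global_refs[-1]["local_refs"].append((delim, value))
        else:
            raise ValueError(f"unexpected character {delim!r} at offset {pos}")
    if not global_refs:
        raise ValueError("authority needs at least one global-ref")
    return global_refs
```

Under those assumptions, `parse_authority("=example")` yields one global-ref whose value is "example", `parse_authority("=*example")` yields one global-ref with an empty value and the single local reference `("*", "example")`, and `parse_authority("!!1003")` raises because an authority must start with a gcs-char. The whole sketch hangs on the one rule stated above: a subsegment is always a delimiter followed by an optional ref-value.
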
This is how we can now cleanly differentiate between:

    =example
    =*example
    =!example

In the first case, we have a global reference whose ref-value is "example". In the second case, we have a global reference whose ref-value is EMPTY followed by a local reference of "*example", which is a reassignable reference to the ref-value of "example". In the third case, we have a global reference whose ref-value is EMPTY followed by a local reference of "!example", which is a persistent reference to the ref-value of "example".

The other key thing this syntax clears up is the role of an *encapsulated reference* ("encap-ref" in the ABNF, which is the-rule-formerly-known-as-xref). Under this syntax, an encap-ref is ALWAYS a ref-value.

Lastly, to Gabe's point about how much of an impact this revised ABNF would have on XRI 2.0 installations, my answer is: none. Every XRI that validates under 2.0 will validate under 2.1 with a single exception (see below). 2.1 will allow XRIs that are not valid under 2.0 (specifically those that contain global-refs), but that's loosening the rules, not tightening the rules, so that should have no impact on currently deployed XRIs.

The only exception is global XRIs assigned under the ! GCS symbol. As I have discussed with Les, these XRIs now need to resolve under the @ space, i.e., !!1003 needs to become @!!1003. Since !! XRIs have not yet been widely deployed, I am confident that we can deal smoothly with that one transition.

The only other impact is on XRI resolution rules. These can now be defined very precisely as described in the latter part of:

http://wiki.oasis-open.org/xri/XriCd02/GlobalRefs

This impact should have no effect on existing deployed XRIs.

Most importantly, since updates to the XDI.org GRS and OpenXRI code bases are currently waiting on us finishing XRI Resolution 2.0, and that is currently gated on closing on the 2.1 ABNF, closing on this ABNF is now our top priority.
That's why it is a key subject of tomorrow's 10AM telecon (for which the agenda will go out shortly).

=Drummond