

Subject: XRI abstract syntax (was RE:Framing the 2.1 ABNF discussion -- the separation of abstract vs concrete syntaxes)


>Steve wrote:
>
>Up through XRI Syntax 2.0, the concrete syntax--reflected in the 2.0
>ABNF--has had a high degree of fidelity with the abstract syntax. For
>example, the concrete syntax has explicit delimiters between the
>subsegments, and cross references are delimited by surrounding parens. (Note
>that it did not have full fidelity, for example, because the first
>subsegment delimiters could be omitted between a left-most GCS character and
>the next subsegment. That is a good thing.)

Actually, Steve, I have come to believe it is a very bad thing. This
seemingly small oversight has come back to bite us in the butt. See below
for the reasons why.

>I do not believe that anyone is proposing that 2.1 be a change to the
>*abstract* XRI syntax. However, because the TC has failed to frame the
>discussion as such, it is difficult for me to tell.

Yes, we are proposing a change to the abstract syntax, to fix exactly the
problem above. See below.

>XRI *does* have an abstract syntax. It is (essentially and informally): "an
>XRI is made up of a sequence of segments, where each segment has zero or
>more subsegments. Each subsegment can be either a terminal or a cross
>reference." (Fragments, queries, persistence, and other minutiae have been
>omitted in this description.)

Unfortunately, when it comes to something as precise as ABNF for global
identifiers, the devil is in the details. My personal belief is that it is
in those minutiae that we glossed over a key issue in the 2.0 ABNF (the
optional star), and my work on the 2.1 ABNF has given me progressively
greater insight as to why that mistake is a much bigger deal than it looked
like at the time. In fact it is exactly what you pointed out: ***because it
means we got the abstract syntax wrong***!!

The quickest way to explain this is to show what I believe the abstract
syntax for XRI should be. Let's start with your explanation:

"An XRI is made up of a sequence of segments, where each segment has zero or
more subsegments."

As beautifully simple as that sounds, I believe it has one key problem that
will become clear in the explanation that follows. But first let's look at
the second half of your definition of the abstract syntax:

"Each subsegment can be either a terminal or a cross reference."

This is where the real problem starts. As much as I'd LIKE to agree with
that, take a look at the following SIX different definitions of a subsegment
(xri-subseg) in the 2.0 ABNF and tell me whether it's true or false:

#1:	xri-subseg        = ( "*" / "!" ) (xref / *xri-pchar)
#2:	xri-subseg-nc     = ( "*" / "!" ) (xref / *xri-pchar-nc)
#3:	xri-subseg-od     = [ "*" / "!" ] (xref / *xri-pchar)
#4:	xri-subseg-od-nz  = [ "*" / "!" ] (xref / 1*xri-pchar)
#5:	xri-subseg-od-nx  = [ "*" / "!" ] 1*xri-pchar-nc
#6:	xri-subseg-pt-nz  = "!" (xref / 1*xri-pchar)

Of these, two (#3 and #4) allow a subsegment to consist of a cross-reference
("xref" in the ABNF -- a paren-delimited string with no preceding
delimiter). However, #1, #2, and #6 REQUIRE an xref to be preceded by a
delimiter. And one definition -- #5 -- doesn't even ALLOW an xref to be part
of a subsegment.

Oops ;-)
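
To make the inconsistency easy to check, here is a rough Python sketch that models the six rules as regular expressions. The PCHAR and XREF patterns are simplified stand-ins I am assuming for illustration (they are NOT the real 2.0 productions, and xri-pchar-nc is treated the same as xri-pchar), and the helper names are hypothetical.

```python
import re

# Simplified stand-ins (assumptions, not the real 2.0 productions).
PCHAR = r"[A-Za-z0-9._~-]"   # used for both xri-pchar and xri-pchar-nc
XREF = r"\([^()]*\)"         # a non-nested, paren-delimited cross reference

# The six subsegment rules, in the same order as the list above.
RULES = {
    "xri-subseg":       rf"[*!](?:{XREF}|{PCHAR}*)",
    "xri-subseg-nc":    rf"[*!](?:{XREF}|{PCHAR}*)",
    "xri-subseg-od":    rf"[*!]?(?:{XREF}|{PCHAR}*)",
    "xri-subseg-od-nz": rf"[*!]?(?:{XREF}|{PCHAR}+)",
    "xri-subseg-od-nx": rf"[*!]?{PCHAR}+",
    "xri-subseg-pt-nz": rf"!(?:{XREF}|{PCHAR}+)",
}

def accepts(rule: str, text: str) -> bool:
    """True if the whole text matches the named (simplified) rule."""
    return re.fullmatch(RULES[rule], text) is not None

def summarize() -> dict:
    """Map each rule to (accepts a bare xref, accepts a delimited xref)."""
    bare, delimited = "(+example)", "!(+example)"
    return {name: (accepts(name, bare), accepts(name, delimited))
            for name in RULES}
```

Under these assumptions, only xri-subseg-od and xri-subseg-od-nz accept the bare "(+example)", and xri-subseg-od-nx rejects the xref in both forms.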

The one place where this highly convoluted set of rules (for which I am just
as culpable as anyone else on the 2.0 ABNF team ;-) falls down the most is
in the issue of the optional star. A conversation with Les and Wil and Trung
on Monday helped me understand why.

Since we WANTED the abstract syntax for XRI to be as simple as, "An XRI is a
series of zero-or-more segments, and each segment is a series of
zero-or-more subsegments", but at the same time we NEEDED special rules
about the first subsegment in a segment (and even MORE special rules for the
first subsegment of the authority segment), we approached the ABNF exactly
that way: we made the abstract structure really simple and then wrote a
whole bunch of special rules to *force* the concrete ABNF into the patterns
we needed.

And some of those special rules turned out to be so hairy that we decided
not to even try to enforce them in the ABNF (because it would have gotten
way too ugly), in particular the
star-is-optional-after-a-GCS-character-or-a-slash-but-not-anywhere-else
rule.

So when we started the 2.1 Syntax work, our first resolution was to
get rid of the optional star character in the ABNF (this was before the
proposal-formerly-known-as-compact-syntax even existed). As I looked at how
to do that, I had to work out exactly what you suggested -- the REAL
abstract syntax of XRI -- and the resulting ABNF ended up reflecting it
precisely.

First, let me state it in prose, as a series of bullet points:

* An XRI is made up of a sequence of one or more segments.
* A segment may contain zero or more subsegments preceded by an optional
value.
* A subsegment may be either a global reference or a local reference.  
* A global reference may contain an optional value and zero or more local
references.
* A local reference may contain only an optional value.
* A value may be either one or more characters or an encapsulated reference.

Now, here's the same thing in ABNF (these rules are from
http://wiki.oasis-open.org/xri/XriCd02/XriAbnf2dot1, only reordered to match
the abstract logic above):

xri-hier-part     = xri-authority xri-path-abempty 
xri-authority     = 1*global-ref
xri-path-abempty  = *( "/" xri-segment )
xri-segment       = [ ref-value ] *xri-subseg
xri-subseg        = global-ref
                  / local-ref
global-ref        = gcs-char [ ref-value ] *local-ref
local-ref         = lcs-char [ ref-value ]
gcs-char          = "=" / "@" / "+" / "$"
lcs-char          = "*" / "!"
ref-value         = encap-ref
                  / 1*xri-pchar
encap-ref         = "(" encap-ref-value ")"
encap-ref-value   = xri-reference
                  / iri
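
As a sanity check on the rules above, here is a minimal recursive-descent sketch in Python. It rests on several assumptions of mine: any character that is not a delimiter, slash, paren, "?" or "#" counts as an xri-pchar; an encap-ref is only checked for balanced parens (its contents are not validated as an xri-reference or iri); and a global-ref greedily absorbs the local-refs that follow it. All function names are hypothetical.

```python
GCS = "=@+$"
LCS = "*!"
STOP = GCS + LCS + "/()?#"   # simplification: anything else is an xri-pchar

def parse_ref_value(s, i):
    """ref-value = encap-ref / 1*xri-pchar. Returns (value or None, next index)."""
    if i < len(s) and s[i] == "(":
        depth = 0
        for j in range(i, len(s)):
            if s[j] == "(":
                depth += 1
            elif s[j] == ")":
                depth -= 1
                if depth == 0:
                    return s[i:j + 1], j + 1
        raise ValueError("unbalanced parentheses")
    j = i
    while j < len(s) and s[j] not in STOP:
        j += 1
    return (s[i:j], j) if j > i else (None, i)

def parse_local_ref(s, i):
    """local-ref = lcs-char [ ref-value ]"""
    value, j = parse_ref_value(s, i + 1)
    return {"type": "local", "delim": s[i], "value": value}, j

def parse_global_ref(s, i):
    """global-ref = gcs-char [ ref-value ] *local-ref"""
    value, j = parse_ref_value(s, i + 1)
    local_refs = []
    while j < len(s) and s[j] in LCS:
        ref, j = parse_local_ref(s, j)
        local_refs.append(ref)
    return {"type": "global", "delim": s[i], "value": value,
            "locals": local_refs}, j

def parse_segment(s, i):
    """xri-segment = [ ref-value ] *xri-subseg"""
    value, j = parse_ref_value(s, i)
    subsegs = []
    while j < len(s) and s[j] in GCS + LCS:
        parser = parse_global_ref if s[j] in GCS else parse_local_ref
        ref, j = parser(s, j)
        subsegs.append(ref)
    return {"value": value, "subsegs": subsegs}, j

def parse_hier_part(s):
    """xri-hier-part = xri-authority xri-path-abempty"""
    authority, i = [], 0
    while i < len(s) and s[i] in GCS:        # xri-authority = 1*global-ref
        ref, i = parse_global_ref(s, i)
        authority.append(ref)
    if not authority:
        raise ValueError("xri-authority needs at least one global-ref")
    path = []
    while i < len(s) and s[i] == "/":        # *( "/" xri-segment )
        seg, i = parse_segment(s, i + 1)
        path.append(seg)
    if i != len(s):
        raise ValueError(f"unexpected character at offset {i}")
    return {"authority": authority, "path": path}
```

For instance, parse_hier_part("@example*foo/bar+baz") yields one global-ref "@" with value "example" and a local-ref "*foo", followed by one path segment whose leading ref-value is "bar" and whose only subsegment is the global-ref "+baz". An input such as "!!1003" is rejected because "!" is not a gcs-char in this grammar.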

IMHO, the key difference in this abstract syntax vs. the 2.0 abstract syntax
is that it explicitly contains the notion of a *reference value*
("ref-value" in the ABNF). A ref-value is NOT a subsegment. A subsegment is
ALWAYS a delimiter followed by an optional ref-value. This is how we can now
cleanly differentiate between:

	=example
	=*example
	=!example

In the first case, we have a global reference whose ref-value is "example".
In the second case, we have a global reference whose ref-value is EMPTY
followed by a local reference of "*example", which is a reassignable
reference to the ref-value of "example". In the third case, we have a global
reference whose ref-value is EMPTY followed by a local reference of
"!example", which is a persistent reference to the ref-value of "example".

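The three readings above can be reproduced with a small Python sketch. It handles only a single global-ref with plain-character ref-values (no encap-refs), treats any non-delimiter character as an xri-pchar, and represents an empty ref-value as the empty string. Those simplifications and the name decompose are my own assumptions for illustration.

```python
import re

# One global-ref: gcs-char, optional plain ref-value, then zero or more local-refs.
GLOBAL_REF = re.compile(
    r"(?P<gcs>[=@+$])"
    r"(?P<value>[^=@+$*!/()]*)"
    r"(?P<locals>(?:[*!][^=@+$*!/()]*)*)"
)
LOCAL_REF = re.compile(r"([*!])([^=@+$*!/()]*)")
KINDS = {"*": "reassignable", "!": "persistent"}

def decompose(xri: str):
    """Return (gcs-char, global ref-value, [(kind, local ref-value), ...])."""
    m = GLOBAL_REF.fullmatch(xri)
    if m is None:
        raise ValueError(f"not a single global-ref: {xri!r}")
    local_refs = [(KINDS[d], v) for d, v in LOCAL_REF.findall(m["locals"])]
    return m["gcs"], m["value"], local_refs
```

With this sketch, decompose("=example") gives a global ref-value of "example" and no local references, while decompose("=*example") and decompose("=!example") both give an empty global ref-value followed by one local reference to "example", reassignable in the first case and persistent in the second.
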
The other key thing this syntax clears up is the role of an *encapsulated
reference* ("encap-ref" in the ABNF, which is
the-rule-formerly-known-as-xref). Under this syntax, an encap-ref is ALWAYS
a ref-value.

Lastly, to Gabe's point about how much of an impact this revised ABNF would
have on XRI 2.0 installations. My answer: none. Every XRI that validates
under 2.0 will validate under 2.1 with a single exception (see below). 2.1
will allow XRIs that are not valid under 2.0 (specifically those that
contain global-refs), but that's loosening the rules, not tightening the
rules, so that should have no impact on currently deployed XRIs.

The only exception is global XRIs assigned under the ! GCS symbol. As I have
discussed with Les, these XRIs now need to resolve under the @ space, i.e.,
!!1003 needs to become @!!1003. Since !! XRIs have not yet been widely
deployed, I am confident that we can deal smoothly with that one transition.
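
For what it's worth, the rewrite involved is mechanical. Here is a hedged Python sketch, assuming the XRI is given in its bare form (no "xri://" prefix) and that a leading "!" can only be the old ! GCS symbol. The function name is hypothetical.

```python
def migrate_bang_gcs(xri: str) -> str:
    """Move an XRI rooted at the old "!" GCS symbol under the "@" space."""
    # Assumption: a leading "!" always denotes the 2.0 "!" global context symbol.
    return "@" + xri if xri.startswith("!") else xri
```

So "!!1003" becomes "@!!1003", and XRIs that already start with another GCS character are left untouched.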

The only other impact is on XRI resolution rules. These can now be defined
very precisely as described in the latter part of:

	http://wiki.oasis-open.org/xri/XriCd02/GlobalRefs

This impact should have no effect on existing deployed XRIs. Most
importantly, since updates to the XDI.org GRS and OpenXRI code bases are
currently waiting on us finishing XRI Resolution 2.0, and that is currently
gated on closing on the 2.1 ABNF, closing on this ABNF is now our top
priority. That's why it is a key subject of tomorrow's 10AM telecon (for
which the agenda will go out shortly.)

=Drummond 


