From: Drummond Reed
Sent: Tuesday, March 13, 2007 4:20
To: Chasen, Les; firstname.lastname@example.org;
Subject: RE: [xri] Proposal for
XRI Syntax 2.1 to treat all delimiters as significant
Les, Marty, see [=Drummond] inline in both
your messages below.
From: Chasen, Les
Sent: Tuesday, March 13, 2007 5:57
Subject: Re: [xri] Proposal for
XRI Syntax 2.1 to treat all delimiters as significant
I think you and I are mostly in agreement on the use
of parens. The thing I don't understand is how removing the parens in a
cross reference makes it significantly easier for humans to read and write
[=Drummond] I believe the
motivation for the global ref proposal is very simple and clear: it allows you
to concatenate two or more single-segment absolute XRIs directly, with no
special rules about the use of delimiters or parentheses.
For example, take the following XRIs, all
of which are independently syntactically valid absolute XRIs:
Under the global refs proposal, you can
compose composite XRIs out of these component XRIs via direct concatenation
without any special rules. Examples:
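As a rough sketch of what direct concatenation amounts to, the following Python snippet joins component XRIs with no added delimiters or parentheses. The component XRIs and the function name `concatenate` are hypothetical stand-ins chosen for illustration; they are not the examples from the original message.

```python
# Hypothetical single-segment absolute XRIs. Each begins with a global
# context symbol such as "=", "@", or "+".
components = ["@cordance", "=drummond", "+home"]


def concatenate(*xris: str) -> str:
    """Compose a composite XRI by joining the components as-is,
    adding no delimiters and no parentheses."""
    return "".join(xris)


composite = concatenate(*components)
print(composite)  # @cordance=drummond+home
```

The point of the sketch is only that composition needs no rule beyond string joining, provided each component is itself a valid single-segment absolute XRI.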
I suspect that XRI TC members have been
working with parenthetical encapsulation of cross-references for so long that
the notion of direct concatenation of global references seems foreign. However
my experience with individuals outside of the TC is exactly the opposite. Since
they’ve never seen parenthetically-encapsulated identifiers, direct
concatenation is completely intuitive to them, whereas the concept of
parenthetically-encapsulated identifiers is foreign and must be learned.
<marty>I suspect that Drummond has been working with Global-Refs for so long that the notion of
parentheses to indicate a reference almost seems foreign. When I showed the
v2.0 schema to a colleague at another company, he read an accompanying
paragraph and immediately recognized what was meant by the parentheses
(obviously he's lots smarter than me). So, my experience has been that the
use of parentheses is understandable (which is a good thing because the
Global-Refs proposal still needs parentheses for the cases Drummond described
below). As I read Drummond's experiences about other people's intuitive
reaction, I also thought it would be nice to conduct some sort of survey,
because I agree that by now we're probably all too close to this to recognize
what is intuitive to a new-comer.</marty>
I would go so far as to say that direct
concatenation of identifiers is so simple, intuitive, and useful that as a TC
we should be taking the opposite stance and asking how can we ELIMINATE the
need for parenthetical encapsulation wherever possible. When you take this
perspective, the places where we have found parenthetical encapsulation is
necessary are:
* Using URIs as cross-references (because
URIs have a different syntax)
* Using multi-segment XRIs as
cross-references (because you need to know where the cross-reference ends)
* Using multiple cross-references as a
single cross-reference (because you need to know where the cross-reference
Direct concatenation works in all other
cases, so I submit that these should be the only three cases where
parenthetical encapsulation is necessary.
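The cases listed above can be sketched as a simple check. This is an assumption-laden heuristic written for illustration, not the proposed ABNF: the function names are hypothetical, a URI is approximated by the presence of "://", and a multi-segment XRI by the presence of "/". The third case (several cross-references grouped as one) depends on the author's intent and is not modelled.

```python
def needs_parentheses(ref: str) -> bool:
    """Rough test for when a cross-reference must be parenthesized."""
    # Case 1: a URI used as a cross-reference (URIs have a different syntax).
    # Approximated here by looking for "scheme://".
    if "://" in ref:
        return True
    # Case 2: a multi-segment XRI, where "/" separates segments and the
    # reader needs to know where the cross-reference ends.
    if "/" in ref:
        return True
    # Case 3 (multiple cross-references acting as one) cannot be detected
    # from the string alone, so it is left to the caller.
    return False


def as_cross_reference(ref: str) -> str:
    """Wrap ref in parentheses only when the heuristic says it is needed."""
    return f"({ref})" if needs_parentheses(ref) else ref


print(as_cross_reference("http://example.com"))  # (http://example.com)
print(as_cross_reference("=gmb"))                # =gmb
```

Everything that does not trip one of these checks is left bare, which is the direct-concatenation case.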
See more [=Drummond] inline in
Marty’s message below.
----- Original Message -----
From: Schleiff, Marty <email@example.com>
To: Chasen, Les; Drummond Reed <firstname.lastname@example.org>;
Sent: Tue Mar 13 03:21:42 2007
Subject: RE: [xri] Proposal for XRI Syntax 2.1 to treat all delimiters as
Les, as I understand it Drummond's proposal is NOT to make the parens
optional. If there are parens present, then they were explicitly put
there for a reason (although I can't understand what such a reason would
be), and the XRI is NOT equivalent to a similar XRI without parens. So,
normalization would NOT remove any parens.
[=Drummond] Correct. By
the logic above, if an XRI author has a reason to not use direct concatenation
where it would be syntactically valid, that reason is known to the author and
should not be overridden by XRI normalization rules.
I don't favor the proposal. I want the use of parens to be more
intuitive, and the notion of parens being used to add clarity even if
they are not required is pretty intuitive to me.
[=Drummond] While I
understand this sentiment, I believe it needs to be weighed against the
simplicity and intuitiveness of direct concatenation. I doubt anyone on the TC
would dispute that direct concatenation is the simplest and most intuitive method
of identifier construction there is. After all, every single sentence I’m
writing here is composed of directly concatenated words (where spaces are the
concatenated words in this sentence have no hierarchical relation to the
others. "Every single sentence I'm writing here after all, is composed of
words directly concatenated (where spaces are the delimiter)". My sentence
means the same thing as your sentence in the prior paragraph, even though the
words are in a different order. You can't do that with XRI. If we want to use
natural language as identifiers, then mine would be "the overweight
guy from Boeing that likes XRI, but doesn't like the Global-Refs
proposal", whereas Greg would be either "Greg" or "the
physically fit guy from Boeing that likes XRI, but doesn't have a clue about
Global-refs, but is getting a little tired of waiting for the spec to
mature". Oops! someone might think those aren't good identifiers because
they're not opaque!
[=Drummond] I’ll go
on record as saying that I believe the TC would be making a tremendous mistake
if, simply because of our evolutionary path in figuring this all out, we ignore
the power of direct concatenation in the construction of XRIs. I can’t
count the number of times that developers, when first exposed to XRI syntax,
have asked me, “Can’t you get rid of those complicated
I'm not a hard core developer, I am one of the people who repeatedly whine
about what I consider "over-support" of xrefs. But Drummond, please
recognize that I'm not complaining about parentheses; instead I'm complaining
about supporting the notion of xref in so many of our ABNF rules. For example,
if "$l" really refers to RFC3066, why do we need to support
"$l*(xref)"? An answer I keep hearing is "for extensibility".
Let the people responsible for RFC3066 extend it. If we want some namespace for
languages not covered in RFC3066, we could just set up another namespace (or
anybody else could) to name other languages -- maybe intergalactic languages
aren't covered in RFC3066, so someone might establish "$igl" to hold
inter-galactic languages, or some other namespace not even under "$".
Anyway, enuf ranting about that.</marty>
[=Drummond] But you
don’t need to take my word for it. If it would be more scientifically
objective for us to get the opinions of developers, Internet architects, and
others outside the TC about the relative value of direct concatenation vs.
parenthetical encapsulation, I’d be happy to help organize a feedback
I still suggest that
several of the issues with the earlier "compact syntax" proposal would
not be issues at all if we limited the scope of compact syntax to a
single subsegment. Then "$(http://example.com)=gmb*sub1" would
be equivalent to "$(http://example.com)*(=gmb)*sub1"
and clearly NOT
equivalent to "$(http://example.com)*(=gmb*sub1)".
the proposed normalization rule – that all delimiters be considered
significant – already tells you that these three XRIs are not equivalent.
"proposed normalization rule" only tells us that these three XRIs are
not equivalent IF the proposal gets accepted.</marty>
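To make the disagreement concrete, here is a minimal sketch of what "all delimiters are significant" would mean for the three XRIs quoted above, assuming the proposal is accepted and ignoring any other normalization steps. The function name `equivalent` is hypothetical.

```python
# The three XRIs quoted in the message above.
xris = [
    "$(http://example.com)=gmb*sub1",
    "$(http://example.com)*(=gmb)*sub1",
    "$(http://example.com)*(=gmb*sub1)",
]


def equivalent(a: str, b: str) -> bool:
    """If every delimiter is significant, two XRIs are equivalent only
    when they match character for character."""
    return a == b


# Compare every distinct pair; none of them match.
for i in range(len(xris)):
    for j in range(i + 1, len(xris)):
        print(xris[i], "vs", xris[j], "->", equivalent(xris[i], xris[j]))
```

Under the single-subsegment compact syntax Marty describes, the first two would instead compare as equivalent, which is exactly the difference between the two positions.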
This seems intuitive
to me, because it's more like the mathematical concise interpretation of
parens (as opposed to the grammatical willy-nilly use of parens -- where
it's up to the author to determine if parens have the same meaning as
commas or hyphens -- or if some other punctuation should be used).
[=Drummond] Although I
too have been tempted by a mathematical interpretation of parentheses in the
past, I now believe it's a red herring. Identifiers are not operators. They are
identifiers. They identify resources using a sequence of characters. Therefore
the fewer normalization rules that are necessary to change this sequence of
characters, the better.
mean that parens should be treated exactly like they are in math; instead, I
mean the definition of how to treat parens should be concise (like the math
usage is concise vs. the english usage is pretty free-form).</marty>
Identifiers are used to look up things, and looking things up requires
matching rules, and matching rules are frustrated by syntax that's not
[=Drummond] Agreed. I
believe that the syntax in http://wiki.oasis-open.org/xri/XriCd02/XriAbnf2dot1
is the most precise we’ve ever had, and the corresponding normalization
rules are the simplest we’ve ever had.
I also don't like the new proposal's notion of an empty ref-value
because I can't comprehend what it means. It is certainly not intuitive
to me. I think the notion seems contrived and that it just exists to
make the syntax proposal work - not a very noble reason.
[=Drummond] All the
proposed 2.1 ABNF states is that a ref-value is optional, which is no
different than the 2.0 ABNF. An empty ref-value is no different than an empty
subsegment or an empty segment. We didn’t invent the concept of an empty
segment -- URI and IRI have long explicitly supported that concept. In URI and
IRI syntax, every segment contains a ref-value (they don’t call it that,
but it’s the same thing), and it’s not required, so it can be empty.
[=Drummond] There are
many uses for an empty segment, although I would say that few if any of them
are “intuitive”. Like the number zero, it's inherently a complex
<marty>So far I
have heard zero uses for an empty segment, and I don't understand a single one
of them. Does anyone have an example?</marty>
Sadly, I don't know what the older compact syntax proposal can do for the
XDI use case of double parens (which BTW is yet another use case I don't
[=Drummond] I strongly
believe that at this point we need to generalize from specific use cases to
general principles. We have spent several years developing XRI syntax to meet
an overall set of design principles. The proposal at http://wiki.oasis-open.org/xri/XriCd02/XriAbnf2dot1
is the simplest and most general ABNF we’ve ever had. To Steve’s
point, it is now as close to a completely abstract syntax as we have been able
[=Drummond] The key point
I’m trying to make is that designing the syntax to the Einsteinian
dictate of “as simple as possible but no simpler” means that we
will have done our best to design a syntax that can be adapted to hundreds of
future use cases for identifiers that we have never anticipated, just as we
have use cases today that we never anticipated three years ago. To use
parenthetical encapsulation as an example: who are we to say to the XDI
TC – or any future producer of XRIs and XRI-based identifier
construction algorithms – that double or triple or quadruple parentheses
should not be significant? The fact that the XDI TC was the first to come up
with that usage reinforces for me the importance of keeping the rules as simple
as possible: if all delimiters are significant, then it's very clear what XRIs
can be evaluated as equivalent directly vs. what XRIs must be evaluated as
synonyms via local policy or resolution.
say? The syntax authors! The fact that the XDI TC was the first to come
up with that usage reinforces for me the need to do a better job of
defining the syntax (and maybe even the semantics) than was done at the time
the XDI TC came up with that usage. No slam intended - this is of course
exactly what we're all trying to do.</marty>
Here's a new question inspired by the examples in the previous messages.
Are the following equivalent?
[=Drummond] Again, by the
current proposed normalization rule, none of the four XRIs above are
equivalent. To begin with, they all state that the resource is being identified
in a different global context, so that rules out syntactical equivalence. It
would almost be like saying =drummond and @cordance*drummond should be
syntactically equivalent. They might be proven to be synonyms in that they
identify the same resource via resolution, yes, but to say anything about
syntactical equivalence of identifiers that are not syntactically equivalent
outside of a set of very narrow normalization rules is something I think we
should avoid at all costs.
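The distinction drawn here between syntactic equivalence and synonymy can be sketched as two separate checks. The lookup table below is invented purely for illustration (real XRI resolution is a network protocol, not a dictionary), and the resource label and function names are hypothetical.

```python
# Invented stand-in for resolution results: which resource each XRI
# resolves to. The resource label is an assumption, not real data.
RESOLVES_TO = {
    "=drummond": "resource-1",
    "@cordance*drummond": "resource-1",
}


def syntactically_equivalent(a: str, b: str) -> bool:
    """Equivalence by comparison alone, with every delimiter significant."""
    return a == b


def synonyms(a: str, b: str) -> bool:
    """Synonymy established by resolution: both XRIs are known and
    resolve to the same resource."""
    return a in RESOLVES_TO and b in RESOLVES_TO and RESOLVES_TO[a] == RESOLVES_TO[b]


a, b = "=drummond", "@cordance*drummond"
print(syntactically_equivalent(a, b))  # False
print(synonyms(a, b))                  # True
```

The two XRIs start in different global contexts, so comparison alone says nothing about them; only the resolution step can show that they identify the same resource.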
What determines which GCS to use? I think I'd favor having an xref
directly following a GCS character to be valid for only one of the GCS
characters (probably the $ or the +).
that kind of special exception is antithetical to the concept of a simple,
generalized syntax for XRIs. See my next message to the list regarding this
point, as I believe it’s critical to coming to closure on XRI Syntax 2.1.
when you commented on my statement of preference, you did not attempt to answer
the question; i.e., what determines which GCS to use?</marty>