xri message

Subject: RE: [xri] Stable XRI 1.1 ABNF

From: "Dave McAlpin" <Dave.McAlpin@epok.net>
To: "Wachob, Gabe" <gwachob@visa.com>, "Lindelsee, Mike " <mlindels@visa.com>, <xri@lists.oasis-open.org>
Date: Fri, 29 Oct 2004 09:54:14 -0400

>2. ireg-name (the production that is used to allow DNS names) allows strings that
>    are not legal DNS names. While IRI allows this, our opinion is that in XRI,
>    we should further restrict this to support only valid DNS names. This turns out
>    to be quite a chore, though, if we allow internationalized DNS names.

I disagree. IRI defines the transformation from ireg-name to a legal DNS name. The idea is to allow a fully internationalized IRI and just to refer to the IRI spec for the rules that convert it to a legal URI. I really don't want to reproduce that transformation.

[Wachob, Gabe]
Actually, what IRI says is sorta disappointing. It says that if a scheme declares that the ireg-name is a DNS name, but doesn't allow %-escaping in the ireg-name (section 3.1), then you use an IRI-defined algorithm (RFC 3490, including punycode). So if we allow MORE than DNS names, then implementations can't know a priori whether to convert the ireg-name to a punycoded DNS name. So we need to declare that the ireg-name is always DNS or we can't rely on section 3.1.

Well, we already know that ireg-name isn't always DNS (XRI authorities end up getting put into ireg-name when IRI conversion occurs), so we can't rely on the IRI conversion specified in section 3.1. We at least have to say that sometimes you do punycode conversion (or %-escaping) and sometimes you do straight %-escaping. And when you do each depends on XRI-specific rules. So at the very least, we have to add a step at the beginning of IRI section 3.1...
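
To make the two conversion paths concrete, here is a minimal Python sketch. It assumes the caller already knows, by some XRI-specific rule not defined in this thread, whether the ireg-name is a DNS name; the function name `authority_to_uri` and the `is_dns_name` flag are hypothetical. Python's built-in "idna" codec applies the RFC 3490 conversion, and `urllib.parse.quote` does plain %-escaping of the UTF-8 bytes.

```python
from urllib.parse import quote


def authority_to_uri(ireg_name: str, is_dns_name: bool) -> str:
    """Convert an ireg-name to its URI form (illustrative sketch only).

    is_dns_name is an assumption: the XRI-specific rule that decides
    it is exactly the open question in this thread.
    """
    if is_dns_name:
        # RFC 3490 ToASCII applied label by label (punycode with "xn--" prefix).
        return ireg_name.encode("idna").decode("ascii")
    # Not a DNS name: straight %-escaping of the UTF-8 encoding.
    # safe="" escapes everything except unreserved characters, which is
    # stricter than a real XRI rule would probably need.
    return quote(ireg_name, safe="")


dns_form = authority_to_uri("bücher.example", is_dns_name=True)
escaped_form = authority_to_uri("bücher", is_dns_name=False)
```

The same input string yields two different URI forms depending on the flag, which is why an implementation cannot pick the conversion without an extra rule.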

Now, how do we determine when an ireg-name is a DNS name or not? Well, not all strings can be DNS names. For example, there is no way that $ or ( or ^ or ; or \ or | can appear in an internationalized DNS name (even though the euro symbol can). This is because ASCII chars under hex 7F are not converted by the punycode conversion, whereas all other Unicode characters are. So we could get into a situation where, if we use the plain ireg-name production without further restriction, we could get authority names that cannot possibly be DNS or XRI authorities.
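
The argument above can be sketched as a necessary (not sufficient) test in Python. The helper name `could_be_dns_name` is hypothetical, and the check assumes the usual letter-digit-hyphen rule for ASCII characters in host labels; it ignores label length limits, hyphen placement, and any characters that the IDNA preparation step itself prohibits.

```python
import string

# ASCII characters that may appear in a DNS label (letters, digits, hyphen).
LDH = set(string.ascii_letters + string.digits + "-")


def could_be_dns_name(ireg_name: str) -> bool:
    """Return False if the name contains an ASCII character that punycode
    conversion would leave in place and that DNS labels do not allow."""
    for label in ireg_name.split("."):
        for ch in label:
            # Non-ASCII characters are assumed to be handled by the
            # punycode step; ASCII ones pass through unchanged.
            if ord(ch) < 0x80 and ch not in LDH:
                return False
    return True


plain = could_be_dns_name("example.com")
with_dollar = could_be_dns_name("foo$bar.com")
with_euro = could_be_dns_name("€uro.example")
```

Under these assumptions a name containing `$` is rejected while one containing the euro symbol is not, matching the distinction drawn above.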

We thought it would simply be better to restrict the use of ireg-name production in XRI to legal internationalized DNS names so this situation does not come up.

[Dave McAlpin]

This sounds like something we should discuss further. Since I don't think it affects the BNF, can we leave it as an open issue and revisit it when we draft the transformation rules?

>4. There doesn't seem to be a need to restrict the first segment in the path to
>    being a non-null segment. IRI needs to do this to avoid ambiguity, but the XRI
>    ABNF doesn't have this ambiguity. Our recommendation is to replace the
>    xri-path-absolute production as follows:
>
>        xri-path-absolute = "/" [ xri-segment ] *[ "/" xri-segment]
>
>    This would allow us to get rid of the xri-segment-nz and xri-subseg-od-nz
>    productions.
>

I considered this, but disallowing //foo does two things - it avoids a potentially misleading XRI (in //foo, foo looks like an authority) and it avoids an awkward rule when transforming to a URI. I left this in on purpose and, unless there's a compelling reason to allow it, I suggest we keep the current restriction.

[Wachob, Gabe]
Wait, //foo doesn't look like an authority without the scheme (xri:). No more than the HTTP URL //foo looks like an absolute HTTP URL. And "//foo" is a perfectly legal relative HTTP URL. I would argue for consistency - if "//foo" is a legal HTTP relative URL, then we should allow a similar XRI URL. And what's the awkward conversion rule? I don't see the complication.

[Dave McAlpin]

As a URI it does. If "//foo" were a URI reference, "foo" would be interpreted as an authority (see URI-reference, relative-ref, relative-part). The transformation isn't all that bad I guess - we'd just need to express it as ".//foo" when it was converted to a URI. My preference, though, is to have the same rule (and the same expression as ".//foo") in XRI to avoid confusion.
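
For illustration, a generic URI parser shows the ambiguity. The sketch below uses Python's standard `urllib.parse.urlsplit`; the helper `protect_leading_double_slash` is a hypothetical name for the ".//foo" rewrite described above, not something defined by the XRI drafts.

```python
from urllib.parse import urlsplit


def protect_leading_double_slash(ref: str) -> str:
    """Prefix "." so a reference starting with "//" is not read as an authority."""
    return "." + ref if ref.startswith("//") else ref


# As a bare URI reference, "foo" lands in the authority component.
ambiguous = urlsplit("//foo")

# After the rewrite, the whole string is parsed as a path.
rewritten = protect_leading_double_slash("//foo")
unambiguous = urlsplit(rewritten)
```

Here `ambiguous.netloc` is "foo", while `unambiguous.netloc` is empty and `unambiguous.path` is ".//foo".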

Dave