Subject: RE: [xri] Issue 1 Subthread - freeing colon for producer-specific algorithms
[Note: I changed the subject of this thread to introduce a subthread about an important aspect of this decision. I just wish email supported this type of subthreading better.]

Dave has said he believes RFC2396bisv6 does not allow use of a reserved character within a segment if there is no defined meaning for that character within a segment. However, I have read section 2.2 on reserved characters closely, and I believe it explicitly allows the use of reserved characters that are not defined as delimiters within a segment. The full text of section 2.2 is quoted at the end of this message, but the specific sentence I would highlight is:

"Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI."

What this means is that if colon is NOT reserved by the XRI spec as a "scheme-specific delimiter", but only as a subsegment decorator (to use Gabe's term for a character that only has a defined meaning when used in first position after another delimiter), then colon is free to be used elsewhere within a subsegment as determined by "producer-specific algorithms".

I believe this is a very significant benefit of not defining colon as a delimiter. Because we have defined so many of the URI reserved characters as scheme-specific delimiters in XRI syntax, very few characters are left for producer-specific algorithms to use as delimiters. I believe colon is a particularly attractive character for this purpose (second only to dot).

About a month ago I posted an example of one potential producer-specific algorithm (in this case for XRI authority subsegments) in which it would be attractive to use colons. I can only imagine that there are many more.
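To make the decorator-versus-delimiter distinction concrete, here is a minimal Python sketch. It is not taken from the XRI spec or from the earlier example mentioned above. The function names `scheme_view` and `producer_view` are hypothetical, and the sketch assumes that a colon in first position marks a persistent subsegment while any later colon has no scheme-level meaning.

```python
def scheme_view(subsegment):
    """What a scheme-level parser would see (assumed behaviour).

    Assumption: a colon in FIRST position is a decorator marking the
    subsegment as persistent. Colons anywhere else are ordinary data
    as far as the scheme is concerned.
    """
    persistent = subsegment.startswith(":")
    value = subsegment[1:] if persistent else subsegment
    return persistent, value  # value is opaque at the scheme level


def producer_view(value):
    """What one hypothetical producer-specific algorithm might do.

    Assumption: this particular producer chooses colon as its own
    internal field separator. Other producers could choose otherwise.
    """
    return value.split(":")


persistent, value = scheme_view(":12:34")
print(persistent, value)      # True 12:34
print(producer_view(value))   # ['12', '34']
```

Under these assumptions a resolver treats `:12:34` as a single persistent subsegment, and only the producer that minted it knows that `12` and `34` are separate fields.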
The other advantage is that this preserves backwards compatibility with XRI 1.0 XRIs, because the colons that appear in these as scheme-specific delimiters under XRI 1.0 syntax would still be legal as producer-specific delimiters under XRI 1.1. The only difference is how colons in the XRI authority segment would be interpreted by XRI 1.1 resolvers.

=Drummond

The relevant text from http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html#reserved is quoted below:

2.2 Reserved Characters

URIs include components and subcomponents that are delimited by characters in the "reserved" set. These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax, by each scheme-specific syntax, or by the implementation-specific syntax of a URI's dereferencing algorithm. If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before forming the URI.

   reserved    = gen-delims / sub-delims

   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

A subset of the reserved characters (gen-delims) are used as delimiters of the generic URI components described in Section 3.
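[Editorial aside on the paragraph just quoted: the "protected from normalization" property can be illustrated with a short Python sketch. The `normalize` function is hypothetical and is not any library's API. The reserved set is the one quoted above; the unreserved set (letters, digits, "-", ".", "_", "~") is the one in the final RFC 3986 and is assumed here to apply.]

```python
import re
import string

GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS
# Assumption: unreserved set as defined in the final RFC 3986.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize(uri):
    """Percent-encoding normalization (hypothetical helper).

    Decodes %XX only when the octet is an unreserved character and
    uppercases the hex digits of every other escape. Reserved
    characters are never encoded or decoded, so "a:b" and "a%3Ab"
    remain distinct.
    """
    def fix(match):
        hex_digits = match.group(1)
        char = chr(int(hex_digits, 16))
        if char in UNRESERVED:
            return char
        return "%" + hex_digits.upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, uri)


print(normalize("a%2Eb"))   # a.b    (unreserved dot is decoded)
print(normalize("a%3ab"))   # a%3Ab  (reserved colon stays encoded)
print(normalize("a:b") == normalize("a%3Ab"))   # False
```

Because the literal colon and `%3A` never collapse into one another, a producer can rely on a literal colon surviving normalization as its delimiter and use `%3A` when a colon is meant as data.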
A component's ABNF syntax rule will not use the reserved or gen-delims rule names directly; instead, each syntax rule lists the characters allowed within that component (i.e., not delimiting it) and any of those characters that are also in the reserved set are "reserved" for use as subcomponent delimiters within the component. Only the most common subcomponents are defined by this specification; other subcomponents may be defined by a URI scheme's specification, or by the implementation-specific syntax of a URI's dereferencing algorithm, provided that such subcomponents are delimited by characters in the reserved set allowed within that component.

URI producing applications should percent-encode data octets that correspond to characters in the reserved set. However, if a reserved character is found in a URI component and no delimiting role is known for that character, then it should be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.

--- Dave McAlpin <Dave.McAlpin@epok.net> wrote:
> Are you suggesting that the : between 12 and 34 would be considered a
> regular character, not a delimiter? If so, I don't think that's legal
> per 2396bis.
>
> Dave
>
> > -----Original Message-----
> > From: Fen Labalme [mailto:fen@idcommons.org]
> > Sent: Thursday, July 08, 2004 11:03 AM
> > To: Loren West
> > Cc: xri@lists.oasis-open.org
> > Subject: Re: [xri] Issue 1: Clarifying * Semantics
> >
> > Loren -
> >
> > Note that :12:34 would still be a legal persistent identifier, it just
> > would not imply a separation (or delegation) between two parts. In other
> > words, it is similar to the identifier :12.34 (using the new semantics
> > for dot as a regular character).
> >
> > In my strongly held opinion, if we are going to make any
> > simplifications, they should be aimed at making the semantics easier
> > to understand and the human friendly identifiers simpler and easier to
> > read and (humanly) parse. I believe that is what this proposed
> > simplification does. If it does so at a slight cost to the human
> > readability of non-human (machine) friendly identifiers, that's a good
> > decision.
> >
> > Fen
> >
> >
> > Loren West wrote:
> > > I understand how you see a single separator as a simplification,
> > > and hope you can understand how I see ":" as a simplification
> > > over "*:". They're both "simpler", but one doesn't require
> > > a change to the specification.