Subject: RE: [xri] Issue 1 Subthread - freeing colon for producer-specific algorithms
[Note: I changed the subject of this thread to introduce a subthread
about an important aspect of this decision. I just wish email would
better support this type of subthreading.]

Dave has said he believes RFC2396bisv6 does not allow use of a reserved
character within a segment if there is no defined meaning for that
character within a segment. However, I have read section 2.2 on reserved
characters closely and I believe it explicitly allows the use of
reserved characters that are not defined as delimiters within a segment.

The full text of section 2.2 is quoted at the end of this message, but
the specific sentence I would highlight is: "Thus, characters in the
reserved set are protected from normalization and are therefore safe to
be used by scheme-specific and producer-specific algorithms for
delimiting data subcomponents within a URI."

What this means is that if colon is NOT reserved by the XRI spec as a
"scheme-specific delimiter", but only as a subsegment decorator (to use
Gabe's term for a character that only has a defined meaning when used in
first position after another delimiter), then it frees colon to be used
elsewhere within a subsegment as determined by "producer-specific
algorithms".
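
To make the decorator idea concrete, here is a minimal sketch in Python.
It is an illustration only, not text from any XRI draft. The function
names are hypothetical, and it assumes that a leading colon marks a
persistent subsegment while any later colon has no XRI-defined meaning.

```python
def split_decorator(subsegment: str):
    """Resolver-side view (assumed): only a leading ':' has XRI meaning."""
    if subsegment.startswith(":"):
        # Persistent subsegment; the remainder is opaque to the resolver.
        return True, subsegment[1:]
    return False, subsegment


def producer_fields(value: str):
    """Producer-side view (hypothetical algorithm): ':' separates fields."""
    return value.split(":")


persistent, value = split_decorator(":12:34")
print(persistent, value)        # True 12:34
print(producer_fields(value))   # ['12', '34']
```

The resolver sees a single subsegment whose value is "12:34". Only the
producer that minted the identifier knows that the inner colon separates
two fields.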
I believe this is a very significant benefit of not defining colon as a
delimiter. Given the large number of URI reserved characters that we
have defined as scheme-specific delimiters in XRI syntax, very few
characters are left for use as delimiters by producer-specific
algorithms. I believe colon is a particularly attractive character for
this purpose (second only to dot).

I already posted (about a month ago) an example of one potential
producer-specific algorithm (in this case for XRI authority subsegments)
in which it would be attractive to use colons. I can only imagine that
there are many more.

The other advantage is that this preserves backward compatibility with
XRI 1.0 XRIs, because the colons that appear in these as scheme-specific
delimiters under XRI 1.0 syntax would still be legal as
producer-specific delimiters under XRI 1.1. The only difference is how
colons in the XRI authority segment would be interpreted by XRI 1.1
resolvers.
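
As a rough sketch of that difference, the Python below reads the same
string both ways. Both functions are hypothetical simplifications: they
look only at colons and ignore every other XRI delimiter, so they
illustrate the compatibility argument rather than implement either
version of the syntax.

```python
import re


def read_as_xri10(s: str):
    # Assumed XRI 1.0 reading: every ':' opens a new subsegment.
    return re.findall(r":[^:]*", s)


def read_as_xri11(s: str):
    # Assumed XRI 1.1 reading: only the leading ':' is significant,
    # so the whole string is one subsegment.
    return [s]


print(read_as_xri10(":12:34"))   # [':12', ':34']
print(read_as_xri11(":12:34"))   # [':12:34']
```

The string is accepted under both readings. What changes is whether the
second colon implies a separation between two parts.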
=Drummond
The relevant section from
http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html#reserved
is quoted below:
2.2 Reserved Characters

URIs include components and subcomponents that are delimited by
characters in the "reserved" set. These characters are called "reserved"
because they may (or may not) be defined as delimiters by the generic
syntax, by each scheme-specific syntax, or by the
implementation-specific syntax of a URI's dereferencing algorithm. If
data for a URI component would conflict with a reserved character's
purpose as a delimiter, then the conflicting data must be
percent-encoded before forming the URI.

   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting
characters that are distinguishable from other data within a URI. URIs
that differ in the replacement of a reserved character with its
corresponding percent-encoded octet are not equivalent. Percent-encoding
a reserved character, or decoding a percent-encoded octet that
corresponds to a reserved character, will change how the URI is
interpreted by most applications. Thus, characters in the reserved set
are protected from normalization and are therefore safe to be used by
scheme-specific and producer-specific algorithms for delimiting data
subcomponents within a URI.

A subset of the reserved characters (gen-delims) are used as delimiters
of the generic URI components described in Section 3. A component's ABNF
syntax rule will not use the reserved or gen-delims rule names directly;
instead, each syntax rule lists the characters allowed within that
component (i.e., not delimiting it) and any of those characters that are
also in the reserved set are "reserved" for use as subcomponent
delimiters within the component. Only the most common subcomponents are
defined by this specification; other subcomponents may be defined by a
URI scheme's specification, or by the implementation-specific syntax of
a URI's dereferencing algorithm, provided that such subcomponents are
delimited by characters in the reserved set allowed within that
component.

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set. However, if a reserved
character is found in a URI component and no delimiting role is known
for that character, then it should be interpreted as representing the
data octet corresponding to that character's encoding in US-ASCII.
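
The percent-encoding rule quoted above can be sketched with Python's
standard urllib.parse module. The helpers join_fields and split_fields
are hypothetical, and the choice of colon as the producer's field
delimiter is an assumption made for illustration.

```python
from urllib.parse import quote, unquote

GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
RESERVED = GEN_DELIMS + SUB_DELIMS


def join_fields(fields):
    # Percent-encode each field so a literal ':' inside the data cannot
    # be mistaken for the (assumed) producer-specific ':' delimiter.
    return ":".join(quote(f, safe="") for f in fields)


def split_fields(s):
    # Split on the unencoded delimiter first, then decode each field.
    return [unquote(part) for part in s.split(":")]


encoded = join_fields(["a:b", "c"])
print(encoded)                 # a%3Ab:c
print(split_fields(encoded))   # ['a:b', 'c']

# With safe="", quote() encodes every character in the reserved set.
print(all(quote(ch, safe="") != ch for ch in RESERVED))   # True
```

Because "a%3Ab:c" and "a:b:c" are not equivalent URIs text, the encoded
colon survives as data while the bare colon keeps its delimiting role.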
--- Dave McAlpin <Dave.McAlpin@epok.net> wrote:
> Are you suggesting that the : between 12 and 34 would be considered a
> regular character, not a delimiter? If so, I don't think that's legal
> per 2396bis.
>
> Dave
>
> > -----Original Message-----
> > From: Fen Labalme [mailto:fen@idcommons.org]
> > Sent: Thursday, July 08, 2004 11:03 AM
> > To: Loren West
> > Cc: xri@lists.oasis-open.org
> > Subject: Re: [xri] Issue 1: Clarifying * Semantics
> >
> > Loren -
> >
> > Note that :12:34 would still be a legal persistent identifier, it
> > just would not imply a separation (or delegation) between two
> > parts. In other words, it is similar to the identifier :12.34
> > (using the new semantics for dot as a regular character).
> >
> > In my strongly held opinion, if we are going to make any
> > simplifications, they should be aimed at making the semantics
> > easier to understand and the human friendly identifiers simpler and
> > easier to read and (humanly) parse. I believe that is what this
> > proposed simplification does. If it does so at a slight cost to the
> > human readability of non-human (machine) friendly identifiers,
> > that's a good decision.
> >
> > Fen
> >
> > Loren West wrote:
> > > I understand how you see a single separator as a simplification,
> > > and hope you can understand how I see ":" as a simplification
> > > over "*:". They're both "simpler", but one doesn't require
> > > a change to the specification.
> >
> > To unsubscribe from this mailing list (and be removed from the
> > roster of the OASIS TC), go to
> > http://www.oasis-open.org/apps/org/workgroup/xri/members/leave_workgroup.php.