Subject: RE: [xri] Empty paths
Wil, see ### inline.

From: Tan, William [mailto:William.Tan@neustar.biz]

Thanks Drummond. Maybe I need to ask a more fundamental question: is XRI a scheme when viewed from the URI/IRI level? I guess it is, since when we translate an XRI to an IRI, the scheme will always be “xri”. In this case, we are defining our own scheme-based normalization rules, right?

### Yes. In the conversion to IRI normal form, the scheme “xri:” MUST be added. And yes, at that point, we are defining our own scheme-specific normalization rules, although to the greatest extent possible, we should use the rules established by other widely-used schemes. ###

A more concrete problem I have is that I have a function that returns the IRI-normal form of an XRI reference. According to the definition of IRI-normal form, we do not have to perform any normalization such as removing empty path segments or removing optional delimiters. The current OpenXRI code does, and I’m not sure if I should modify it to only do the slash and percent sign escaping.

### I don’t know if there is any hard-and-fast rule about this, but I lean towards believing the current OpenXRI code does the right thing, i.e., a “toIRINormalForm” function SHOULD perform the normalization recommended in XRI Syntax 2.0 by removing empty path segments and optional delimiters. ###

### This strikes me as especially true if we want this code to become the de facto standard reference implementation. This will help ensure consistent XRI normalization for the many different applications that may come to rely on it. ###

### Gabe, others: how do you feel? ###

=Drummond

From: Drummond Reed
[mailto:drummond.reed@cordance.net]

Wil, great questions, see ### inline.

From: Tan, William [mailto:William.Tan@neustar.biz]

Hi all,

While implementing the syntax parser in OpenXRI, I have stumbled upon an issue with empty paths and canonical XRIs.

Firstly, from what I read in the specs (section 2.2.5 Canonicalization), “XRI references do not have a single canonical form”. In which case, I believe that converting an XRI reference to IRI normal form should NOT muck with the optional reassignable subsegment delimiter. It also follows that we should try to preserve empty path segments as much as possible in the parser.

### Agreed. Empty path segments were explicitly allowed in the Committee Specification of XRI Syntax 2.0 in order to match the same behavior in URIs.

I am having trouble with the “correct” parser behavior when it encounters an empty path, such as xri://@foo/. When it comes to comparison, should we treat xri://@foo and xri://@foo/ as the same? What about xri://@foo/ and xri://@foo// (double slashes)?

### I think we should follow the guidance in section 6.2.3 of RFC 3986 in that regard (quoted below). It treats a single empty segment as equivalent to no empty segment, i.e., xri://@foo and xri://@foo/ are the same. However, a double empty segment is NOT equivalent to a single empty segment, i.e., xri://@foo/ and xri://@foo// are NOT the same.

Hope this helps,

=Drummond

6.2.3. Scheme-Based Normalization
The syntax and semantics of URIs vary from scheme to scheme, as described by the defining specification for each scheme. Implementations may use scheme-specific rules, at further processing cost, to reduce the probability of false negatives. For example, because the "http" scheme makes use of an authority component, has a default port of "80", and defines an empty path to be equivalent to "/", the following four URIs are equivalent:
   http://example.com
   http://example.com/
   http://example.com:/
   http://example.com:80/
In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/". Likewise, an explicit ":port", for which the port is empty or the default for the scheme, is equivalent to one where the port and its ":" delimiter are elided and thus should be removed by scheme-based normalization. For example, the second URI above is the normal form for the "http" scheme.
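
For illustration only, here is a minimal Python sketch of the comparison rule discussed in this thread: an absent path is treated as equivalent to "/", while a doubled slash is left alone. The function names `normalize_empty_path` and `same_xri` are hypothetical and are not part of OpenXRI. The sketch also assumes an absolute XRI beginning with "xri://" whose authority contains no cross-reference with an embedded "/", and it ignores every other normalization step (case, percent-encoding, optional delimiters).

```python
import re

# Hypothetical helper, not OpenXRI API: splits "xri://" + authority + path + tail.
# Assumes the authority holds no cross-reference containing "/" (a simplification).
_XRI_PARTS = re.compile(r"^(xri://)([^/?#]*)([^?#]*)(.*)$")


def normalize_empty_path(xri: str) -> str:
    """Rewrite an absent path as "/" and leave every other path untouched."""
    match = _XRI_PARTS.match(xri)
    if match is None:
        raise ValueError("expected an absolute XRI starting with xri://")
    scheme, authority, path, tail = match.groups()
    if path == "":
        # Empty path is treated as equivalent to "/", following RFC 3986 6.2.3.
        path = "/"
    return scheme + authority + path + tail


def same_xri(a: str, b: str) -> bool:
    """Compare two XRIs after applying only the empty-path rule above."""
    return normalize_empty_path(a) == normalize_empty_path(b)
```

Under these assumptions, `same_xri("xri://@foo", "xri://@foo/")` holds, while `same_xri("xri://@foo/", "xri://@foo//")` does not, because the second empty segment is preserved.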