OASIS Mailing List Archives

xri-editors message

Subject: RE: [xri-editors] XRIs and canonical form


Dave-
	Well, that *could* be the rule, and I think it *should* be the rule. But I wasn't sure, since we don't talk about that in the resolution section, and since we don't have a "canonical" form for resolution now. 

	On the other hand, from a "writing the text of the spec" point of view, it's somewhat easier to be silent on the whole issue and force servers to do canonicalization (which would include removing the insignificant x-refs).

	It's a fork in the road:
option a: Resolver servers can expect fully XRI-canonicalized XRIs, which requires us to specify canonicalization rules. This would allow super-simple resolvers (maybe even just plain old HTTP servers with a sprinkle of configuration but no XRI-specific code).

option b: Resolver servers have to do whatever canonicalization they need internally, and there is no guarantee that what gets resolved is canonicalized. This makes the spec simpler, I think, but makes server implementations a little more complicated.
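
To make the tradeoff concrete, here is a minimal Python sketch of the two options. Everything in it is hypothetical: the `REGISTRY` table, the record string, and the function names are illustrative only, and the option b equivalence applies just two of the draft guidelines (add the optional scheme, lowercase the scheme and authority), not anything the spec defines.

```python
# Hypothetical table of registered identifiers, keyed by canonical form.
REGISTRY = {"xri:@example": "record-for-example"}


def resolve_option_a(xri):
    """Option a: plain exact-match lookup; the client must canonicalize."""
    return REGISTRY.get(xri)


def server_equivalence_key(xri):
    """Option b helper: the server applies equivalence rules itself.

    Assumption: only two rules are applied here (add the optional scheme,
    lowercase the scheme and authority); a real server would need more.
    """
    body = xri[4:] if xri[:4].lower() == "xri:" else xri
    authority, slash, path = body.partition("/")
    return "xri:" + authority.lower() + slash + path


def resolve_option_b(xri):
    """Option b: normalize on the server, then look up."""
    return REGISTRY.get(server_equivalence_key(xri))


print(resolve_option_a("xri:@Example"))  # None: not sent in canonical form
print(resolve_option_b("xri:@Example"))  # record-for-example
```

Under option a the server side can be as simple as a static lookup, which is the "plain old HTTP server" case; under option b every server carries something like `server_equivalence_key`.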

	We just need to make a choice, and it's sounding like you are suggesting option b. I'm slightly leaning toward a, but only slightly, and I don't really care all that much so long as we are explicit about the choice. I'm perfectly willing to let you make the decision, Dave - just as long as you are aware of the tradeoff I see.

	Several points:

1) I don't know if this would be a big effect, but more normalization *before* resolution would make caches more effective, I would think.

2) We can be silent on this and then encourage implementers to "canonicalize" before sending XRIs on the wire during resolution. This would recover much of the caching benefit from point #1.
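
The cache effect in point 1 can be sketched as follows. This is a rough Python illustration, not spec text: `canonicalize` applies the guidelines from Dave's draft (quoted below) in simplified form, cross-references are not parsed at all, and the unreserved-character set is borrowed from generic URI syntax as an assumption. `CachingResolver` and its `backend` callable are hypothetical names.

```python
import re

# Assumption: characters that never need escaping, per generic URI syntax.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def _fix_escape(match):
    """Decode an unnecessary escape; otherwise uppercase its hex digits."""
    char = chr(int(match.group(1), 16))
    return char if char in UNRESERVED else "%" + match.group(1).upper()


def _normalize_escapes(text):
    return _ESCAPE.sub(_fix_escape, text)


def canonicalize(xri):
    """Apply the draft guidelines, ignoring cross-references."""
    # Add the optional scheme and force it to lowercase.
    body = xri[4:] if xri[:4].lower() == "xri:" else xri
    authority, slash, path = body.partition("/")
    # Lowercase the authority; the second pass restores uppercase hex digits.
    authority = _normalize_escapes(_normalize_escapes(authority).lower())
    raw_segments = _normalize_escapes(path).split("/") if slash else []
    segments = []
    for segment in raw_segments:
        if segment == ".":
            continue  # drop /./
        if segment == "..":
            if segments:
                segments.pop()  # resolve /../
            continue
        # Omit the optional leading dot.
        segments.append(segment[1:] if segment.startswith(".") else segment)
    return "xri:" + authority + slash + "/".join(segments)


class CachingResolver:
    """Hypothetical client-side cache keyed by the canonical form."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}
        self.backend_calls = 0

    def resolve(self, xri):
        key = canonicalize(xri)
        if key not in self.cache:
            self.backend_calls += 1
            self.cache[key] = self.backend(key)
        return self.cache[key]


resolver = CachingResolver(lambda key: "record for " + key)
for spelling in ("@example", "XRI:@example", "xri:@Example", "xri:@ex%61mple"):
    resolver.resolve(spelling)
print(resolver.backend_calls)  # 1: all four spellings share one cache entry
```

Without the `canonicalize` call, the same four spellings would occupy four cache entries and trigger four backend lookups.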

	-Gabe

> -----Original Message-----
> From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
> Sent: Wednesday, November 12, 2003 1:41 PM
> To: 'Wachob, Gabe'; xri-editors@lists.oasis-open.org
> Subject: RE: [xri-editors] XRIs and canonical form
> 
> 
> If you don't want servers to deal with insignificant xrefs, why couldn't
> the rule be that clients shouldn't pass them into a resolver?
>
> As to a server's responsibility, I think it should have at least some
> notion of equivalence. It seems wrong to me that something like
> xri:@Example is unresolvable if the client fails to convert it to
> @example.
> 
> Dave
> 
> > -----Original Message-----
> > From: Wachob, Gabe [mailto:gwachob@visa.com]
> > Sent: Wednesday, November 12, 2003 1:09 PM
> > To: Wachob, Gabe; 'Dave McAlpin'; 'xri-editors@lists.oasis-open.org'
> > Subject: RE: [xri-editors] XRIs and canonical form
> > 
> > The reason I say this is that we need to work on the resolution section
> > to specifically say that these cross references must be removed before
> > resolution.
> >
> > It'd be nice to be able to say that identifiers MUST (SHOULD) be in the
> > "normalized" form. With the wording here, there is no such form, so I
> > think we'll end up wanting to put in specific text for each of these
> > rules. We already have specific text to deal with the "optional
> > segment-leading ."
> >
> > I was hoping we could say there is a "maximally canonicalized form" that
> > we could refer to. As I said before, if we don't define a particular form
> > for resolution, then XRI local access and naming authority servers are
> > going to have to do canonicalization along with resolvers. With a single
> > canonicalized (mostly) form, only the resolvers have to worry about
> > canonicalization.
> >
> > I was *hoping* we'd define resolution only w/r/t the "canonical" form (so
> > servers don't have to do canonicalization), but maybe we are explicitly
> > deciding that servers will have to do canonicalization.
> > 
> > 	-Gabe
> > 
> > > -----Original Message-----
> > > From: Wachob, Gabe
> > > Sent: Wednesday, November 12, 2003 1:04 PM
> > > To: 'Dave McAlpin'; Wachob, Gabe; xri-editors@lists.oasis-open.org
> > > Subject: RE: [xri-editors] XRIs and canonical form
> > >
> > >
> > > What about removing $! and ! cross-references?
> > >
> > > 	-Gabe
> > >
> > > > -----Original Message-----
> > > > From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
> > > > Sent: Wednesday, November 12, 2003 11:18 AM
> > > > To: 'Wachob, Gabe'; xri-editors@lists.oasis-open.org
> > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > >
> > > >
> > > > I'll clean it up before I incorporate it into the spec, but I just
> > > > wanted to see how people felt about the kind of non-normative,
> > > > guidance tone this takes.
> > > >
> > > > In general, XRIs do not have a single canonical form. This is
> > > > especially true for XRIs that contain non-XRI cross-references, since
> > > > many URI schemes do not define a canonical form. HTTP URIs, for
> > > > example, do not have a single canonical form. It is therefore true by
> > > > definition that an XRI with a cross-reference containing an HTTP URI
> > > > does not have a single canonical form. With that said, it is valuable
> > > > to define guidelines for making XRIs reasonably canonical. XRIs that
> > > > follow these guidelines will be more consistent in presentation,
> > > > simpler to process and less prone to false-negative comparisons.
> > > >
> > > > Absent a compelling reason to do otherwise, those who generate or
> > > > reference XRIs should provide them in a form in which
> > > >
> > > > - The optional xri scheme is added
> > > > - The scheme is provided in lowercase
> > > > - The optional leading dot in xri-segments is omitted
> > > > - Percent-escaping uses uppercase A through F
> > > > - The authority component is in lowercase
> > > > - Unnecessary escaping is removed
> > > > - /./ and /../ are absent in absolute XRIs
> > > > - Cross-references are reasonably canonical with respect to their
> > > >   schemes
> > > > Examples
> > > >
> > > > Non-canonical        Canonical          Comment
> > > > @example             xri:@example       Add optional scheme
> > > > XRI:@example         xri:@example       Lowercase scheme
> > > > xri:@example/.abc    xri:@example/abc   Remove optional leading dot
> > > > xri:@example%2f      xri:@example%2F    Uppercase when percent escaping
> > > > xri:@Example         xri:@example       Lowercase authority
> > > > xri:@ex%61mple       xri:@example       Remove unnecessary escaping
> > > > xri:@example/./abc   xri:@example/abc   Resolve /./ and /../
> > > >
> > > > > -----Original Message-----
> > > > > From: Wachob, Gabe [mailto:gwachob@visa.com]
> > > > > Sent: Tuesday, November 11, 2003 12:47 PM
> > > > > To: 'Dave McAlpin'; Wachob, Gabe; 
> xri-editors@lists.oasis-open.org
> > > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > > >
> > > > > The argument against requiring canonical form for resolution is
> > > > > that it puts the canonicalization requirement on the client
> > > > > (instead of the server).
> > > > >
> > > > > So I guess I'm ok with some non-normative text, if you feel that's
> > > > > the best way. But I would at least suggest that by making it
> > > > > non-normative, we would have to mention in resolution that a local
> > > > > access server (and a naming authority server, perhaps) has to
> > > > > implement either canonicalization or equivalence rules locally (on
> > > > > the server side). Clients (resolvers) will *also* have to do this if
> > > > > they ever want to compare XRIs.
> > > > >
> > > > > 	-Gabe
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
> > > > > > Sent: Tuesday, November 11, 2003 12:14 PM
> > > > > > To: 'Wachob, Gabe'; xri-editors@lists.oasis-open.org
> > > > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > > > >
> > > > > >
> > > > > > Right, I just want to make sure we're comfortable tackling
> > > > > > something that others have chosen not to define. HTTP 1.0
> > > > > > (RFC1945) provided a simple definition of canonical form but it
> > > > > > was removed in 2068, apparently after some thought and
> > > > > > discussion. Masinter's comment below is an argument not to
> > > > > > reintroduce the concept in 2616.
> > > > > >
> > > > > > 2396 avoided the issue entirely and 2396bis only provides general,
> > > > > > non-normative guidance.
> > > > > >
> > > > > > All those specs chose to focus on equivalence rules rather than a
> > > > > > canonical form, which is exactly what we've done up until now. I'm
> > > > > > just asking if we're really sure about this before I start working
> > > > > > on it.
> > > > > >
> > > > > > Dave
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Wachob, Gabe [mailto:gwachob@visa.com]
> > > > > > > Sent: Tuesday, November 11, 2003 11:32 AM
> > > > > > > To: 'Dave McAlpin'; Wachob, Gabe;
> > > > xri-editors@lists.oasis-open.org
> > > > > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > > > > >
> > > > > > > While this is true (about canonicalization of cross references),
> > > > > > > I think it's perfectly reasonable to talk about canonicalization
> > > > > > > of those parts we have control over (i.e., XRI-defined syntax).
> > > > > > > Perhaps "canonical" is too strong a word.
> > > > > > >
> > > > > > > I suspect that the vast majority of XRIs will not contain cross
> > > > > > > references containing other URI schemes. And in those cases where
> > > > > > > they do, maybe we'll have to live with the fact that there is not
> > > > > > > one canonical form.
> > > > > > >
> > > > > > > Maybe we call it "XRI canonicalized URI form" to suggest that it's
> > > > > > > only canonicalized as far as XRI syntax goes.
> > > > > > >
> > > > > > > 	-Gabe
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
> > > > > > > > Sent: Tuesday, November 11, 2003 11:06 AM
> > > > > > > > To: 'Wachob, Gabe'; xri-editors@lists.oasis-open.org
> > > > > > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > > > > > >
> > > > > > > >
> > > > > > > > It's interesting that neither 2396 nor 2616 defined a canonical
> > > > > > > > form. 2396bis defines some "good practices" for making URIs
> > > > > > > > "reasonably canonical", but they don't attempt anything
> > > > > > > > normative. The following post by Larry Masinter is instructive.
> > > > > > > >
> > > > > > > > > In general, URLs do _not_ have a canonical form. However, HTTP
> > > > > > > > > defines some equivalences for URLs (e.g., that http://host is
> > > > > > > > > equivalent to http://host/, and by using the generic
> > > > > > > > > syntax for host names, the host part is case insensitive).
> > > > > > > > >
> > > > > > > > > Some particular HTTP servers MAY define other equivalences,
> > > > > > > > > e.g., that http://host/dir is equivalent to http://host/dir/
> > > > > > > > > and to http://host/dir/index.html.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Given that URIs don't have a normative canonical form, it's hard
> > > > > > > > to see how we can define a canonical form for XRIs that contain
> > > > > > > > cross-references.
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Wachob, Gabe [mailto:gwachob@visa.com]
> > > > > > > > > Sent: Tuesday, November 11, 2003 10:58 AM
> > > > > > > > > To: 'Dave McAlpin'; xri-editors@lists.oasis-open.org
> > > > > > > > > Subject: RE: [xri-editors] XRIs and canonical form
> > > > > > > > >
> > > > > > > > > I think canonical form is sort of an arbitrary, but well
> > > > > > > > > understood, "state" of an identifier.
> > > > > > > > >
> > > > > > > > > When an identifier is in canonical form, it should be possible
> > > > > > > > > to compare it with another identifier in canonical form and the
> > > > > > > > > process of comparing the two character-by-character (or in the
> > > > > > > > > case of canonicalized URIs, byte for byte) is exactly the
> > > > > > > > > process of applying the built-in equivalence rules in the XRI
> > > > > > > > > spec.
> > > > > > > > >
> > > > > > > > > Does this make sense? I mentioned the leading-. issue, the $!
> > > > > > > > > and ! cross references. One other thing that would be useful to
> > > > > > > > > describe for canonicalization is the uppercasing of %HH (hex
> > > > > > > > > digit).
> > > > > > > > >
> > > > > > > > > If we define resolution to operate only on canonicalized forms
> > > > > > > > > of identifiers, it potentially makes the deployment of XRI local
> > > > > > > > > access servers MUCH simpler as they don't have to apply any of
> > > > > > > > > the "built-in" equivalence rules themselves. They just have to
> > > > > > > > > make sure that they resolve the one canonical form...
> > > > > > > > >
> > > > > > > > > 	-Gabe
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
> > > > > > > > > > Sent: Tuesday, November 11, 2003 10:50 AM
> > > > > > > > > > To: xri-editors@lists.oasis-open.org
> > > > > > > > > > Subject: [xri-editors] XRIs and canonical form
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I've been asked to draft text specifying a "canonical form"
> > > > > > > > > > for XRIs. I wanted to start by understanding what canonical
> > > > > > > > > > form meant for URIs in general, and in searching the web I
> > > > > > > > > > came across the following exchange. The initial question is
> > > > > > > > > > from Terence Spielman of Visa, followed by Gabe's and my
> > > > > > > > > > responses. Just interesting that we've considered this
> > > > > > > > > > question before.
> > > > > > > > > >
> > > > > > > > > > Dave
> > > > > > > > > >
> > > > > > > > > > >>>In addition, aside from unresolvable references, is it
> > > > > > > > > > >>>possible to canonicalize XRIs?  This is a highly desirable
> > > > > > > > > > >>>feature (for equivalence, at a minimum).
> > > > > > > > > >
> > > > > > > > > > >>We talked quite a bit about this. The decision was made to
> > > > > > > > > > >>be silent on canonicalization because equivalence is
> > > > > > > > > > >>actually unambiguous given the rules stated. Now, that
> > > > > > > > > > >>doesn't mean that it's at all obvious.
> > > > > > > > > > >>
> > > > > > > > > > >>I do think giving names to the escaped vs. unescaped forms
> > > > > > > > > > >>of XRI, at least, would be useful.  Canonicalization would
> > > > > > > > > > >>then just be transforming an identifier into one of those
> > > > > > > > > > >>forms. We didn't want to mandate a single canonical form
> > > > > > > > > > >>because different environments would need XRIs in different
> > > > > > > > > > >>levels of escaping and it would be unfortunate to require a
> > > > > > > > > > >>specific canonicalization form that would require
> > > > > > > > > > >>otherwise-unneeded transformation.
> > > > > > > > > > >>
> > > > > > > > > > >>Again, Dave McAlpin probably has better input on this.
> > > > > > > > > >
> > > > > > > > > > >A canonical representation might be useful for comparison,
> > > > > > > > > > >but it would involve a formal definition of things like
> > > > > > > > > > >"minimally escaped", which would be fairly difficult to nail
> > > > > > > > > > >down. It would also depend on the existence of a canonical
> > > > > > > > > > >form for URIs used as cross-references. In other words, an
> > > > > > > > > > >XRI wouldn't have a canonical form if it contained
> > > > > > > > > > >cross-references that didn't define a canonical form.
> > > > > > > > > > >
> > > > > > > > > > >Note that equivalence rules are generally problematic. The
> > > > > > > > > > >IRI proposal, for example, completely dodges the question of
> > > > > > > > > > >equivalence when it says, "There is no general rule or
> > > > > > > > > > >procedure to decide whether two arbitrary IRIs are equivalent
> > > > > > > > > > >or not... Each specification or application that uses IRIs
> > > > > > > > > > >has to decide on the appropriate criterion for IRI
> > > > > > > > > > >equivalence." 2396bis notes that even terms like "different"
> > > > > > > > > > >and "equivalent" are fuzzy in the general spec and ultimately
> > > > > > > > > > >application dependent.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > To unsubscribe from this mailing list (and be removed from
> > > > > > > > > > the roster of the OASIS TC), go to
> > > > > > > > > > http://www.oasis-open.org/apps/org/workgroup/xri-editors/members/leave_workgroup.php.