xri message

Subject: Re: [xri] RE: #next: alternate to #replace
From: Breno de Medeiros <breno@google.com>
To: Eran Hammer-Lahav <eran@hueniverse.com>
Date: Thu, 17 Sep 2009 21:57:48 -0700

Okay, I see the metaphor that you are describing, but XML parsers are
not written as compilers.

My point is that since we potentially have many XRD choices for a
resource (depending on the starting point), would it be a deal breaker
to have discovery potentially generate multiple independent XRDs prior
to a final selection procedure? I guess the answer is no, as long as
the result is unambiguous after the selection procedure.

The processing rule for <Link> could fix the order of preference so
that one always picks the first XRD that matches a desirable 'rel'
type, e.g., by depth-first search.

For <Link>-independent <Type> discovery, only the root XRD would be used.
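
As a rough illustration of the two rules above, here is a minimal sketch. Everything in it is an assumption made for the example: the in-memory model (each XRD as a dict with 'types' and document-ordered 'links'), the helper names select_link and root_types, and the example URIs. None of it comes from the XRD draft.

```python
SEE_ALSO = "#see-also"  # hypothetical rel value for XRD-to-XRD links
FEED = "http://example.com/rel/feed"  # made-up rel type for the example


def select_link(xrds, root_id, wanted_rel):
    """Depth-first, document-order search for the first matching link.

    Returns (xrd_id, href) for the first link whose rel equals wanted_rel,
    or None if no reachable XRD has one.
    """
    visited = set()

    def visit(xrd_id):
        # Skip XRDs already seen (cycles) and ones that cannot be resolved.
        if xrd_id in visited or xrd_id not in xrds:
            return None
        visited.add(xrd_id)
        for rel, href in xrds[xrd_id]["links"]:
            if rel == wanted_rel:
                return (xrd_id, href)
            if rel == SEE_ALSO:
                found = visit(href)  # descend before looking at later links
                if found is not None:
                    return found
        return None

    return visit(root_id)


def root_types(xrds, root_id):
    """Link-independent Type discovery: only the root XRD is consulted."""
    return list(xrds[root_id]["types"])


XRDS = {
    "root": {
        "types": ["http://example.com/type/blog"],
        "links": [(SEE_ALSO, "hosted"), (FEED, "http://blog.example.com/feed")],
    },
    "hosted": {
        "types": ["http://example.com/type/hosted"],
        "links": [(FEED, "http://host.example.net/feed"), (SEE_ALSO, "root")],
    },
}

chosen = select_link(XRDS, "root", FEED)
types = root_types(XRDS, "root")
```

Because the #see-also link comes first in the root, the hosted XRD's feed link is the one selected, so the fixed order of preference is just document order plus depth-first descent. The Type list still comes from the root alone.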

On Thu, Sep 17, 2009 at 9:41 PM, Eran Hammer-Lahav <eran@hueniverse.com> wrote:
> I don't think the fact that a consumer can find multiple descriptors for a single resource matters much. It can find descriptors using multiple links, multiple discovery protocols, and even descriptors in multiple formats. There is a lot that can be said about resources.
>
> When it comes to XRD, a consumer always starts with a single XRD document. XRD has no opinion about what a client should do when it encounters multiple 'describedby' links. That is up to applications to decide. Trying to create a world in which you "first gather everything you know about a resource into a big complete collection" is impractical.
>
> But within a single XRD root document, when following a single 'describedby' link, consumers must get consistent results when asking the same question. That is an absolute requirement. In other words, if different consumers ask the same XRD the same question, the answer must be the same. How you obtain the first XRD matters, but it doesn't change the answer (it might make the answer 'trusted' or 'not trusted', but it is still the very same answer).
>
> An XRD is a data set which consumers can query for data. The two main queries you make against this data set are:
>
> 1. Give me the list of all resource attributes in the order they appear.
> 2. Give me all the linked resources that match given criteria, in the order they appear.
>
> XRD processing is limited to the processing of information found by starting with a single XRD document. How you consolidate information gathered from multiple XRDs or discovery sources is completely out of scope. What is also out of scope is what a consumer should do when an XRD contains another 'describedby' link to another XRD. From a process standpoint, this is exactly the same as finding another XRD elsewhere for the same resource.
>
> The reason why we are discussing including some other processing directives such as #see-also or #replace is because we have a clear use case for managing the deployment of a single root XRD across multiple documents. From the perspective of the consumer obtaining the root XRD, it is a single document. Otherwise I agree with you that we end up with a mess because we have an inconsistent model of when we take into account other descriptors and when we don't, and it doesn't add up to the same view.
>
> When a C compiler is processing project files, it usually comes across the same include file many times. But as far as the compiler is concerned, it is working with a single document at a time. Include directives (which are often conditional) are used to construct the actual source code the compiler sees. C include files are in no way atomic units. Only the root document after fully expanding all includes is the atomic unit used to create an object file.
>
> This is the right metaphor for XRD. You start with a single XRD and a single Subject and whatever else you drag in there from other XRD documents must be viewed as part of that root XRD. If you consider the importance of document order, it is clear that the order in which an XRD is included changes its meaning. So again, this has nothing to do with trying to build a complete view of resource descriptors, but still only a single view, that of the root XRD.
>
> I don't think we are likely to solve the problem of consolidating multiple descriptors for a single document anytime soon. This comes up daily when discussing link relations, how to pick links when more than one is present, how to pick links when no link provides the perfect combination of attributes, how to handle links in multiple transports, and what to do when the information contains conflicts. This is all very early and we must let individual protocols solve this.
>
> What we can and must do is make sure that within a single thread of discovery, that is, a single XRD root document, we have clear and predictable processing rules. That is the only problem we are tasked to solve.
>
> The use case for spreading a resource descriptor over multiple XRD documents (with a single root) is clear. The question is now finding the right balance between implementation complexity and deployment flexibility. This is not a new challenge. Almost everything we do here is dealing directly with that (see signatures, trust, priority, etc.).
>
> EHL
>
>
>
>> -----Original Message-----
>> From: Breno de Medeiros [mailto:breno@google.com]
>> Sent: Thursday, September 17, 2009 9:00 PM
>> To: Eran Hammer-Lahav
>> Cc: Drummond Reed; XRI TC
>> Subject: Re: [xri] RE: #next: alternate to #replace
>>
>> Okay, I had some more time to reflect about this.
>>
>> I am now of the opinion that incorporating <Type> from more than one
>> XRD is either wrong or at least not robust. The reason is that we have
>> written a spec that includes elements such as <Subject> which allows
>> an XRD for a resource to be resolvable from more than one location and
>> talk about the same resource. A corollary is that the meaning of an
>> XRD document should _not_ depend on the flow used to discover it.
>> There may be several flows leading to it, it should not matter.
>>
>> You can say, what about trust? Doesn't it depend on the flow? Yes, it
>> does, but trust _is_ a property of the flow, not of the resource.
>> Besides, it has only one of a small set of restricted values.
>>
>> That means that the last XRD found to describe a resource must be
>> atomic. It can't depend on earlier XRDs. Any type of include would
>> violate this assumption which I thought had been fairly fundamental in
>> how we designed this whole architecture.
>>
>> My conclusion is that <Type> cannot flow from more than one XRD.
>> Neither can Links for that matter. XRDs are independent things, like
>> resources all unto themselves. That is a consequence of their being
>> flow-independent.
>>
>> Taking that as a start point, the outcome of processing a #see-also is
>> that XRD discovery would produce _two_ independent XRD documents. A
>> client could pick and choose the one XRD that satisfies its needs, but
>> could not merge or co-mingle them in any form.
>>
>> Of course, if we are willing to accept that a resource can have two
>> distinct XRDs, then by subsequent branching any number is already
>> possible, which means that no simplification would be achieved by
>> restricting #see-also to a single occurrence.
>>
>> I am not sure allowing multiple XRDs to describe a single resource is
>> too far a departure either. Fully embracing the concept that XRD is
>> flow-independent means that there may be more than one XRD for a
>> resource in any case, discoverable through different routes.
>>
>> Am I wrong, or is XRD discovery a schizophrenic where resources have
>> multiple-personality disorder?
>>
>> a #replace semantics that means, trash this XRD, take that one.
>>
>> a multiple value #see-also directive that says: I am a schizophrenic
>> resource with multiple personalities. Pick one.
>>
>> If we prohibit #see-also we make processing easier, but the resources
>> are still potentially schizophrenic because XRDs can pop up in
>> unexpected places.
>>
>>
>>
>> On Thu, Sep 17, 2009 at 7:33 PM, Eran Hammer-Lahav
>> <eran@hueniverse.com> wrote:
>> > Forget about 'required' for a second. If that is the only issue we
>> > have I will be happy to drop it. The use case for it is that
>> > sometimes it is unsafe to interact with an endpoint if its attributes
>> > are not known. It is the only way to "break" clients from trying to
>> > talk to an endpoint they *think* they understand but that requires an
>> > additional add-on. This came up in OAuth discovery for requiring
>> > certain crypto extensions (without which it would not be secure
>> > enough). This is not a blocking use case.
>> >
>> > Back to the main issue.
>> >
>> > I don't see how this is not just an include. A client looking to
>> > learn about the resource attributes must still load and process each
>> > #next XRD. If I want to write a client that loads XRD information
>> > into memory, and then queries that information as needed later, I
>> > will need to load multiple XRD documents, process them, combine the
>> > information into another flat container, and then search that.
>> >
>> > If there is only one XRD, I can just use the XML document object as
>> > my memory store. As soon as I have attributes coming from multiple
>> > documents, the client gets more complex. It doesn't matter if it is
>> > two or two hundred chained XRDs.
>> >
>> > Most XRDs will be processed in three phases:
>> >
>> > 1. Load into memory
>> > 2. Review for known attributes (Type, XRD-level stuff)
>> > 3. Link selection
>> >
>> > #see-also and #next work well for #3 because the client *knows* what
>> > it is looking for and can stop as soon as it finds it. So it can load
>> > one document at a time, search it, and move on. They don't work for
>> > #2 without going through *all* the documents.
>> >
>> > The real question is how important is the include use case. If it is
>> > a critical feature, then the cost on clients doesn't really matter.
>> > It is just the cost of doing business.
>> >
>> > EHL
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: Drummond Reed [mailto:drummond.reed@cordance.net]
>> >> Sent: Thursday, September 17, 2009 6:40 PM
>> >> To: 'XRI TC'
>> >> Cc: Eran Hammer-Lahav
>> >> Subject: #next: alternate to #replace
>> >>
>> >> In today's (sparsely-attended) telecon we made a crucial decision to
>> >> remove #see-also links between XRDs in favor of #replace links.
>> >>
>> >> The rationale was simple: this is the only XRD linking model that
>> >> lets us keep the XML processing very simple: just load the XRD into
>> >> an XML parser, query its elements/attributes as needed, and get on
>> >> with your business. There is none of the "include" functionality
>> >> that #see-also implies. With #replace links, an XRD consumer simply
>> >> checks at the start of XRD processing to see if there is a #replace
>> >> link. If so, it replaces the current XRD. If not, it knows it has
>> >> only one XRD to look at.
>> >>
>> >> I've been cooking on it since the call, and I feel uneasy about this
>> >> -- not because it doesn't produce a cleaner, faster XRD processing
>> >> model, but because it prohibits what I believe will be a very common
>> >> XRD delegation pattern: having a "local" XRD that specifies certain
>> >> common links an author might want to have very tight control over,
>> >> but which then delegates to a "hosted" XRD to handle everything
>> >> else. Typical example: a blog owner that wants to control certain
>> >> resource properties and links from his/her own local blog XRD (for
>> >> instance the mgmt interface might be right inside WordPress), and
>> >> then delegates to an XRD host like Google or Yahoo for everything
>> >> else.
>> >>
>> >> #replace does not support that model.
>> >>
>> >> So here's an alternative I'll call #next. #next does NOT have
>> >> include semantics. Instead it's "goto" semantics, i.e., an XRD
>> >> consumer simply follows sequential processing of the elements in the
>> >> first XRD element if/until it encounters a #next link (it doesn't
>> >> even have to look ahead through the XML tree to see if there are any
>> >> #next links). Once it encounters a #next link, at that point it
>> >> stops and loads the linked XRD, confirms it is valid, and continues
>> >> processing from there (and DOES NOT ever backtrack again to the
>> >> previous XRD).
>> >>
>> >> The only thing the #next processing model requires is an adjustment
>> >> to processing of required Type elements. There are two options for
>> >> this:
>> >>
>> >> 1) Drop the "required" attribute on Type. What is the use case for
>> >> this? Should XRD discovery really halt if an XRD consumer does not
>> >> understand a particular Type? Or can this form of type-checking be
>> >> handled at the application layer?
>> >>
>> >> 2) If we keep the "required" attribute on Type, then make it apply
>> >> per-XRD. By this I mean that the XRD processor only looks at
>> >> required <Type> elements on a per-XRD basis. For example, if it does
>> >> not find a required <Type> element it does not understand in XRD #1,
>> >> it processes it. Then, if it encounters a #next link, it moves to
>> >> XRD #2, and applies the same rule. It stops only if it encounters an
>> >> XRD with a required <Type> element it does not understand.
>> >>
>> >> All this means is that if there is a required <Type> that applies to
>> >> a resource in all contexts, then it must appear in all XRDs that
>> >> describe the resource. That does not seem like an unreasonable
>> >> requirement for XRD providers at all, especially if that's all we
>> >> need to do to preserve the ability to delegate an XRD without giving
>> >> up all local control.
>> >>
>> >> In sum, #next preserves the ability to walk a directed graph of
>> >> XRDs, moving from each to the next, without having to completely
>> >> replace each one. Of course it also gives us #replace semantics if
>> >> the only link in the source XRD is a #next link.
>> >>
>> >> BTW, I agree with Eran that the rule should be: multiple #next links
>> >> are allowed but are only processed if the previous #next links fail
>> >> to produce a valid XRD. Otherwise they are ignored. This keeps it
>> >> simple.
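
To make the #next proposal concrete, here is a minimal sketch of option 2 (per-XRD required Type checks) combined with the fallback rule for multiple #next links. The element tuples, the fetch callback returning None for an XRD that cannot be loaded or validated, the walk function, and the example URIs are all assumptions for illustration. Continuing with the remaining elements after a failed #next is likewise an assumption, since the thread does not spell that out.

```python
NEXT = "#next"  # hypothetical rel value for the proposed link
BLOG = "http://example.com/type/blog"  # made-up Type URI


class RequiredTypeNotUnderstood(Exception):
    """Raised when an XRD carries a required Type the consumer does not know."""


def walk(fetch, root_id, understood_types):
    """Process elements sequentially, jumping on #next and never backtracking.

    fetch(xrd_id) returns a list of element tuples, or None if the XRD
    cannot be loaded or is not valid. Returns (xrd_id, element) pairs in
    the order they were processed.
    """
    elements = fetch(root_id)
    if elements is None:
        raise ValueError("root XRD could not be loaded")
    collected, visited, current = [], set(), root_id
    while True:
        visited.add(current)
        jumped = False
        for element in elements:
            if element[0] == "type":
                _, uri, required = element
                if required and uri not in understood_types:
                    raise RequiredTypeNotUnderstood(f"{uri} in {current}")
            elif element[0] == "link" and element[1] == NEXT:
                target = element[2]
                # A revisit counts as a failed #next so the walk terminates.
                loaded = None if target in visited else fetch(target)
                if loaded is None:
                    continue  # failed #next: a later #next may still be tried
                current, elements, jumped = target, loaded, True
                break  # goto: the rest of the previous XRD is never read
            collected.append((current, element))
        if not jumped:
            return collected


DOCS = {
    "local": [
        ("type", BLOG, True),
        ("link", "http://example.com/rel/admin", "http://blog.example.com/admin"),
        ("link", NEXT, "broken-host"),  # not in DOCS, so this #next fails
        ("link", NEXT, "hosted"),
        ("link", "http://example.com/rel/late", "http://blog.example.com/late"),
    ],
    "hosted": [
        ("type", BLOG, True),
        ("link", "http://example.com/rel/feed", "http://host.example.net/feed"),
    ],
}

processed = walk(DOCS.get, "local", {BLOG})
```

In this run the first #next fails, the second succeeds, and the link that follows it in the local XRD is never processed, which is the "goto" rather than "include" behavior. Because the required Type appears in both XRDs, a consumer that does not understand it halts at the first one.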
>> >>
>> >> Thoughts?
>> >>
>> >> =Drummond
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe from this mail list, you must leave the OASIS TC that
>> > generates this mail. Follow this link to all your TCs in OASIS at:
>> > https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>> >
>> >
>>
>>
>>
>> --
>> --Breno
>>
>> +1 (650) 214-1007 desk
>> +1 (408) 212-0135 (Grand Central)
>> MTV-41-3 : 383-A
>> PST (GMT-8) / PDT(GMT-7)
>



-- 
--Breno

+1 (650) 214-1007 desk
+1 (408) 212-0135 (Grand Central)
MTV-41-3 : 383-A
PST (GMT-8) / PDT(GMT-7)
Follow-Ups:
- RE: [xri] RE: #next: alternate to #replace
  - From: Eran Hammer-Lahav <eran@hueniverse.com>
References:
- Minutes: XRI TC Telecon 2-3PM PT Thursday 2009-09-17
  - From: "Drummond Reed" <drummond.reed@cordance.net>
- #next: alternate to #replace
  - From: "Drummond Reed" <drummond.reed@cordance.net>
- RE: #next: alternate to #replace
  - From: Eran Hammer-Lahav <eran@hueniverse.com>
- Re: [xri] RE: #next: alternate to #replace
  - From: Breno de Medeiros <breno@google.com>
- RE: [xri] RE: #next: alternate to #replace
  - From: Eran Hammer-Lahav <eran@hueniverse.com>