Subject: Re: [wsn] Subscribe and GetCurrentMessage
I'm assuming that defining a policy statement for this would include
defining a message format for the updates?

/Sam

On Mon, 2004-12-06 at 11:10 -0500, David Hull wrote:
> Inline is my $0.02, in lieu of my attending the concall:
>
> Peter Niblett wrote:
>
> > I think that we should avoid solutions that rely on sequence numbers. They
> > come under the heading of Reliable Messaging, and as such are out of scope.
> > A "policy" option in the Subscribe Request that says "send me the snapshot
> > first before any updates" seems by far the cleanest way of avoiding the
> > race condition
> >
> > Some questions for us to discuss on today's call:
> >
> > 1. Is this a good approach?
> >
> I'm fine with it.
> > 2. Should we standardise this policy, or leave it to be a vendor extension?
> > Leaving it as an extension simplifies our work.. but is this something that
> > will be needed by WSRF RP?
> >
> In other words, where does this belong in the priority list? I would
> prefer to see it addressed in the first version, and to that end I'll
> take an AI to write it up for the Policy document (which should stir
> back to life before long :-)
> > 3. Who decides which messages are snapshots and which are events? I assume
> > that in the direct notification case, this is the job of the NP. However in
> > Brokered Notification, how does a broker know which messages need to be
> > retained as snapshots? Is this another vendor extension?
> >
> The NP decides, and the Consumer is assumed to know how to interpret.
> There are many possible approaches:
>     * Updates are some form of transform on the XML content of the
>       snapshot, presumably in the form of XSLT
>     * Updates are commands from some agreed-upon set, e.g.
>       setReferenceProperty or SQL.
>     * Snapshots are flat records and updates are just name, value
>       pairs (or sets thereof).
>     * Snapshots are arrays and updates are sets of (index, value)
>       pairs. Strictly speaking this is a special case of the above.
> We may or may not want to define some of these basic patterns, but it
> seems like some sort of dialectability is needed in any case. There's
> also the recurring issue of the NP advertising its capabilities.
>
> In the brokered case, the broker should just pass through to its
> delegatee. Snapshots and updates should look the same from the
> outside. A clever broker might be able to maintain snapshots even
> when the delegatee doesn't (e.g., sequence numbers happen to be
> present, and the broker has some out-of-band way of establishing
> initial state), but in that case it's more an aggregator acting as an
> NP on its own behalf than a broker.
> > 4. If we have this policy option, do we still need getCurrentMessage?
> I'm hoping not -- that's why I asked for the known use cases for
> getCurrentMessage. Note that even if we don't standardize
> snapshot/update, we can still get rid of getCurrentMessage, as there
> is nothing in BaseN minus getCurrentMessage to prevent someone from
> implementing it as an extension under SubscriptionPolicy.
>
> In such issues, I generally see two major concerns:
>     * Making sure that the core specification doesn't prevent
>       important use cases from being handled cleanly. I had
>       originally thought that BaseN effectively required a
>       sequence-number approach to snapshots, which would have been
>       unacceptable to me. I now believe that BaseN allows
>       snapshot/update to be handled cleanly via policy.
>     * Making sure that useful patterns that could pose interop
>       problems if left to vendor extensions are standardized, even
>       if there is no structural need for them. This is a matter of
>       priority -- we don't have time for them all, at least not in
>       v1.0, and it's not a showstopper. If chaos erupts over, say,
>       the best way to specify queuing for a subscription, we can
>       jump in and standardize that without ripping up the existing
>       work (see previous item).
>       However, I also believe that a
>       standard that handles common and well-understood cases clearly
>       and concisely, even as addenda, will fare better than a
>       strictly minimal one which leaves them to be decided by
>       vendors, and doing this up front will be cheaper for everyone
>       than waiting for chaos to erupt.
> > Two possible reasons for it
> > a) If a consumer crashes before it had a chance to save its update, it
> > can retrieve the latest update, without having to create a new subscription
> >
> This seems like a special case of reliable delivery and queuing. I
> don't think getCurrentMessage solves enough of this problem to be
> worth its weight, especially since the general problem has to be
> solved anyway.
> > b) It allows entities other than the NC to receive an update (if they
> > have access to the SubManager EPR)
> >
> How does getCurrentMessage allow for this? I thought it was tied to
> topics, not subscriptions?
> >
> > Peter Niblett
> >
> > "Martin Chapman" <martin.chapman@oracle.com>
> > 03/12/2004 20:28
> > To: "'Lily Liu'" <lily.liu@webmethods.com>, "'David Hull'" <dmh@tibco.com>
> > cc: "'wsn-oasis'" <wsn@lists.oasis-open.org>
> > Subject: RE: [wsn] Subscribe and GetCurrentMessage
> >
> > +1
> > -----Original Message-----
> > From: Lily Liu [mailto:lily.liu@webmethods.com]
> > Sent: 03 December 2004 20:11
> > To: 'David Hull'
> > Cc: wsn-oasis
> > Subject: RE: [wsn] Subscribe and GetCurrentMessage
> >
> > SubscribePolicy seems to be the cleanest approach to me also. Although the
> > snapshot/update case is a very valid use scenario, I think we should leave
> > that policy definition to vendor extensions instead of covering it in the
> > WSN policy spec.
> >
> > Lily
> > -----Original Message-----
> > From: David Hull [mailto:dmh@tibco.com]
> > Sent: Wednesday, December 01, 2004 9:59 AM
> > Cc: wsn-oasis
> > Subject: Re: [wsn] Subscribe and GetCurrentMessage
> >
> > I would say that the problem can be solved by telling the NP directly to
> > send the appropriate form of [snapshot][update*] directly to the consumer,
> > and this can be done within the existing framework several ways:
> >     * Some utterance in SubscribePolicy
> >     * Some special flag on the topic (which might break commutativity)
> >     * Some special element in <filter> (ditto)
> >     * New attributes on SubscribeRequest (which is disruptive to the
> >       current spec and may raise issues with WSN)
> >     * Magically named topics (yuck)
> > Further, it can't be easily solved without telling the NP directly what to
> > send. The best we can do involves sequence numbers and queuing (and the
> > NP must keep track of the sequence numbers itself).
> >
> > Of the choices above, SubscribePolicy seems the cleanest approach. If so,
> > I would like to see something on it in the Policy document.
> >
> > Samuel Meder wrote:
> > > On Tue, 2004-11-30 at 16:15 -0500, David Hull wrote:
> > > > I think part of the confusion is that we've managed to solve part of
> > > > the problem over the course of the discussion.
> > > >
> > > > There is a very basic race condition to avoid. We have to be sure
> > > > that the NP doesn't get a snapshot request until after the
> > > > subscription for updates has arrived. We can get around that by
> > > > having the Subscriber subscribe first, and only make the snapshot
> > > > request after it receives a response from the subscription. We then
> > > > only have to assume that the NP will not send back a subscription
> > > > response until after it has established the subscription. This seems
> > > > reasonable, but we should talk about it explicitly -- a brokered or
> > > > indirect implementation may not be able to make quite that guarantee.
> > > >
> > > > Given this, we have to make sure the consumer knows which updates are
> > > > applicable. This is solved by putting a sequence number in each
> > > > update, and making sure that each snapshot contains the sequence
> > > > number of the most current update. The consumer then queues updates
> > > > (per topic) until it receives a snapshot, and discards those that
> > > > predate the snapshot. We assume that the consumer knows to do this,
> > > > presumably by some out-of-band communication between it and the
> > > > Subscriber.
> > > >
> > > > The scenario I described, of not being able to tell late updates from
> > > > no updates, is only relevant if the Consumer doesn't trust the
> > > > Subscriber to do things in the right order. If the Consumer knows
> > > > that the Subscriber doesn't ask for the snapshot until it gets a valid
> > > > SubscribeResponse, then the arrival of a snapshot guarantees that any
> > > > relevant updates will arrive (possibly preceded by some irrelevant
> > > > ones).
> > > >
> > > > However, there's still at least one remaining question: How does the
> > > > subscriber know the snapshot arrived? The NP doesn't necessarily
> > > > know, so there would have to be some positive acknowledgment from the
> > > > Consumer to the Subscriber. Otherwise, the Subscriber might have to
> > > > retry, and the Consumer would have to ignore duplicate snapshots.
> > >
> > > I think you'd have the same problem with any of the update messages as
> > > well, although I agree that sending the snapshots and the updates over
> > > the same channel makes things easier.
> > >
> > > > So to make this all work we need to:
> > > >     * Introduce special "snapshot" topics.
> > > >     * Require sequence numbers in data to be handled this way
> > > >     * Require the consumer to queue updates
> > > >     * Require the consumer and subscriber to coordinate to make sure
> > > >       everything went smoothly.
> > > >     * Require the subscriber to follow a strict order in
> > > >       establishing subscriptions.
> > > > Or, we could just tell the NP in one operation to send snapshot
> > > > followed by updates. Then we just need a reliable connection between
> > > > the NP and the Consumer. This seems simpler.
> > >
> > > Ok, I agree. I think we agree that this can be accomplished with the
> > > specifications as they are (by either associating special semantics with
> > > certain topics or asking for the semantics in the subscribe policy)?
> > >
> > > /Sam
> > >
> > > > I have no doubt that a sequence number/queuing pattern would be needed
> > > > behind the scenes in some brokered or indirect implementations, but I
> > > > don't believe that it needs to become visible to garden-variety
> > > > Subscribers and Consumers.
> > > >
> > > > Samuel Meder wrote:
> > > > > On Tue, 2004-11-30 at 14:28 -0500, David Hull wrote:
> > > > > > How does the consumer tell the difference between
> > > > > >     * Snapshot arrived ahead of first update.
> > > > > >     * There is no first update because there's no traffic.
> > > > >
> > > > > Thinking about it, I'm pretty sure we still have a
> > > > > misunderstanding/unstated assumption somewhere (either that or I am
> > > > > being particularly dense). In any case here is what I was going to
> > > > > reply with:
> > > > >
> > > > > There is no way to tell the difference, but is the difference really
> > > > > detectable in any scenario? There is always going to be some time
> > > > > between the state change on the service side and the consumer being
> > > > > aware of that state change, correct? This basically means that a
> > > > > consumer can never tell whether the state it is currently aware of is
> > > > > really up to date, i.e. there is no way of telling whether a state
> > > > > change has happened and the notification is in transit or if no
> > > > > state change has occurred.
> > > > >
> > > > > I'm guessing that this is not what you were trying to get at though?
> > > > >
> > > > > /Sam
> > > > >
> > > > > > Samuel Meder wrote:
> > > > > > > On Tue, 2004-11-30 at 13:37 -0500, David Hull wrote:
> > > > > > > > Comments in-line.
> > > > > > > >
> > > > > > > > Samuel Meder wrote:
> > > > > > > > > On Mon, 2004-11-29 at 11:35 -0500, David Hull wrote:
> > > > > > > > > > Could you work through in detail how this would happen?
> > > > > > > > > >
> > > > > > > > > > I believe we're up against an egg-unscrambling problem here. On the
> > > > > > > > > > one hand, there will be cases when the NP knows exactly which updates
> > > > > > > > > > came after which snapshot. A common case would be an NP which locks
> > > > > > > > > > the database for updates during a snapshot. It will be able to tell
> > > > > > > > > > which updates came in before the lock and which came after. It can
> > > > > > > > > > therefore easily send the snapshot to the consumer, followed by
> > > > > > > > > > exactly the right updates.
> > > > > > > > > >
> > > > > > > > > > Here are the problems I see under the current setup:
> > > > > > > > > >     * There is no way to get a snapshot, per se. If the updates are
> > > > > > > > > >       incremental, which they may legitimately be, then "last
> > > > > > > > > >       message" and "current state" are two different things. If I
> > > > > > > > > >       understand your proposal, you're saying that there could be
> > > > > > > > > >       parallel sets of "current state" and "update" topics. Fair
> > > > > > > > > >       enough.
> > > > > > > > > >     * The result of getCurrentMessage goes to the subscriber, not to
> > > > > > > > > >       the consumer. The behavior we want is for the consumer to get
> > > > > > > > > >       the snapshot followed by (all and only) the relevant updates.
> > > > > > > > > >       Under the current setup, either the subscriber would have to
> > > > > > > > > >       forward the result to the consumer, or the consumer would
> > > > > > > > > >       have to make the getCurrentMessage call directly (perhaps
> > > > > > > > > >       having learned the NP's address from a previous update,
> > > > > > > > > >       assuming one has come in).
> > > > > > > > >
> > > > > > > > > As far as I can tell there are multiple ways of doing this:
> > > > > > > > >
> > > > > > > > > 1) Make use of the WS-Addressing Reply-To feature to redirect the
> > > > > > > > > result of a getCurrentMessage to the consumer
> > > > > > > >
> > > > > > > > This does not eliminate the race condition. At the very least, both
> > > > > > > > update and snapshot must be sequence numbered, the consumer must be
> > > > > > > > prepared to queue updates until the snapshot arrives, and the consumer
> > > > > > > > does not know for sure that the snapshot is valid with respect to the
> > > > > > > > updates until the first update arrives.
> > > > > > > >
> > > > > > > > In short, the subscriber and consumer have to collaborate to recover
> > > > > > > > information that the NP knows, but cannot give out directly under the
> > > > > > > > current protocol.
> > > > > > > >
> > > > > > > > > 2) As you mention, the consumer could obtain the NP address via some
> > > > > > > > > (out-of-band?) means
> > > > > > > >
> > > > > > > > This only solves part of the problem.
> > > > > > > >
> > > > > > > > The comparison is between:
> > > > > > > >     * Subscriber tells NP: Send this consumer a snapshot of the
> > > > > > > >       current state we're interested in, followed by all and only
> > > > > > > >       those updates after the snapshot. We're done.
> > > > > > > > and multi-part solutions in which the Subscriber and Consumer must
> > > > > > > > cooperate to figure out which updates to discard and to verify that
> > > > > > > > the whole process worked correctly at all. There is a qualitative
> > > > > > > > difference between these two approaches.
> > > > > > > >
> > > > > > > > With all due respect, I have not yet heard from anyone a detailed and
> > > > > > > > provably correct procedure for eliminating this race condition, using
> > > > > > > > any combination of the existing BaseN primitives and any composable
> > > > > > > > mechanism for transactionality or such. I hear "you could do it
> > > > > > > > with . . .", but I don't hear how you could do it. I believe I have
> > > > > > > > given detailed reasons why such a solution probably doesn't exist.
> > > > > > > > Even if it did exist, it would suffer from too many connections
> > > > > > > > between too many moving parts.
> > > > > > >
> > > > > > > Maybe I'm just not seeing the problem. Assume both the snapshots and the
> > > > > > > updates contain a related sequence number, what I mean by that is that
> > > > > > > the sequence number for the latest snapshot is equal to the sequence
> > > > > > > number in the latest update. This also assumes that snapshots are always
> > > > > > > up to date with respect to updates.
> > > > > > >
> > > > > > > Where is the race condition in the following sequence:
> > > > > > >
> > > > > > > 1) subscriber subscribes consumer to update topic
> > > > > > > 2) subscriber does a getCurrentMessage with a redirect to consumer (for
> > > > > > > example, any other methods for getting the current state to the consumer
> > > > > > > should work).
> > > > > > >
> > > > > > > The consumer should buffer any updates until it gets the current state
> > > > > > > and can then discard any updates with sequence numbers < sequence number
> > > > > > > in current state.
> > > > > > >
> > > > > > > It's quite possible that I will soon have an embarrassing a-ha
> > > > > > > experience, but I currently don't see anything racy in that.
> > > > > > >
> > > > > > > > What we need is a way to just ask the NP for what we want up front.
> > > > > > > > One way would be to bake "snapshot" and "update" modifiers directly
> > > > > > > > into the Subscribe request, but this is not the only way.
> > > > > > > >
> > > > > > > > > 3) You could make the semantics of the "current state" topic such that a
> > > > > > > > > subscription to it will trigger a single notification of the current
> > > > > > > > > state (ok, I admit that this is really stretching it).
> > > > > > > >
> > > > > > > > That would provide for a snapshot (as opposed to current message as
> > > > > > > > such). It doesn't solve the race condition directly, but a variant
> > > > > > > > would. We could always allow for modifiers in the topic expression.
> > > > > > > > That is, you could subscribe to "Foo", or to "Foo(snapshot)" or to
> > > > > > > > "Foo(snapshot-and-updates)".
> > > > > > > > If we go that way, we should consider
> > > > > > > > having the modifiers be an explicit part of the topic expression, and
> > > > > > > > not baked into topic names in some arbitrary way.
> > > > > > > >
> > > > > > > > Another option would be to put snapshot/update modifiers in the
> > > > > > > > "subscribe policy" open content.
> > > > > > >
> > > > > > > Right, as a sort of QoS qualifier for the subscription. That would make
> > > > > > > sense to me as well.
> > > > > > >
> > > > > > > /Sam
> > > > > > >
> > > > > > > > In all three cases, the NP should be able to advertise what it
> > > > > > > > supports, and MUST fault on a request for something it doesn't
> > > > > > > > support.
> > > > > > >
> > > > > > > As long as all the
> > > > > > >
> > > > > > > > > >     * It is decidedly non-trivial to handle races between the
> > > > > > > > > >       snapshot and the update stream. At the minimum, both need to
> > > > > > > > > >       be tagged with timestamps (or better, sequence numbers). But
> > > > > > > > > >       even this doesn't seem sufficient, particularly since I don't
> > > > > > > > > >       know whether the snapshot is in sync with the updates until I
> > > > > > > > > >       actually get an update, which could be arbitrarily long.
> > > > > > > > >
> > > > > > > > > One way to deal with this is to subscribe for updates and then get the
> > > > > > > > > snapshot (I am assuming that the snapshot kept by the NP is always up to
> > > > > > > > > date with respect to updates, otherwise this whole thing would not be
> > > > > > > > > workable unless you had some way to pull past updates as well).
> > > > > > > > >
> > > > > > > > > > In short, we need a way of explicitly associating the snapshot request
> > > > > > > > > > with the request for updates and making sure they both go to the same
> > > > > > > > > > endpoint, in the right order. We would like to do this in a way that
> > > > > > > > > > requires no processing by the consumer. One of the key features of
> > > > > > > > > > WSN is that consumers can be dumb. The consumer should not have to
> > > > > > > > > > buffer updates and check sequence numbers.
> > > > > > > > >
> > > > > > > > > I think that consumers at some level (and this may be at the framework
> > > > > > > > > level) will have to worry about sequencing in any scenario that involves
> > > > > > > > > updates and asynchronous messaging.
> > > > > > > > >
> > > > > > > > > /Sam
> > > > > > > > >
> > > > > > > > > > It should not have to make the update subscription itself. And, as
> > > > > > > > > > far as I can tell, there is no need for it to do so, at least in the
> > > > > > > > > > plausible case that the NP already knows how to provide exactly the
> > > > > > > > > > right information.
> > > > > > > > > >
> > > > > > > > > > Samuel Meder wrote:
> > > > > > > > > > > Another way of doing this that does not require WSRF-RP is to model your
> > > > > > > > > > > topics in a way that allows for this: A top level topic for the whole
> > > > > > > > > > > document (which you can call getCurrentMessage() on) and sub-topic for
> > > > > > > > > > > fields in the document (which you would subscribe to), or even just a
> > > > > > > > > > > single "update" sub-topic. That in combination with appropriate use of
> > > > > > > > > > > timestamps should be able to address your problem.
> > > > > > > > > > >
> > > > > > > > > > > /Sam
> > > > > > > > > > >
> > > > > > > > > > > On Tue, 2004-11-23 at 16:43 -0500, David Hull wrote:
> > > > > > > > > > > > Steve Graham wrote:
> > > > > > > > > > > > > David Hull <dmh@tibco.com> wrote on 11/22/2004 04:48:34 PM:
> > > > > > > > > > > > > > One useful pub/sub paradigm involves the concept of notifications
> > > > > > > > > > > > > > as updates to some collection of state.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Indeed, this is what WSRF-Resource Properties suggests.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > In such cases, it is useful to be able to take a snapshot
> > > > > > > > > > > > > > of the state, then be notified of updates to that state.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Agreed. GetResourcePropertyDocument, followed by a subscribe
> > > > > > > > > > > > > operation specifying which Resource properties (bits of state) the
> > > > > > > > > > > > > consumer should receive value change notifications for.
> > > > > > > > > > > >
> > > > > > > > > > > > Minus whatever changes happened between the get and the subscribe. Or
> > > > > > > > > > > > plus whatever changes happened between the subscribe and the get. In
> > > > > > > > > > > > any case, we want to be able to cover snapshot/update scenarios where
> > > > > > > > > > > > the state is not presented as a WS-Resource. Or at least I would like
> > > > > > > > > > > > that.
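
The consumer-side pattern discussed in the thread (queue updates until the
snapshot arrives, then discard the ones that predate it) can be sketched
roughly as below. This is only an illustration, not anything defined by BaseN:
the class BufferingConsumer, its method names, and the choice of flat records
with name/value updates are assumptions made for the example. It also assumes
the snapshot carries the sequence number of the last update it already
reflects, so queued updates at or below that number are dropped.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class BufferingConsumer:
    """Hypothetical consumer that reconciles one snapshot with an update stream."""

    state: Optional[Dict[str, Any]] = None  # None until a snapshot is applied
    last_seq: Optional[int] = None          # sequence number reflected in state
    pending: List[Tuple[int, Dict[str, Any]]] = field(default_factory=list)

    def on_update(self, seq: int, changes: Dict[str, Any]) -> None:
        """Handle an update expressed as name/value pairs."""
        if self.state is None:
            # No snapshot yet: queue the update instead of applying it.
            self.pending.append((seq, changes))
            return
        self._apply(seq, changes)

    def on_snapshot(self, seq: int, record: Dict[str, Any]) -> None:
        """Handle the snapshot, then replay whatever was queued."""
        if self.state is not None:
            return  # duplicate snapshot (e.g. the subscriber retried): ignore
        self.state = dict(record)
        self.last_seq = seq
        for queued_seq, changes in sorted(self.pending, key=lambda p: p[0]):
            self._apply(queued_seq, changes)
        self.pending.clear()

    def _apply(self, seq: int, changes: Dict[str, Any]) -> None:
        # Assumption: anything at or below last_seq is already in the state.
        if seq <= self.last_seq:
            return
        self.state.update(changes)
        self.last_seq = seq


consumer = BufferingConsumer()
consumer.on_update(4, {"price": 10})                 # queued, predates snapshot
consumer.on_update(6, {"price": 12})                 # queued, newer than snapshot
consumer.on_snapshot(5, {"price": 11, "qty": 1})     # applies snapshot, replays 6
print(consumer.state, consumer.last_seq)
```

The sketch shows how much bookkeeping lands on the consumer under this
approach, which is the burden a "snapshot, then updates" subscribe policy
handled by the NP would remove.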