wsn message

Subject: Re: [wsn] Subscribe and GetCurrentMessage
From: Samuel Meder <meder@mcs.anl.gov>
To: David Hull <dmh@tibco.com>
Date: Tue, 30 Nov 2004 15:53:25 -0600
On Tue, 2004-11-30 at 16:15 -0500, David Hull wrote:
> I think part of the confusion is that we've managed to solve part of
> the problem over the course of the discussion.
> 
> There is a very basic race condition to avoid.  We have to be sure
> that the NP doesn't get a snapshot request until after the
> subscription for updates has arrived.  We can get around that by
> having the Subscriber subscribe first, and only make the snapshot
> request after it receives a response from the subscription.  We then
> only have to assume that the NP will not send back a subscription
> response until after it has established the subscription.  This seems
> reasonable, but we should talk about it explicitly -- a brokered or
> indirect implementation may not be able to make quite that guarantee.
> 
> Given this, we have to make sure the consumer knows which updates are
> applicable.  This is solved by putting a sequence number in each
> update, and making sure that each snapshot contains the sequence
> number of the most current update.  The consumer then
> queues updates (per topic) until it receives a snapshot, and discards
> those that predate the snapshot.  We assume that the consumer knows to
> do this, presumably by some out-of-band communication between it and
> the Subscriber.
> 
> The scenario I described, of not being able to tell late updates from
> no updates, is only relevant if the Consumer doesn't trust the
> Subscriber to do things in the right order.  If the Consumer knows
> that the Subscriber doesn't ask for the snapshot until it gets a valid
> SubscribeResponse, then the arrival of a snapshot guarantees that any
> relevant updates will arrive (possibly preceded by some irrelevant
> ones).
> 
> However, there's still at least one remaining question: How does the
> subscriber know the snapshot arrived?  The NP doesn't necessarily
> know, so there would have to be some positive acknowledgment from the
> Consumer to the Subscriber.  Otherwise, the Subscriber might have to
> retry, and the Consumer would have to ignore duplicate snapshots.

I think you'd have the same problem with any of the update messages as
well, although I agree that sending the snapshots and the updates over
the same channel makes things easier.
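
For concreteness, the queue-and-discard behaviour described above could be
sketched roughly as follows. This is only an illustration in Python:
SnapshotConsumer, on_update, on_snapshot and apply are hypothetical names,
not anything defined by WS-Notification, and the sketch assumes the snapshot
carries the sequence number of the latest update it already reflects.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class SnapshotConsumer:
    """Buffers updates for one topic until the snapshot arrives.

    Hypothetical illustration; no name here comes from the WSN specs.
    """
    apply: Callable[[int, Any], None]       # called for each accepted message
    snapshot_seq: Optional[int] = None      # sequence number of the snapshot
    pending: List[Tuple[int, Any]] = field(default_factory=list)

    def on_update(self, seq: int, payload: Any) -> None:
        if self.snapshot_seq is None:
            # No snapshot yet: we cannot tell whether this update is relevant.
            self.pending.append((seq, payload))
        elif seq > self.snapshot_seq:
            self.apply(seq, payload)
        # Otherwise the update is already reflected in the snapshot; drop it.

    def on_snapshot(self, seq: int, state: Any) -> None:
        if self.snapshot_seq is not None:
            return  # duplicate snapshot (e.g. the Subscriber retried)
        self.snapshot_seq = seq
        self.apply(seq, state)
        # Replay buffered updates newer than the snapshot, in sequence order.
        for useq, payload in sorted(self.pending, key=lambda p: p[0]):
            if useq > seq:
                self.apply(useq, payload)
        self.pending.clear()


log = []
consumer = SnapshotConsumer(apply=lambda seq, msg: log.append((seq, msg)))
consumer.on_update(4, "u4")          # arrives before the snapshot, queued
consumer.on_update(6, "u6")          # queued
consumer.on_snapshot(5, "state@5")   # u4 is discarded, u6 is replayed
consumer.on_update(7, "u7")          # applied directly
consumer.on_snapshot(5, "state@5")   # duplicate, ignored
```

Note that the sketch says nothing about how the Subscriber learns that the
snapshot arrived; that acknowledgment would still have to be added.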

> So to make this all work we need to:
>       * Introduce special "snapshot" topics.
>       * Require sequence numbers in data to be handled this way
>       * Require the consumer to queue updates
>       * Require the consumer and subscriber to coordinate to make sure
>         everything went smoothly.
>       * Require the subscriber to follow a strict order in
>         establishing subscriptions.
> Or, we could just tell the NP in one operation to send snapshot
> followed by updates.  Then we just need a reliable connection between
> the NP and the Consumer.  This seems simpler.

Ok, I agree. I think we agree that this can be accomplished with the
specifications as they are (by either associating special semantics with
certain topics or asking for the semantics in the subscribe policy)?
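
As a rough illustration of the single-operation alternative (snapshot
followed by updates), an NP that serializes snapshots and updates under one
lock could look something like the sketch below. Producer,
subscribe_with_snapshot and the deliver callback are hypothetical names, and
delivery to the consumer is assumed to be reliable and in order.

```python
import threading
from typing import Any, Callable, Dict, List

Deliver = Callable[[str, int, Any], None]


class Producer:
    """Toy NP that can send a snapshot followed by exactly the later updates."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, str] = {}
        self._seq = 0
        self._consumers: List[Deliver] = []

    def subscribe_with_snapshot(self, deliver: Deliver) -> None:
        # Snapshot and registration happen under the same lock, so no update
        # can slip in between them.
        with self._lock:
            deliver("snapshot", self._seq, dict(self._state))
            self._consumers.append(deliver)

    def update(self, key: str, value: str) -> None:
        with self._lock:
            self._seq += 1
            self._state[key] = value
            for deliver in self._consumers:
                deliver("update", self._seq, (key, value))


received = []
np = Producer()
np.update("a", "1")                                   # before any subscription
np.subscribe_with_snapshot(lambda *msg: received.append(msg))
np.update("b", "2")                                   # after the snapshot
```

With this shape the consumer needs no queue and no sequence-number check; it
just processes messages in the order they arrive.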

/Sam

> I have no doubt that a sequence number/queuing pattern would be needed
> behind the scenes in some brokered or indirect implementations, but I
> don't believe that it needs to become visible to garden-variety
> Subscribers and Consumers.
> 
> 
> 
> Samuel Meder wrote:
> > On Tue, 2004-11-30 at 14:28 -0500, David Hull wrote:
> >   
> > > How does the consumer tell the difference between 
> > >       * Snapshot arrived ahead of first update. 
> > >       * There is no first update because there's no traffic. 
> > >     
> > 
> > Thinking about it, I'm pretty sure we still have a
> > misunderstanding/unstated assumption somewhere (either that or I am
> > being particularly dense). In any case here is what I was going to reply
> > with:
> > 
> > There is no way to tell the difference, but is the difference really
> > detectable in any scenario? There is always going to be some time
> > between the state change on the service side and the consumer being
> > aware of that state change, correct? This basically means that a
> > consumer can never tell whether the state it is currently aware of is
> > really up to date, i.e., there is no way of telling whether a state change
> > notification has happened and the notification is in transit or if no
> > state change has occurred.
> > 
> > I'm guessing that this is not what you were trying to get at though?
> > 
> > /Sam
> > 
> >   
> > > Samuel Meder wrote: 
> > >     
> > > > On Tue, 2004-11-30 at 13:37 -0500, David Hull wrote:
> > > >   
> > > >       
> > > > > Comments in-line.
> > > > > 
> > > > > Samuel Meder wrote:
> > > > >     
> > > > >         
> > > > > > On Mon, 2004-11-29 at 11:35 -0500, David Hull wrote:
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > > > Could you work through in detail how this would happen?
> > > > > > > 
> > > > > > > I believe we're up against an egg-unscrambling problem here.  On the
> > > > > > > one hand, there will be cases when the NP knows exactly which updates
> > > > > > > came after which snapshot.  A common case would be an NP which locks
> > > > > > > the database for updates during a snapshot.  It will be able to tell
> > > > > > > which updates came in before the lock and which came after.  It can
> > > > > > > therefore easily send the snapshot to the consumer, followed by
> > > > > > > exactly the right updates.
> > > > > > > 
> > > > > > > Here are the problems I see under the current setup:
> > > > > > >       * There is no way to get a snapshot, per se.  If the updates are
> > > > > > >         incremental, which they may legitimately be, then "last
> > > > > > >         message" and "current state" are two different things.  If I
> > > > > > >         understand your proposal, you're saying that there could be
> > > > > > >         parallel sets of "current state" and "update" topics.  Fair
> > > > > > >         enough.
> > > > > > >       * The result of getCurrentMessage goes to the subscriber, not to
> > > > > > >         the consumer.  The behavior we want is for the consumer to get
> > > > > > >         the snapshot followed by (all and only) the relevant updates.
> > > > > > >         Under the current setup, either the subscriber would have to
> > > > > > > >         forward the result to the consumer, or the consumer would
> > > > > > > >         have to make the getCurrentMessage call directly (perhaps
> > > > > > > >         having learned the NP's address from a previous update,
> > > > > > >         assuming one has come in).
> > > > > > >     
> > > > > > >         
> > > > > > >             
> > > > > > As far as I can tell there are multiple ways of doing this:
> > > > > > 
> > > > > > 1) Make use of the WS-Addressing Reply-To feature to redirect the result
> > > > > > of a getCurrentMessage to the consumer
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > This does not eliminate the race condition.  At the very least, both
> > > > > update and snapshot must be sequence numbered, the consumer must be
> > > > > prepared to queue updates until the snapshot arrives, and the consumer
> > > > > does not know for sure that the snapshot is valid with respect to the
> > > > > updates until the first update arrives.
> > > > > 
> > > > > In short, the subscriber and consumer have to collaborate to recover
> > > > > information that the NP knows, but cannot give out directly under the
> > > > > current protocol.
> > > > >     
> > > > >         
> > > > > > 2) As you mention, the consumer could obtain the NP address via some
> > > > > > (out-of-band?) means
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > This only solves part of the problem.
> > > > > 
> > > > > The comparison is between:
> > > > >       * Subscriber tells NP: Send this consumer a snapshot of the
> > > > >         current state we're interested in, followed by all and only
> > > > >         those updates after the snapshot.  We're done.
> > > > > and multi-part solutions in which the Subscriber and Consumer must
> > > > > cooperate to figure out which updates to discard and to verify that
> > > > > the whole process worked correctly at all.  There is a qualitative
> > > > > difference between these two approaches.
> > > > > 
> > > > > With all due respect, I have not yet heard from anyone a detailed and
> > > > > provably correct procedure for eliminating this race condition, using
> > > > > any combination of the existing BaseN primitives and any composable
> > > > > mechanism for transactionality or such.  I hear "you could do it
> > > > > with . . .", but I don't hear how you could do it.  I believe I have
> > > > > given detailed reasons why such a solution probably doesn't exist.
> > > > > Even if it did exist, it would suffer from too many connections
> > > > > between too many moving parts.
> > > > >     
> > > > >         
> > > > Maybe I'm just not seeing the problem. Assume both the snapshots and the
> > > > updates contain a related sequence number; what I mean by that is that
> > > > the sequence number for the latest snapshot is equal to the sequence
> > > > number in the latest update. This also assumes that snapshots are always
> > > > up to date with respect to updates.
> > > > 
> > > > Where is the race condition in the following sequence:
> > > > 
> > > > 1) subscriber subscribes consumer to update topic
> > > > 2) subscriber does a getCurrentMessage with a redirect to consumer (for
> > > > example; any other method of getting the current state to the consumer
> > > > should work).
> > > > 
> > > > The consumer should buffer any updates until it gets the current state
> > > > and can then discard any updates with sequence numbers < sequence number
> > > > in current state.
> > > > 
> > > > It's quite possible that I will soon have an embarrassing a-ha
> > > > experience, but I currently don't see anything racy in that. 
> > > > 
> > > >   
> > > >       
> > > > > What we need is a way to just ask the NP for what we want up front.
> > > > > One way would be to bake "snapshot" and "update" modifiers directly
> > > > > into the Subscribe request, but this is not the only way.
> > > > >     
> > > > >         
> > > > > > 3) You could make the semantics of the "current state" topic such that a
> > > > > > subscription to it will trigger a single notification of the current
> > > > > > state (ok, I admit that this is really stretching it).
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > That would provide for a snapshot (as opposed to current message as
> > > > > such).  It doesn't solve the race condition directly, but a variant
> > > > > would.  We could always allow for modifiers in the topic expression.
> > > > > That is, you could subscribe to "Foo", or to "Foo(snapshot)" or to
> > > > > "Foo(snapshot-and-updates)".  If we go that way, we should consider
> > > > > having the modifiers be an explicit part of the topic expression, and
> > > > > not baked into topic names in some arbitrary way.
> > > > > 
> > > > > Another option would be to put snapshot/update modifiers in the
> > > > > "subscribe policy" open content.
> > > > >     
> > > > >         
> > > > Right, as a sort of QoS qualifier for the subscription. That would make
> > > > sense to me as well.
> > > > 
> > > > /Sam
> > > > 
> > > >   
> > > >       
> > > > > In all three cases, the NP should be able to advertise what it
> > > > > supports, and MUST fault on a request for something it doesn't
> > > > > support.
> > > > > 
> > > > > 
> > > > > As long as all the 
> > > > >     
> > > > >         
> > > > > >                   
> > > > > > >       * It is decidedly non-trivial to handle races between the
> > > > > > >         snapshot and the update stream.  At the minimum, both need to
> > > > > > >         be tagged with timestamps (or better, sequence numbers).   But
> > > > > > >         even this doesn't seem sufficient, particularly since I don't
> > > > > > >         know whether the snapshot is in sync with the updates until I
> > > > > > >         actually get an update, which could be arbitrarily long.
> > > > > > >     
> > > > > > >         
> > > > > > >             
> > > > > > One way to deal with this is to subscribe for updates and then get the
> > > > > > snapshot (I am assuming that the snapshot kept by the NP is always up to
> > > > > > > date with respect to updates, otherwise this whole thing would not be workable
> > > > > > unless you had some way to pull past updates as well).
> > > > > > 
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > > > In short, we need a way of explicitly associating the snapshot request
> > > > > > > with the request for updates and making sure they both go to the same
> > > > > > > endpoint, in the right order.  We would like to do this in a way that
> > > > > > > requires no processing by the consumer.  One of the key features of
> > > > > > > WSN is that consumers can be dumb.  The consumer should not have to
> > > > > > > buffer updates and check sequence numbers. 
> > > > > > >     
> > > > > > >         
> > > > > > >             
> > > > > > I think that consumers at some level (and this may be at the framework
> > > > > > level) will have to worry about sequencing in any scenario that involves
> > > > > > updates and asynchronous messaging.
> > > > > > 
> > > > > > /Sam
> > > > > > 
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > > >  It should not have to make the update subscription itself.  And, as
> > > > > > > far as I can tell, there is no need for it to do so, at least in the
> > > > > > > plausible case that the NP already knows how to provide exactly the
> > > > > > > right information.
> > > > > > > 
> > > > > > > Samuel Meder wrote:
> > > > > > >     
> > > > > > >         
> > > > > > >             
> > > > > > > > Another way of doing this that does not require WSRF-RP is to model your
> > > > > > > > topics in a way that allows for this: A top level topic for the whole
> > > > > > > > document (which you can call getCurrentMessage() on) and sub-topic for
> > > > > > > > fields in the document (which you would subscribe to), or even just a
> > > > > > > > single "update" sub-topic. That in combination with appropriate use of
> > > > > > > > timestamps should be able to address your problem.
> > > > > > > > 
> > > > > > > > /Sam
> > > > > > > > 
> > > > > > > > On Tue, 2004-11-23 at 16:43 -0500, David Hull wrote:
> > > > > > > >   
> > > > > > > >       
> > > > > > > >           
> > > > > > > >               
> > > > > > > > > Steve Graham wrote:
> > > > > > > > >     
> > > > > > > > >         
> > > > > > > > >             
> > > > > > > > >                 
> > > > > > > > > > David Hull <dmh@tibco.com> wrote on 11/22/2004 04:48:34 PM:
> > > > > > > > > > 
> > > > > > > > > >       
> > > > > > > > > >           
> > > > > > > > > >               
> > > > > > > > > >                   
> > > > > > > > > > > > One useful pub/sub paradigm involves the concept of notifications
> > > > > > > > > > > > as updates to some collection of state.
> > > > > > > > > > Indeed, this is what WSRF-Resource Properties suggests. 
> > > > > > > > > > 
> > > > > > > > > >       
> > > > > > > > > >           
> > > > > > > > > >               
> > > > > > > > > >                   
> > > > > > > > > > > In such cases, it is useful to be able to take a 
> > > > > > > > > > > snapshot of the state, then be notified of updates to that state. 
> > > > > > > > > > >         
> > > > > > > > > > >             
> > > > > > > > > > >                 
> > > > > > > > > > >                     
> > > > > > > > > > > Agreed.  GetResourcePropertyDocument, followed by a subscribe
> > > > > > > > > > > operation specifying which Resource properties (bits of state)
> > > > > > > > > > > the consumer should receive value change notifications.
> > > > > > > > > >       
> > > > > > > > > >           
> > > > > > > > > >               
> > > > > > > > > >                   
> > > > > > > > > Minus whatever changes happened between the get and the subscribe.  Or
> > > > > > > > > plus whatever changes happened between the subscribe and the get.  In
> > > > > > > > > any case, we want to be able to cover snapshot/update scenarios where
> > > > > > > > > the state is not presented as a WS-Resource.  Or at least I would like
> > > > > > > > > that.
> > > > > > > > > 