

Subject: RE: [ws-rx] i0019 - a formal proposal - take 2



Just to make sure it's said yet again, the proposal does not suggest a mechanism through which gaps can be filled.
A solution to fill gaps is a totally different issue.
-Doug


"Marc Goodner" <mgoodner@microsoft.com> wrote on 09/01/2005 02:45:10 AM:

> Jacques, you said:

> “The general use case is the one where gaps exist and persist and, for a variety of
> reasons, were not / could not be filled at the time the sequence is no longer to be
> used (for whatever reason) and needs to be disposed of.”

>  
> In following the many proposals being made here it continued to strike me that what
> you are looking for is a way to fill gaps so it is nice to see that confirmed.
> Couldn’t filling gaps in a sequence be done in a much simpler manner than the
> current proposal?

>  
>
> From: Jacques Durand [mailto:JDurand@us.fujitsu.com]
> Sent: Wednesday, August 31, 2005 1:24 PM
> To: Stefan Batres; Doug Davis; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

>  
> 2 comments Inline <JD>
>  
>
> From: Stefan Batres [mailto:stefanba@microsoft.com]
> Sent: Tuesday, August 30, 2005 10:48 PM
> To: Doug Davis; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

>  
> Doug,
>  
> You mention a specific situation: an RMD experiences a failure that prevents it
> from receiving application messages. I agree that in such a failure case this
> proposal could be helpful, in that it helps the RMS to engage in recovery of some
> sort (either inform applications that a specific message was not sent or open a new
> sequence, assuming ordering is not important). But this is not the only failure
> case that applications will want to deal with (with or without help from the
> protocol).

> Consider the case where connectivity is lost for long enough for both sequences to
> expire or consider the case where the destination suffers a loss of session state.
> In such failure modes this solution is not helpful - yet applications will need a
> recovery strategy of some sort. It might be that it is application specific, or it
> might be that a general failure recovery specification is created and ratified at
> some point. The important idea is that the only way to deal with all failure modes
> is at a higher level. This proposal leverages the protocol to optimize recovery in
> specific circumstances that should be relatively rare. RM implementations should
> not be required to support failure mode recovery mechanisms that either don't apply
> to them or that they choose to implement in a uniform way at a higher level.

>  
> <JD> I do not see better recovery as the main driver behind resolving i019, though
> enhanced recovery can certainly be a byproduct of it, in no different a way than,
> say, the recovery made possible by the mechanisms behind the AtLeastOnce DA ("... or
> else an error will be raised on at least one endpoint"). Such error-raising serves
> a purpose, whatever use is made of these "errors" (and indeed in many cases they
> require application-level handling, as you said; sometimes just application
> awareness may also have great value). But precisely because of this, we want
> errors to be raised as accurately as possible. I believe the proposal for i019
> allows for achieving greater awareness of delivery failure on the RMS / AS side at
> no greater cost, and that applies not just to i019 but to i028 as well, where the
> sequence is not faulted. The general use case is the one where gaps exist and
> persist and, for a variety of reasons, were not / could not be filled at the time the
> sequence is no longer to be used (for whatever reason) and needs to be disposed of.

>                                                                                  
> Thanks
>  
> --Stefan
>  
>  
>
> From: Doug Davis [mailto:dug@us.ibm.com]
> Sent: Tuesday, August 30, 2005 1:08 PM
> To: ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

>  
>
> Yet more comments. :-)
> -Doug

>
> From: "Stefan Batres" <stefanba@microsoft.com>
> Sent: 08/30/2005 03:35 PM
> To: Doug Davis/Raleigh/IBM@IBMUS, <ws-rx@lists.oasis-open.org>
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
>
> Doug,
>  
> Some more comments and thoughts on your proposal:
>  
>  
> <dug>... When or why an RMS uses CloseSequence is up to it to decide.
> All we know is that it wants to shut things down and get an accurate ACK from the RMD.</dug>
>  
> I still have not heard of a plausible reason why an RMS "wants to shut things down"
> and the current spec presents a problem. Comparing the spec as it stands today vs.
> the spec + this proposal:
>  

> TODAY: RMS wants to end the sequence so it sends a LastMessage and must wait for a
> complete set of acks; this might require retransmitting messages. Once a full set
> of acks is received RMS sends TerminateSequence.

>  
> TODAY + THIS PROPOSAL: RMS wants to end the sequence so it sends Close, waits for a
> CloseResponse, possibly retransmitting the Close. Once a CloseResponse is received
> RMS sends TerminateSequence.
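
The two flows compared above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Rmd` class and both functions are hypothetical names, retransmissions are assumed to always succeed, and the step names simply echo the terms used in this thread rather than the exact wire-level elements.

```python
from dataclasses import dataclass, field


@dataclass
class Rmd:
    """Hypothetical RM Destination that records which message numbers arrived."""
    received: set = field(default_factory=set)
    closed: bool = False

    def deliver(self, number: int) -> bool:
        # After a close, new application messages are refused.
        if self.closed:
            return False
        self.received.add(number)
        return True

    def close(self) -> list:
        # Stop accepting messages and report the final acknowledged numbers.
        self.closed = True
        return sorted(self.received)


def end_today(rmd: Rmd, sent: set):
    """TODAY: retransmit until the ack set is complete, then terminate."""
    steps = ["LastMessage"]
    for number in sorted(set(sent) - rmd.received):
        rmd.deliver(number)  # assumption: every retransmission gets through
        steps.append(f"Retransmit {number}")
    steps.append("TerminateSequence")
    return steps, sorted(rmd.received)


def end_with_close(rmd: Rmd):
    """TODAY + PROPOSAL: close, take the final ack as-is, then terminate."""
    final_acks = rmd.close()
    steps = ["Close", "CloseResponse", "TerminateSequence"]
    return steps, final_acks
```

In this sketch the Close path ends with whatever the RMD has actually received, gaps included, while the TODAY path cannot finish until every gap has been filled.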

>
> The problem with the TODAY scenario, as I've heard it in this forum, is that the
> RMS might have to wait unacceptably long between sending LastMessage and getting a
> full ack range. But if getting some messages or acks across proves difficult, why
> would the RMS expect that getting Close across would be any easier?

> <JD> Some messages may not have made it to the RMD for various reasons that do not
> necessarily apply to the Close op. You may also have the option of resending the
> Close op in a way (say, over 24h) that you could not afford on a large scale via a
> policy that has to apply to all regular messages, due to network bandwidth or due
> to the time-bound value of these messages (a message may lose value if untimely,
> yet the RMS and AS want to be sure which ones were lost). So even a delayed close
> still has value for the accuracy of acknowledgements.

>
> <dug> 1 - I don't believe your text is accurate, in that Close is supposed to be
> used in cases where the sequence needs to end due to something going wrong.  You've
> described a case where the sequence is functioning just fine - and while Close can
> be used in those cases as well, it provides no additional value.  2 - Sending a
> Close and sending application data can have quite a different set of features
> executed, so I don't think it's hard to imagine cases where RM messages can get
> processed just fine but application messages run into problems.  I believe Chris
> mentioned on some call the notion of two different persistent stores - one for RM
> data and one for app-data.  It's possible that the app-data one is running into
> problems.  3 - Using the CloseSequence operation is optional - if you feel that, as
> an RMS implementor, you'll never see its usefulness then you're free to never
> implement/send it.  However, I'd hate to remove this option for those of us who do
> see value in it.  </dug>
>
>
>  
> <dug>The case that I keep thinking about is one where the RMD is actually a cluster
> of machines and when a sequence gets created it has an affinity to a certain server
> in the cluster - meaning it processes all of the messages for that sequence. If
> that server starts to have problems, and for some reason it just can't seem to
> process any new app messages then the RMS can close down the sequence and start up
> a new one. Hopefully, the new sequence will be directed to a different server in
> the cluster. </dug>
>  
> There are two problems with this scenario and the proposed solution.
> 1.      If an RMD has sequence-to-machine affinity, that should be strictly the RMD's
> decision and the RMD's problem. The RMS is autonomous; this proposal puts
> expectations on the RMS's behavior based on particularities of the RMD
> implementation. To be clear, I'll note that affinity can be achieved in two ways:
>         i.  By performing stateful routing at the RMD; basically the RMD has to
>             remember every active sequence and what machine it has affinity to. In
>             this case it would be simple to change the RMD's routing table when a
>             machine fails.
>         ii. By generating different EPRs for each machine. For affinity to function
>             this way two things are necessary:
>             1) Some sort of endpoint resolution mechanism would have to be devised
>                for the RMS to learn the EPR that it should target.
>             2) A mechanism for migrating that EPR.
>
> Clearly 1) and 2) are outside the scope of the TC and, in my view, this proposal
> might be defining 2) in an informal way that is specific to WS-RM.

>
> 2.      If the RMS somehow guesses that there is a problem on the EPR to which it
> is sending its messages and somehow decides that Closing the sequence and starting
> a new one is the right course of action, ordering guarantees are compromised.
>
> <dug> I probably didn't state the problem very well.  I didn't intend to claim that
> the RMS knew about this affinity, but instead it knew that something was wrong with
> the current sequence and in order to try to fix the situation it decided to try
> another sequence.  The affinity bit was thrown in there to explain why starting a
> new sequence _might_ fix the problem.
>
> I should also point out that while a lot of these discussions have focused on
> InOrder+ExactlyOnce DA, this feature is still useful in other DAs.  For example, if
> the DA is just ExactlyOnce - having an accurate accounting of the ACKs allows a
> subsequent sequence to send just the gaps from the first, so getting an accurate
> list of the gaps becomes critical.  And this of course leads us to the discussion
> of how to determine the DA in use - which I think might be part of issues 6, 9, 24 and 27.
>  </dug>
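
To make the "accurate list of the gaps" point concrete, here is a minimal sketch of how an RMS might derive the gaps from a final acknowledgement. It assumes the ack arrives as inclusive (lower, upper) ranges of message numbers, with numbering starting at 1; the function name `find_gaps` and the `highest_sent` parameter are hypothetical and chosen only for illustration.

```python
def find_gaps(ack_ranges, highest_sent):
    """Return the inclusive (lower, upper) ranges of message numbers in
    1..highest_sent that are not covered by any acknowledged range."""
    gaps = []
    expected = 1  # next message number not yet accounted for
    for lower, upper in sorted(ack_ranges):
        if lower > expected:
            # Everything between the last covered number and this range is missing.
            gaps.append((expected, lower - 1))
        expected = max(expected, upper + 1)
    if expected <= highest_sent:
        # Messages sent after the last acknowledged range never arrived.
        gaps.append((expected, highest_sent))
    return gaps
```

For example, with acknowledged ranges 1-2, 4-4 and 7-9 out of ten messages sent, the function reports messages 3, 5-6 and 10 as the ones a subsequent sequence would need to carry.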
>  
> Finally, I agree with you that considering a gap-filling mechanism would be a good
> thing for this TC to do.
>  

>  
> --Stefan
>  
>  

