ws-rx message
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
- From: Doug Davis <dug@us.ibm.com>
- To: ws-rx@lists.oasis-open.org
- Date: Thu, 1 Sep 2005 08:46:55 -0400
Just to make sure it's said yet again, the proposal does not suggest a mechanism
through which gaps can be filled. A solution to fill gaps is a totally different issue.
-Doug
"Marc Goodner" <mgoodner@microsoft.com>
wrote on 09/01/2005 02:45:10 AM:
> Jacques, you said:
> “The general use case is the one where gaps exist and persist and for a variety of
> reasons, were not / could not be filled at the time the sequence is no longer to be
> used (for whatever reason) and needs to be disposed of.”
>
> In following the many proposals being made here it continued to strike me that what
> you are looking for is a way to fill gaps, so it is nice to see that confirmed.
> Couldn’t filling gaps in a sequence be done in a much simpler manner than the
> current proposal?
>
>
> From: Jacques Durand [mailto:JDurand@us.fujitsu.com]
> Sent: Wednesday, August 31, 2005 1:24 PM
> To: Stefan Batres; Doug Davis; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
>
> 2 comments Inline <JD>
>
>
> From: Stefan Batres [mailto:stefanba@microsoft.com]
> Sent: Tuesday, August 30, 2005 10:48 PM
> To: Doug Davis; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
>
> Doug,
>
> You mention a specific situation: An RMD experiences a failure that prevents it
> from receiving application messages. I agree insofar as saying that in such a
> failure case this proposal could be helpful in that it helps the RMS to engage in
> recovery of some sort (either inform applications that a specific message was not
> sent or open a new sequence, assuming ordering is not important). But this is not
> the only failure case that applications will want to deal with (with or without
> help from the protocol).
> Consider the case where connectivity is lost for long enough for both sequences to
> expire, or consider the case where the destination suffers a loss of session state.
> In such failure modes this solution is not helpful - yet applications will need a
> recovery strategy of some sort. It might be that it is application specific, or it
> might be that a general failure recovery specification is created and ratified at
> some point. The important idea is that the only way to deal with all failure modes
> is at a higher level. This proposal leverages the protocol to optimize recovery in
> specific circumstances that should be relatively rare. RM implementations should
> not be required to support failure mode recovery mechanisms that either don't apply
> to them or that they choose to implement in a uniform way at a higher level.
>
> <JD> I do not see better recovery as the main driver behind resolving i019 - though
> enhanced recovery can certainly be a byproduct of it, yet in no different way than,
> say, the recovery made possible by the mechanisms behind AtLeastOnce DA ("... or
> else an error will be raised on at least one endpoint"). Such error-raising is
> serving a purpose, whatever usage is made of these "errors" (and indeed in many
> cases they require application-level handling as you said - sometimes also just
> application awareness may have great value). But just because of this, we want
> errors to be raised as accurately as possible. I believe the proposal for i019
> allows for achieving greater awareness of delivery failure on RMS / AS side at no
> greater cost, and that applies not just to i019 but to i028 as well, where the
> sequence is not faulted. The general use case is the one where gaps exist and
> persist and for a variety of reasons, were not / could not be filled at the time the
> sequence is no longer to be used (for whatever reason) and needs to be disposed of.
>
> Thanks
>
> --Stefan
>
>
>
> From: Doug Davis [mailto:dug@us.ibm.com]
> Sent: Tuesday, August 30, 2005 1:08 PM
> To: ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
>
>
> Yet more comments. :-)
> -Doug
>
> "Stefan Batres" <stefanba@microsoft.com>
> 08/30/2005 03:35 PM
>
> To: Doug Davis/Raleigh/IBM@IBMUS, <ws-rx@lists.oasis-open.org>
> cc:
> Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
>
> Doug,
>
> Some more comments and thoughts on your proposal:
>
>
> <dug>... When or why an RMS uses CloseSequence is up to it to decide.
> All we know is that it wants to shut things down and get an accurate ACK from the RMD.</dug>
>
> I still have not heard of a plausible reason why an RMS "wants to shut things down"
> and the current spec presents a problem. Comparing the spec as it stands today vs.
> the spec + this proposal:
>
> TODAY: RMS wants to end the sequence so it sends a LastMessage and must wait for a
> complete set of acks; this might require retransmitting messages. Once a full set
> of acks is received RMS sends TerminateSequence.
>
> TODAY + THIS PROPOSAL: RMS wants to end the sequence so it sends Close, waits for a
> CloseResponse, possibly retransmitting the Close. Once a CloseResponse is received
> RMS sends TerminateSequence.
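
The two shutdown flows compared above differ only in the condition the RMS waits on before sending TerminateSequence. A minimal sketch of that difference follows. It is not taken from the WS-RM spec: the function names, the tuple representation of ack ranges, and the `close_response_received` flag are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (lower, upper) message numbers.
AckRange = Tuple[int, int]


def fully_acked(last_msg_number: int, acked: List[AckRange]) -> bool:
    """True when the ack ranges cover every message 1..last_msg_number."""
    expected = 1  # lowest message number not yet known to be acked
    for lower, upper in sorted(acked):
        if lower > expected:
            return False  # a gap starts at `expected`
        expected = max(expected, upper + 1)
    return expected > last_msg_number


def can_terminate_today(last_msg_number: int, acked: List[AckRange]) -> bool:
    # TODAY: after LastMessage the RMS keeps the sequence open,
    # retransmitting as needed, until every message is acknowledged.
    return fully_acked(last_msg_number, acked)


def can_terminate_with_close(close_response_received: bool) -> bool:
    # TODAY + THIS PROPOSAL: the RMS waits only for the CloseResponse;
    # unacknowledged messages (gaps) may remain at that point.
    return close_response_received
```

With messages 1..5 sent and acks for 1-2 and 4-5, `can_terminate_today(5, [(1, 2), (4, 5)])` is false because message 3 is still missing, while `can_terminate_with_close(True)` is true as soon as the CloseResponse arrives.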
>
> The problem with the TODAY scenario, as I've heard it in this forum, is that the
> RMS might have to wait unacceptably long between sending LastMessage and getting a
> full ack range. But if getting some messages or acks across proves difficult, why
> would the RMS expect that getting Close across would be any easier?
> <JD> Some messages may not have made it to RMD for various reasons that do not
> necessarily apply to the Close op. You may also have the option of resending the
> Close op in a way (say over 24h) that you could not afford to do on large scale via
> a policy that has to apply to all regular messages, due to network bandwidth or due
> to the time-bound value of these messages (messages may lose value if untimely -
> yet RMS and AS want to be sure which ones were lost). So even a delayed closing
> still has value for accuracy of acknowledgements.
>
> <dug> 1 - I don't believe your text is accurate in that Close is supposed to be
> used in cases where the sequence needs to end due to something going wrong. You've
> described a case where the sequence is functioning just fine - and while Close can
> be used in those cases as well, it provides no additional value. 2 - Sending a
> Close and sending application data can have quite a different set of features
> executed so I don't think it's hard to imagine cases where RM messages can get
> processed just fine but application messages run into problems. I believe Chris
> mentioned on some call the notion of two different persistent stores - one for RM
> data and one for app-data. It's possible that the app-data one is running into
> problems. 3 - Using the CloseSequence operation is optional - if you feel that, as
> an RMS implementor, you'll never see its usefulness then you're free to never
> implement/send it. However, I'd hate to remove this option for those of us who do see
> value in it. </dug>
>
>
>
> <dug>The case that I keep thinking about is one where the RMD is actually a cluster
> of machines and when a sequence gets created it has an affinity to a certain server
> in the cluster - meaning it processes all of the messages for that sequence. If
> that server starts to have problems, and for some reason it just can't seem to
> process any new app messages then the RMS can close down the sequence and start up
> a new one. Hopefully, the new sequence will be directed to a different server in
> the cluster. </dug>
>
> There are two problems with this scenario and the proposed solution.
> 1. If an RMD has sequence-to-machine affinity, that should be strictly the RMD's
> decision and the RMD's problem. The RMS is autonomous; this proposal puts
> expectations on the RMS' behavior based on particularities of the RMD
> implementation. To be clear, I'll note that affinity can be achieved in two ways:
>
>    i. By performing stateful routing at the RMD; basically the RMD has to remember
>       every active sequence and what machine it has affinity to. In this case it
>       would be simple to change the RMD's routing table when a machine fails.
>
>    ii. By generating different EPRs for each machine. For affinity to function this
>       way two things are necessary:
>       1. Some sort of endpoint resolution mechanism would have to be devised for the
>          RMS to learn the EPR that it should target.
>       2. A mechanism for migrating that EPR.
>       Clearly 1) and 2) are outside the scope of the TC and, in my view, this proposal
>       might be defining 2) in an informal way that is specific to WS-RM.
>
> 2. If the RMS somehow guesses that there is a problem on the EPR to which it
> is sending its messages and somehow decides that Closing the sequence and starting
> a new one is the right course of action, ordering guarantees are compromised.
>
> <dug> I probably didn't state the problem very well. I didn't intend to claim that
> the RMS knew about this affinity, but instead it knew that something was wrong with
> the current sequence and in order to try to fix the situation it decided to try
> another sequence. The affinity bit was thrown in there to explain why starting a
> new sequence _might_ fix the problem.
>
> I should also point out that while a lot of these discussions have focused on
> InOrder+ExactlyOnce DA, this feature is still useful in other DAs. For example, if
> the DA is just ExactlyOnce - having an accurate accounting of the ACKs allows a
> subsequent sequence to send just the gaps from the first, so getting an accurate
> list of the gaps becomes critical. And this of course leads us to the discussion
> of how to determine the DA in use - which I think might be part of issues 6, 9, 24 and 27.
> </dug>
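
The ExactlyOnce example above depends on turning the final acknowledgement ranges into a list of unacknowledged message numbers. A minimal sketch of that computation follows, assuming ack ranges arrive as inclusive (lower, upper) pairs and message numbers start at 1. The function name `find_gaps` and the tuple representation are hypothetical; nothing here is prescribed by the proposal under discussion.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (lower, upper) message numbers.
AckRange = Tuple[int, int]


def find_gaps(highest_sent: int, acked: List[AckRange]) -> List[int]:
    """Return message numbers in 1..highest_sent not covered by any ack range."""
    gaps: List[int] = []
    expected = 1  # lowest message number not yet accounted for
    for lower, upper in sorted(acked):
        # Everything between `expected` and the start of this range is missing.
        gaps.extend(range(expected, min(lower, highest_sent + 1)))
        expected = max(expected, upper + 1)
    # Anything after the last acked range is missing too.
    gaps.extend(range(expected, highest_sent + 1))
    return gaps
```

For example, `find_gaps(7, [(1, 2), (4, 5)])` returns `[3, 6, 7]`; those are the messages a subsequent sequence would need to carry.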
>
> Finally, I agree with you that considering a gap-filling mechanism would be a good
> thing for this TC to do.
>
>
> --Stefan
>
>