From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Tuesday, August 30, 2005 8:43 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Lei,
Yes that is true. So there are two things here: 1) the proposal doesn't
say why the RMS is closing the sequence nor does it say what it will use the
final ACK state for.
<GB> At least when we started, there
was an actual problem we were trying to solve. Now it sounds like the
proposal doesn’t solve it, but we like the proposal anyway for some other
reason, and what I’m reading is that this is all OK as long as we are
careful not to suggest that the proposal solves any particular problem.
This is very important. It
purposely does not get into that area because, as others have stated, things
like linking of sequences should probably be done by some higher level
processing, which is out of scope (as of now anyway).
<GB> Forget linking of sequences. From the language in proposal 3, (“After
line 396, …”) “…this would leave the RM source
unsure of the final ranges of messages that were delivered to the destination”.
Please explain how, without knowledge of the destination DA, this problem isn’t
still there. Please don’t tell me it doesn’t matter, or that
we’ll deal with it later, in a separate issue, because I don’t know
if we will or not. We will be asked to consider this proposal on its own,
and decide if it solves the problem. At this point I have to conclude
that it does not.
And 2) this problem exists for
other areas of the spec, not just this proposal. Let's say an RMS
reaches the MaxMessageNumber, as defined by the RMD. The way to resolve
this is to create a new sequence to continue - well how can we guarantee the
ordering will be preserved across the sequences? Same problem. So
this proposal does not introduce a new problem, the problem is already there.
However, that being said, and even though several people have said that
doing things like preserving the order across sequences is something for
higher-level processing, I do personally believe that the spec should provide
an aid to that processing. But I view that as a separate issue and not
part of this one.
thanks
-Doug
"Lei Jin" <ljin@bea.com>
08/30/2005 08:15 PM
To: <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Let's assume we are using an in-order delivery
assurance. I am sending 10 messages from AS to AD, and after some time, I
decide to send a <close> for which I receive a final ack with (1 - 5).
If I understand correctly, one of the motivations for having a
final ack is so that you know the accurate state of all the received messages,
so that you can decide what to resend later in a new sequence. So let's
say we start up a new reliable sequence and resend messages 6 - 10.
However, here is a problem. How do I preserve the in-order delivery
assurance? How do I guarantee that message 6 will be delivered after
message 5? Note that the final ack only says (1 - 5) is received, not
delivered. It's perfectly reasonable if message 5 is not delivered by the
time another sequence is set up and message 6 arrives. In that case, are
we going to have to worry about delivery assurances across multiple sequences?
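The gap between "received" and "delivered" can be illustrated with a small sketch. This is a toy model, not anything defined by the spec: the class InOrderDestination and its methods are hypothetical names, and it assumes the destination acks on receipt and delivers to the application strictly in order.

```python
from dataclasses import dataclass, field


@dataclass
class InOrderDestination:
    """Toy RMD (hypothetical): tracks receipt separately from delivery."""
    received: set = field(default_factory=set)
    delivered_up_to: int = 0  # highest message number handed to the application

    def receive(self, number: int) -> None:
        # Receipt is what the ack reports; it says nothing about delivery.
        self.received.add(number)

    def deliver_next(self) -> bool:
        """Hand the next in-order message to the application, if it has arrived."""
        nxt = self.delivered_up_to + 1
        if nxt in self.received:
            self.delivered_up_to = nxt
            return True
        return False

    def final_ack(self) -> list:
        """Ranges of *received* message numbers, as (low, high) pairs."""
        ranges = []
        for n in sorted(self.received):
            if ranges and n == ranges[-1][1] + 1:
                ranges[-1] = (ranges[-1][0], n)
            else:
                ranges.append((n, n))
        return ranges


rmd = InOrderDestination()
for n in range(1, 6):      # messages 1 - 5 arrive
    rmd.receive(n)
for _ in range(4):         # the application has only consumed 1 - 4 so far
    rmd.deliver_next()

print(rmd.final_ack())       # [(1, 5)]
print(rmd.delivered_up_to)   # 4
```

Here the final ack reports (1 - 5) while message 5 is still waiting to be delivered, which is the window in which message 6 could arrive on a new sequence.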
Lei
-----Original Message-----
From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Tuesday, August 30, 2005 12:08 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Yet more comments. :-)
-Doug
"Stefan Batres" <stefanba@microsoft.com>
08/30/2005 03:35 PM
To: Doug Davis/Raleigh/IBM@IBMUS, <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Doug,
Some more comments and thoughts on your proposal:
<dug>... When or why an RMS uses CloseSequence is up to it to decide.
All we know is that it wants to shut things down and get an accurate ACK from
the RMD.</dug>
I still have not heard of a plausible reason why an RMS “wants to shut
things down” and the current spec presents a problem. Comparing the spec as
it stands today vs. the spec + this proposal:
- TODAY: RMS wants to end the sequence so it
sends a LastMessage and must wait for a complete set of acks; this might
require retransmitting messages. Once a full set of acks is received RMS
sends TerminateSequence.
- TODAY + THIS PROPOSAL: RMS wants to end the
sequence so it sends Close, waits for a CloseResponse, possibly
retransmitting the Close. Once a CloseResponse is received RMS sends
TerminateSequence.
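The two flows above can be compared side by side in a rough sketch. It only produces a trace of what the RMS would send; the function names are hypothetical, the message labels are the ones used in this thread, and the rule that one missing message gets acked per retransmission round is purely an assumption made to keep the example deterministic.

```python
def end_today(sent, acked, max_rounds=10):
    """TODAY: LastMessage, retransmit until everything is acked, then terminate."""
    trace = ["LastMessage"]
    missing = sorted(set(sent) - set(acked))
    rounds = 0
    while missing and rounds < max_rounds:
        trace.append(f"retransmit {missing}")
        missing = missing[1:]  # assumption: the lowest missing message gets acked
        rounds += 1
    trace.append("TerminateSequence" if not missing else "still waiting for acks")
    return trace


def end_with_close(sent, acked):
    """TODAY + THIS PROPOSAL: Close, wait for CloseResponse, then terminate.

    The Close itself might need retransmitting; that is not modelled here.
    """
    unacked = sorted(set(sent) - set(acked))
    return [
        "Close",
        f"CloseResponse (final ack; unacked: {unacked})",
        "TerminateSequence",
    ]


sent = range(1, 11)    # messages 1 - 10
acked = range(1, 6)    # only 1 - 5 acked so far

for step in end_today(sent, acked):
    print("today:   ", step)
for step in end_with_close(sent, acked):
    print("proposal:", step)
```

In this sketch the TODAY flow needs five retransmission rounds before it can terminate, while the Close flow terminates with messages 6 - 10 still unacked and leaves it to the RMS to decide what to do about them.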
The problem with the TODAY scenario, as I’ve heard it in this forum, is
that the RMS might have to wait unacceptably long between sending LastMessage
and getting a full ack range. But if getting some messages or acks across
proves difficult, why would the RMS expect that getting Close across would be
any easier?
<dug> 1 - I don't believe your text is accurate in that Close is supposed
to be used in cases where the sequence needs to end due to something going
wrong. You've described a case where the sequence is functioning just
fine - and while Close can be used in those cases as well, it provides no
additional value. 2 - Sending a Close and sending application data can
have quite a different set of features executed so I don't think it's hard to
imagine cases where RM messages can get processed just fine but application
messages run into problems. I believe Chris mentioned on some call the
notion of two different persistent stores - one for RM data and one for
app-data. It's possible that the app-data one is running into problems. 3
- Using the CloseSequence operation is optional - if you feel that, as an RMS
implementor, you'll never see its usefulness then you're free to never
implement/send it. However, I'd hate to remove this option for those of us
who do see value in it. </dug>
<dug>The case that I keep thinking about is one where the RMD is actually
a cluster of machines and when a sequence gets created it has an affinity to a
certain server in the cluster - meaning it processes all of the messages for
that sequence. If that server starts to have problems, and for some reason it
just can't seem to process any new app messages then the RMS can close down the
sequence and start up a new one. Hopefully, the new sequence will be directed
to a different server in the cluster. </dug>
There are two problems with this scenario and the proposed solution.
1. If an RMD has sequence-to-machine affinity, that should be
   strictly the RMD's decision and the RMD's problem. The RMS is autonomous; this
   proposal puts expectations on the RMS’ behavior based on particularities
   of the RMD implementation. To be clear, I’ll note that affinity can be
   achieved in two ways:
   i. By performing stateful routing at the RMD; basically the RMD has to
      remember every active sequence and what machine it has affinity to. In
      this case it would be simple to change the RMD’s routing table when a
      machine fails.
   ii. By generating different EPRs for each machine. For affinity to function
      this way two things are necessary:
      1. Some sort of endpoint resolution mechanism would have to
         be devised for the RMS to learn the EPR that it should target.
      2. A mechanism for migrating that EPR.
      Clearly 1) and 2) are outside the scope of the TC and, in my view, this
      proposal might be defining 2) in an informal way that is specific to
      WS-RM.
2. If the RMS somehow guesses that there is a problem on the
   EPR to which it is sending its messages and somehow decides that Closing the
   sequence and starting a new one is the right course of action, ordering
   guarantees are compromised.
<dug> I probably didn't state the problem very well. I didn't
intend to claim that the RMS knew about this affinity, but instead it knew that
something was wrong with the current sequence and in order to try to fix the
situation it decided to try another sequence. The affinity bit was thrown
in there to explain why starting a new sequence _might_ fix the problem.
I should also point out that while a lot of these discussions have focused on
InOrder+ExactlyOnce DA, this feature is still useful in other DAs. For
example, if the DA is just ExactlyOnce - having an accurate accounting of the
ACKs allows a subsequent sequence to send just the gaps from the first, so
getting an accurate list of the gaps becomes critical. And this of course
leads us to the discussion of how to determine the DA in use - which I think
might be part of issues 6, 9, 24 and 27.
</dug>
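For the ExactlyOnce case, working out "just the gaps" from a final ack is straightforward once the ack ranges are accurate. A minimal sketch, assuming the final ack is available as a list of (low, high) pairs and the RMS knows the highest message number it sent; the function name gaps is hypothetical:

```python
def gaps(highest_sent, ack_ranges):
    """Message numbers in 1..highest_sent not covered by any acked range."""
    acked = set()
    for low, high in ack_ranges:
        acked.update(range(low, high + 1))
    return [n for n in range(1, highest_sent + 1) if n not in acked]


# Final ack of (1 - 3), (5), (8 - 9) after sending messages 1 - 10:
print(gaps(10, [(1, 3), (5, 5), (8, 9)]))   # [4, 6, 7, 10]

# Lei's example: final ack of (1 - 5) after sending 1 - 10:
print(gaps(10, [(1, 5)]))                   # [6, 7, 8, 9, 10]
```

The result is only as good as the ack it is computed from, which is why the accuracy of the final ack matters for any subsequent sequence that resends the gaps.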
Finally, I agree with you that considering a gap-filling mechanism would be a
good thing for this TC to do.
--Stefan