ws-rx message
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
- From: Doug Davis <dug@us.ibm.com>
- To: ws-rx@lists.oasis-open.org
- Date: Tue, 30 Aug 2005 10:56:24 -0400
Stefan,
comments inline
thanks
-Doug
"Stefan Batres"
<stefanba@microsoft.com>
Sent: 08/29/2005 09:07 PM
To: Doug Davis/Raleigh/IBM@IBMUS, <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Doug,
I have some questions, more
than comments. I’m having trouble imagining when one would use this mechanism
given the issues this proposal addresses and I’d like to hear how you
think about this. For example, i0019 says:
The RM Destination imperatively
terminates a sequence due to one of these unrecoverable errors:
- wsrm:SequenceTerminated
- wsrm:MessageNumberRollover
- wsrm:LastMessageNumberExceeded
Then any pending non-acknowledged
message will be lost for the sequence.
If an RMS receives either wsrm:MessageNumberRollover
or wsrm:LastMessageNumberExceeded it means it has a bug in its implementation
of the protocol, no?
<dug> Not necessarily. LastMessageNumberExceeded
could be as a result of the RMS not knowing the RMD doesn't support the
max unsigned long </dug>
We’re not trying to
help it recover from that are we?
<dug> Maybe, maybe not. When
or why an RMS uses CloseSequence is up to it to decide. All we know
is that it wants to shut things down and get an accurate ACK from the RMD.</dug>
W.R.T. wsrm:SequenceTerminated,
it seems what we’re trying to do is define a way for the receiver to gracefully
end the sequence – but we’re calling it a non-fatal fault. I think it
might be clearer if we say that faults are fatal and leave entire sequences
in doubt; and add to that consideration of this mechanism explicitly as
a way to allow the destinations to initiate a graceful sequence termination.
<dug> I didn't change SequenceTerminated
Fault to be non-fatal - so getting/generating that Fault does still end
the sequence </dug>
i0028 says:
An RMS (or SA) may decide to
stop using a sequence even though some messages were not received (not
acked)….
Why would an RMS “decide”
to stop using a sequence? Here is what I’ve heard so far:
<dug> w/o saying that I agree with some of these reasons I'll try
to answer each one... </dug>
a) Because it is going down for some reason (e.g. maintenance).
I don’t see this as a reason
for ending an otherwise perfectly good sequence; if you are doing this
you could certainly end the session as per the current spec – or if you
have durability there should be no problem at all.
<dug> if the RMS is going down and can't wait for all of the sequence
to be ACKd then it must get an accurate accounting of the sequence before
it shuts down. W/o CloseSequence() how can it do that? </dug>
b) Because it implements a message expiration scheme and some of the messages have
expired.
There certainly is an issue
with the gaps left on the sequence as per the current spec, but the mechanism
to deal with this can’t be to end the sequence since ordering guarantees
could be lost (e.g. some of the messages expired but it has many more messages
to send).
<dug> here you're talking about
the notion of filling-in gaps in a sequence. Which may be a good
thing for the TC to examine but I don't see it as being the same issue.
However, w/o a gap-filling-solution, if one message never gets ACKd
and your DA is InOrder+ExactlyOnce, you're screwed. :-) Unless
you use CloseSequence() to get an accurate state of the Seq. W/o
CloseSequence() if you just send a TerminateSequence() there still could
be a message or ack floating around the network that could impact the true
state of the sequence. Guess it depends on this 'message expiration'
thingy. </dug>
c) Because it has suffered some sort of partial state loss. For instance, an RMS multiplexing
messages from several sources stored over a single sequence and one of
those sources fails.
I see this as having the same
problem as b. I see how Close/FinalAck enables the protocol to not doom
the entire sequence (bad thing since many apps are using it). But ordering
guarantees for those applications are lost.
<dug> To be honest, I dunno. I don't know what it means for
one of these sources to fail since in my head the message is already in
the RM logic and implies any failure within the AS will not impact RM's
job </dug>
Do you see this mechanism helping
in other scenarios that I’m just not thinking about?
<dug> The case that I keep thinking
about is one where the RMD is actually a cluster of machines and when a
sequence gets created it has an affinity to a certain server in the cluster
- meaning it processes all of the messages for that sequence. If
that server starts to have problems, and for some reason it just can't
seem to process any new app messages then the RMS can close down the sequence
and start up a new one. Hopefully, the new sequence will be directed to
a different server in the cluster. But even w/o the notion of the
RMS trying to do some kind of recovery thru the use of a 2nd sequence (which
might be controversial to some people), I still believe it is valuable
for the RMS to be able to reliably obtain the state of an incomplete Sequence
before a TerminateSequence is sent </dug>
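As a rough illustration of the "accurate accounting" discussed above, here is a minimal Python sketch of what an RMS might do with the final acknowledgement once a sequence has been closed. The function name and the (lower, upper) range representation are assumptions made for this example; nothing here is defined by the proposal itself.

```python
from typing import List, Tuple

AckRange = Tuple[int, int]  # inclusive (lower, upper) message numbers (assumed shape)


def unacked_messages(final_ack: List[AckRange], highest_sent: int) -> List[int]:
    """Return the message numbers in 1..highest_sent not covered by the final ack."""
    acked = set()
    for lower, upper in final_ack:
        acked.update(range(lower, upper + 1))
    return [n for n in range(1, highest_sent + 1) if n not in acked]


# The RMS sent messages 1..5; the final ack covers 1-2 and 4-5.
print(unacked_messages([(1, 2), (4, 5)], 5))  # [3]
```

What the RMS then does with that list (resend in a new sequence, report a failure, nothing at all) is left open, as in the discussion above.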
--Stefan
From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Monday, August 29, 2005 5:07 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Additional comments inline.
All - any additional comments? I need to send out 'take 3' tomorrow.
thanks,
-Doug
Jacques Durand <JDurand@us.fujitsu.com>
Sent: 08/26/2005 08:44 PM
To: "'Giovanni Boschi'" <gboschi@sonicsoftware.com>, Doug Davis/Raleigh/IBM@IBMUS, ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Giovanni:
I believe there is more in what you say below than what is needed to resolve
i0019 and i0028.
I am commenting on some of your points below - but I believe they
can be largely dissociated from the current issues at hand, and be
treated separately.
Regards,
Jacques
From: Giovanni Boschi [mailto:gboschi@sonicsoftware.com]
Sent: Friday, August 26, 2005 9:01 AM
To: Jacques Durand; Doug Davis; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
I don't see the current draft as directly specifying that acks are "on
receipt", although clearly an implementation could take that approach,
and it's probably the more intuitive one - but, specifically I think the
current draft allows an RMD to defer acking until the messages are "in
order" i.e. not acking those messages that are still sitting "behind
gaps".
<JD> this is a very important point to clarify. What you are in effect
suggesting, is an Ack on delivery (since once in order, they can be delivered.).
But the spec is clear: Acknowledgement:
The communication from the RM Destination to the RM Source indicating the
successful receipt of a message. (and
in the messaging model, there is a clear distinction between Receipt and
Delivery, see Fig 1)
<dug> +1, current spec is for Ack on receipt not delivery </dug>
There is a specific benefit to the "ack when deliverable" (note
deliverable, not delivered) approach for low-resource situations (I can
elaborate if needed, let me know), so I would hesitate to assume that ack-on-receipt
is the model used by all implementations at all times.
<JD> Join the club. I have been favoring "ack on delivery"
from the start, but that seems to clash with the WS-RM model: the protocol
would become involved in the RMD-AD delivery assurance. Please try to convince
Chris...
<dug> I'm staying out of this one for now :-) </dug>
Now, of course, if my "ack when deliverable" approach is in use
by the RMD, then the final ack will be accurate: all the messages
that have been acked are safely deliverable after a close. It's the
"ack immediately on receipt" approach that has that problem -
but to be clear, I do not want the spec to impose an ack strategy, I think
the freedom the spec gives the RMD on choosing an ack strategy is one of
the coolest things in the current spec.
<JD> well... too much choice is not necessarily good here: an RMS
must preferably know what Ack means to the other party. I prefer the spec
to be clear one way or the other about this. I think it is now. Just not
the way I prefer - because as drafted today, I maintain, it is OK to acknowledge
a message that will never be delivered, and there is no provision for the
RMS to know about this. But that is another issue.
<dug> I seem to recall the notion of having some sort of handshaking
going on during the CreateSequence where the RMD would communicate the
DA in-use back to the RMS. I suppose the spec could also communicate
the ACK strategy as well, if there was more than one choice. Not
really sure I'd want to offer more than one but it's something to think
about </dug>
I think the general way out of this may be the following: If the
original use case was "I want to close the sequence and have an accurate
final ack so I know which ones to resend in a different sequence later",
then it seems to me that this is really only viable for sequences that
do not have InOrder requirements: If I will send some of them in
sequence S2 later there is no guarantee that they will be delivered in
order with respect to the ones I sent in sequence S1 earlier, and I am
going to break the InOrder requirement anyway.
<JD> I think if InOrder is required but not AtLeastOnce, that means
we accept message loss - and therefore we would not have any qualms not
resending these in S2. Even if InOrder+ AtLeastOnce is required, some gap
may still be there when closing the sequence S1. But again, S2 can forget
about the missing messages in S1: the DA is still satisfied if a delivery
failure has been notified for the missing,
<dug> A while ago there was a discussion about how to handle the
linking of sequences for cases where the MaxMsgNum was hit. While
there wasn't a formal decision by the TC, quite a few people said that
that notion was something that should be done at a higher level. I
believe the notion of how to recover a sequence when it is closed prematurely
fits into that category as well. So, while I can see your point about
there still being an issue of how to safely do some recovery, I think it's
another issue. This current proposal simply focuses on how the RMS
can get an accurate accounting of the 'current' sequence when it is closed
down early. What it does with that info - if anything at all - is
something else. And personally, while I do agree there is some higher-level
processing that can/should take place in some situations, I do think the
RM protocol could help make that processing easier - but as I said, that's
another issue. </dug>
The RMS knows from the final ack which messages the RMD "has";
if it knew the RMD<->AD DA, then it would know what to do:
- If the DA is InOrder, it knows that it cannot close and then restart a new sequence
at all without violating the underlying ordering requirements
- If the DA is not InOrder then it can close and restart a new sequence later, and
if so it should resend all messages not in the final ack.
<JD> These behaviors are somehow out of scope of the spec: there
is no requirement on dealing with missing messages across sequences. That
is an optimization that can indeed rely on out-of-band knowledge of the
DA.
<dug> yup - currently out of scope or that 'higher level' thing I mentioned
</dug>
But, the RMS does not know the RMD-AD DA; I guess we could propose that
the target endpoint publish its DA in its policy (or createSequence, whatever),
and I personally think it would be a good thing even for unrelated reasons
- but I suspect there could be a lot of opposition - You have to go back
to a 2002 version of the member submission to find DA in the policy, and
I think this was removed very much intentionally. But maybe we could
propose it and see?
A minor point on wording: I think rather than "MUST not accept"
we should say "MUST not deliver to the AD" as in the original
text below - "accepting" is not something that we define anywhere
and it could be misconstrued. Not delivering is what matters.
<JD> Right for the loose terminology. But again, you are opening
a can of worms: we do NOT want these messages to be acknowledged (not just
"not delivered") as soon as the closing is effective. Maybe this
would do: "...RM Destination
MUST NOT acknowledge nor deliver any received messages with a Sequence
header for the specified sequence, other than those already received at
the time the <wsrm:Close> element is processed by the RMD"
-Jacques
<dug> Well, it can still deliver old messages to the AD it just can't
ACK new ones. For example, if msg 3 out of 5 is missing and a Close() comes
in, the RMD can still deliver 1 and 2 to the AD (if it hasn't done so already).
It just can't deliver 4 and 5. I think 'accept' is the right
choice.</dug>
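The "msg 3 out of 5 is missing" example can be sketched as follows. This is only an illustration under the assumption of an InOrder DA and message numbers starting at 1; the function name is made up for the example.

```python
from typing import Iterable, List, Tuple


def split_on_close(received: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Split received message numbers into (deliverable, held) for InOrder delivery.

    deliverable: the contiguous run starting at message 1.
    held: messages that are stuck behind the first gap.
    """
    got = set(received)
    deliverable = []
    n = 1
    while n in got:
        deliverable.append(n)
        n += 1
    held = sorted(m for m in got if m >= n)
    return deliverable, held


# Messages 1, 2, 4 and 5 arrived; 3 is missing when the Close comes in.
print(split_on_close([1, 2, 4, 5]))  # ([1, 2], [4, 5])
```

Messages 1 and 2 can still go to the AD; 4 and 5 stay behind the gap.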
G.
From: Jacques Durand [mailto:JDurand@us.fujitsu.com]
Sent: Thursday, August 25, 2005 9:36 PM
To: 'Doug Davis'; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Inline <JD>
From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 5:59 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
When InOrder DA is used the RMS knows that all messages after the first
gap were not delivered to the RMD's application - even if they were ACKed.
<JD> InOrder DA in itself does allow delivery of non-contiguous messages
( "...it says nothing about duplications or omission..." Section
2, Core spec)
So, getting an ACK+Final guarantees to the RMS which messages were not
just ACKed but delivered - and any messages after the first gap can be
recovered (e.g. resent in a new sequence if it wants) without fear of them
being processed twice by the RMD's app.
Actually, thinking about it more, perhaps some of the text should remain,
like:
When a Sequence is closed and there are messages at the RM Destination
that are waiting for lower-numbered messages to arrive (such as the
case when InOrder delivery is being enforced) before they can be
processed by the RM Destination's application, the RM Destination
MUST NOT deliver those messages.
Just to ensure that the RMD does not interpret the Close() as the
trigger to let all messages after the gap thru to the app.
thanks,
<JD> but again, because the semantics of Ack is just "on receipt"
and not "on delivery", an honest RMD developer may decide to
Ack these late messages, rendering the final Ack incorrect (or unstable,
depending when it is requested...). Another way to avoid adding this text
is to make the statement below more general, not limited to "new application
messages":
"...can send a <wsrm:Close>
element, in the body of a message, to the RM Destination to indicate that
RM Destination MUST NOT accept any new application messages for the specified
sequence."
Replace with:
"...can send a <wsrm:Close>
element, in the body of a message, to the RM Destination to indicate that
RM Destination MUST NOT accept any application messages for the specified
sequence, other than those already received at the time the <wsrm:Close>
element is interpreted by the RMD."
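One possible reading of the replacement wording, as a toy Python sketch. The class and method names are hypothetical, and treating a retransmission of an already-received message as harmless after the close is an assumption of this example, not something the proposed text states.

```python
from typing import List, Set


class ClosableSequence:
    """Toy RM Destination state for one sequence (illustrative names only)."""

    def __init__(self) -> None:
        self.received: Set[int] = set()
        self.closed = False

    def on_message(self, number: int) -> bool:
        """Return True if the message is accepted, False if it is refused."""
        if self.closed and number not in self.received:
            return False  # first arrival after Close: not accepted, so not acked
        self.received.add(number)
        return True

    def on_close(self) -> List[int]:
        """Mark the sequence closed and return the numbers for the final ack."""
        self.closed = True
        return sorted(self.received)


seq = ClosableSequence()
for n in (1, 2, 4):
    seq.on_message(n)
final_ack = seq.on_close()
print(final_ack, seq.on_message(5))  # [1, 2, 4] False
```

Because nothing new is accepted after the close, the set behind the final ack cannot change once it has been reported.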
-jacques
-Doug
"Giovanni Boschi"
<gboschi@sonicsoftware.com>
Sent: 08/25/2005 08:48 PM
To: Doug Davis/Raleigh/IBM@IBMUS, "Jacques Durand" <JDurand@us.fujitsu.com>
Cc: <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
If the RMD has already acked the out-of-order messages (and the spec at
this point doesn't say it can't or shouldn't), and we then preclude the
RMD from delivering them, then the final Ack is not accurate, which I thought
was the original goal. Even if we leave it undefined, the RMD may
choose not to deliver them, and the problem remains.
G.
From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 7:23 PM
To: Jacques Durand
Cc: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2
Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/25/2005 02:10:04 PM:
> When a Sequence is closed and there are messages at the RM Destination
> that are waiting for lower-numbered messages to arrive (such as the
> case when InOrder delivery is being enforced) before they can be
> processed by the RM Destination's application, the RM Destination
> MUST NOT deliver those messages and a SequenceClosed fault MUST
> be generated for each one.
> <JD> it is important to also say that it should not acknowledge them either.
If we change it so that it says nothing about those messages instead,
as Anish and Chris are suggesting, would that be ok with you?
So, basically, the semantics of undelivered messages would be undefined by
removing the above paragraph.
thanks,
-Doug