RE: [ws-rx] i0019 - a formal proposal

Doug,

I have some questions, more than comments. I’m having trouble imagining when one would use this mechanism given the issues this proposal addresses and I’d like to hear how you think about this. For example, i0019 says:

The RM Destination imperatively terminates a sequence due to one of these unrecoverable errors:

- wsrm:SequenceTerminated

- wsrm:MessageNumberRollover

- wsrm:LastMessageNumberExceeded

Then any pending non-acknowledged message will be lost for the sequence.

If an RMS receives either wsrm:MessageNumberRollover or wsrm:LastMessageNumberExceeded, it means it has a bug in its implementation of the protocol, no? We’re not trying to help it recover from that, are we? W.R.T. wsrm:SequenceTerminated, it seems what we’re trying to do is define a way for the receiver to gracefully end the sequence – but we’re calling it a non-fatal fault. I think it might be clearer if we say that faults are fatal and leave entire sequences in doubt, and then explicitly consider this mechanism as a way to allow destinations to initiate a graceful sequence termination.

i0028 says:

An RMS (or SA) may decide to stop using a sequence even though some messages were not received (not acked)….

Why would an RMS “decide” to stop using a sequence? Here is what I’ve heard so far:

a) Because it is going down for some reason (e.g. maintenance).

I don’t see this as a reason for ending an otherwise perfectly good sequence; if you are doing this you could certainly end the session as per the current spec – or if you have durability there should be no problem at all.

b) Because it implements a message expiration scheme and some of the messages have expired.

There certainly is an issue with the gaps left on the sequence as per the current spec, but the mechanism to deal with this can’t be to end the sequence since ordering guarantees could be lost (e.g. some of the messages expired but it has many more messages to send).

c) Because it has suffered some sort of partial state loss. For instance, an RMS multiplexing messages from several sources stored over a single sequence and one of those sources fails.

I see this as having the same problem as b. I see how Close/FinalAck enables the protocol to not doom the entire sequence (a bad thing, since many apps are using it). But ordering guarantees for those applications are lost.

Do you see this mechanism helping in other scenarios that I’m just not thinking about?

--Stefan

From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Monday, August 29, 2005 5:07 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

Additional comments inline.
All - any additional comments? I need to send out 'take 3' tomorrow.
thanks,
-Doug

Jacques Durand <JDurand@us.fujitsu.com>

08/26/2005 08:44 PM

To	"'Giovanni Boschi'" <gboschi@sonicsoftware.com>, Doug Davis/Raleigh/IBM@IBMUS, ws-rx@lists.oasis-open.org
cc
Subject	RE: [ws-rx] i0019 - a formal proposal - take 2

Giovanni:

I believe there is more in what you say below than what is needed to resolve i0019 and i0028.
I am commenting on some of your points below - but I believe they can be largely dissociated from the current issues at hand, and be treated separately.

Regards,
Jacques

From: Giovanni Boschi [mailto:gboschi@sonicsoftware.com]
Sent: Friday, August 26, 2005 9:01 AM
To: Jacques Durand; Doug Davis; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

I don't see the current draft as directly specifying that acks are "on receipt", although clearly an implementation could take that approach, and it's probably the more intuitive one - but, specifically I think the current draft allows an RMD to defer acking until the messages are "in order" i.e. not acking those messages that are still sitting "behind gaps".

<JD> this is a very important point to clarify. What you are in effect suggesting is an Ack on delivery (since once in order, they can be delivered). But the spec is clear: "Acknowledgement: The communication from the RM Destination to the RM Source indicating the successful receipt of a message." (And in the messaging model there is a clear distinction between Receipt and Delivery; see Fig 1.)

<dug> +1, current spec is for Ack on receipt not delivery </dug>

There is a specific benefit to the "ack when deliverable" (note deliverable, not delivered) approach for low-resource situations (I can elaborate if needed, let me know), so I would hesitate to assume that ack-on-receipt is the model used by all implementations at all times.

<JD> Join the club. I have been favoring "ack on delivery" from the start, but that seems to clash with the WS-RM model: the protocol would become involved in the RMD-AD delivery assurance. Please try to convince Chris...

<dug> I'm staying out of this one for now :-) </dug>

Now, of course, if my "ack when deliverable" approach is in use by the RMD, then the final ack will be accurate: all the messages that have been acked are safely deliverable after a close. It's the "ack immediately on receipt" approach that has that problem - but to be clear, I do not want the spec to impose an ack strategy, I think the freedom the spec gives the RMD on choosing an ack strategy is one of the coolest things in the current spec.
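
The difference between the two ack strategies discussed here can be illustrated with a small Python sketch. It is only an illustration: the function names are made up for this example, message numbers are assumed to start at 1, and acknowledgements are modeled as (lower, upper) tuples rather than real SequenceAcknowledgement elements.

```python
def to_ranges(nums):
    """Collapse message numbers into contiguous (lower, upper) ranges."""
    ranges = []
    for n in sorted(nums):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)   # extend the current range
        else:
            ranges.append((n, n))             # start a new range
    return ranges


def ack_on_receipt(received):
    """Ack everything that has arrived, gaps or not."""
    return to_ranges(received)


def ack_when_deliverable(received):
    """Ack only the contiguous prefix 1..k, i.e. what is not 'behind a gap'."""
    deliverable = []
    n = 1
    while n in received:
        deliverable.append(n)
        n += 1
    return to_ranges(deliverable)


received = {1, 2, 4, 5}                   # message 3 has not arrived
print(ack_on_receipt(received))           # includes 4 and 5
print(ack_when_deliverable(received))     # stops at the gap
```

With the second strategy, a final ack taken at close time lists only messages that can actually be handed to the application; with the first, it may also list messages stuck behind the gap.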

<JD> well... too much choice is not necessarily good here: an RMS must preferably know what Ack means to the other party. I prefer the spec to be clear one way or the other about this. I think it is now. Just not the way I prefer - because as drafted today, I maintain, it is OK to acknowledge a message that will never be delivered, and there is no provision for the RMS to know about this. But that is another issue.

<dug> I seem to recall the notion of having some sort of handshaking going on during the CreateSequence where the RMD would communicate the DA in-use back to the RMS. I suppose the spec could also communicate the ACK strategy as well, if there was more than one choice. Not really sure I'd want to offer more than one but it's something to think about. </dug>

I think the general way out of this may be the following: If the original use case was "I want to close the sequence and have an accurate final ack so I know which ones to resend in a different sequence later", then it seems to me that this is really only viable for sequences that do not have InOrder requirements: If I will send some of them in sequence S2 later there is no guarantee that they will be delivered in order with respect to the ones I sent in sequence S1 earlier, and I am going to break the InOrder requirement anyway.

<JD> I think if InOrder is required but not AtLeastOnce, that means we accept message loss - and therefore we would not have any qualms about not resending these in S2. Even if InOrder + AtLeastOnce is required, some gap may still be there when closing the sequence S1. But again, S2 can forget about the missing messages in S1: the DA is still satisfied if a delivery failure has been notified for the missing messages.

<dug> A while ago there was a discussion about how to handle the linking of sequences for cases where the MaxMsgNum was hit. While there wasn't a formal decision by the TC, quite a few people said that that notion was something that should be done at a higher level. I believe the notion of how to recover a sequence when it is closed prematurely fits into that category as well. So, while I can see your point about there still being an issue of how to safely do some recovery, I think it's another issue. This current proposal simply focuses on how the RMS can get an accurate accounting of the 'current' sequence when it is closed down early. What it does with that info - if anything at all - is something else. And personally, while I do agree there is some higher-level processing that can/should take place in some situations, I do think the RM protocol could help make that processing easier - but as I said, that's another issue. </dug>

The RMS knows from the final ack which messages the RMD "has"; if it knew the RMD<->AD DA, then it would know what to do:
- If the DA is InOrder, it knows that it cannot close and then restart a new sequence at all without violating the underlying ordering requirements
- If the DA is not InOrder then it can close and restart a new sequence later, and if so it should resend all messages not in the final ack.
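
The two cases above could be sketched roughly as follows in Python. This is a hypothetical helper, not anything defined by the spec or the proposal: the name plan_after_close, the boolean rmd_is_in_order (which assumes the RMS somehow knows the DA), and the (lower, upper) tuple form of the final ack are all assumptions made for illustration.

```python
def plan_after_close(sent, final_ack_ranges, rmd_is_in_order):
    """Decide what the RMS might do after receiving the final ack.

    sent             -- iterable of message numbers the RMS sent
    final_ack_ranges -- list of (lower, upper) ranges from the final ack
    rmd_is_in_order  -- assumed knowledge of the RMD<->AD DA
    """
    acked = set()
    for lower, upper in final_ack_ranges:
        acked.update(range(lower, upper + 1))
    unacked = sorted(set(sent) - acked)

    if rmd_is_in_order:
        # Restarting in a new sequence would break ordering relative
        # to what was already sent, so nothing is resent here.
        return {"restart": False, "resend": []}
    # No ordering requirement: resend whatever the final ack omitted.
    return {"restart": True, "resend": unacked}


plan = plan_after_close(range(1, 6), [(1, 2), (4, 5)], rmd_is_in_order=False)
print(plan)
```

The sketch only makes the dependency visible: without knowing the DA, the RMS cannot choose between the two branches.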

<JD> These behaviors are somehow out of scope of the spec: there is no requirement on dealing with missing messages across sequences. That is an optimization that can indeed rely on out-of-band knowledge of the DA.

<dug> yup - currently out of scope or that 'higher level' thing I mentioned </dug>

But, the RMS does not know the RMD-AD DA; I guess we could propose that the target endpoint publish its DA in its policy (or createSequence, whatever), and I personally think it would be a good thing even for unrelated reasons - but I suspect there could be a lot of opposition - You have to go back to a 2002 version of the member submission to find DA in the policy, and I think this was removed very much intentionally. But maybe we could propose it and see?

A minor point on wording: I think rather than "MUST not accept" we should say "MUST not deliver to the AD" as in the original text below - "accepting" is not something that we define anywhere and it could be misconstrued. Not delivering is what matters.

<JD> Right for the loose terminology. But again, you are opening a can of worms: we do NOT want these messages to be acknowledged (not just "not delivered") as soon as the closing is effective. Maybe this would do: "...RM Destination MUST NOT acknowledge nor deliver any received messages with a Sequence header for the specified sequence, other than those already received at the time the <wsrm:Close> element is processed by the RMD."
-Jacques

<dug> Well, it can still deliver old messages to the AD it just can't ACK new ones. For example, if msg 3 out of 5 is missing and a Close() comes in, the RMD can still deliver 1 and 2 to the AD (if it hasn't done so already). It just can't deliver 4 and 5. I think 'accept' is the right choice.</dug>
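
The 3-out-of-5 example can be worked through in a few lines of Python. This is a sketch under simple assumptions (InOrder delivery, message numbers starting at 1, a set of received numbers); the function name is invented for the example.

```python
def deliverable_in_order(received, already_delivered=0):
    """Message numbers an InOrder RMD could still hand to the AD.

    Delivery continues from the last delivered number and stops at
    the first gap.
    """
    out = []
    n = already_delivered + 1
    while n in received:
        out.append(n)
        n += 1
    return out


received_at_close = {1, 2, 4, 5}          # message 3 never arrived
can_deliver = deliverable_in_order(received_at_close)
held_back = sorted(received_at_close - set(can_deliver))

print(can_deliver)   # messages before the gap
print(held_back)     # messages stuck behind the gap
```

Here messages 1 and 2 remain deliverable after the Close, while 4 and 5 are held back because message 3 can no longer arrive.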

G.

From: Jacques Durand [mailto:JDurand@us.fujitsu.com]
Sent: Thursday, August 25, 2005 9:36 PM
To: 'Doug Davis'; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

Inline <JD>

From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 5:59 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

When InOrder DA is used the RMS knows that all messages after the first gap were not delivered to the RMD's application - even if they were ACKed.
<JD> InOrder DA in itself does allow delivery of non-contiguous messages ("...it says nothing about duplications or omission..." Section 2, Core spec)
So, getting an ACK+Final guarantees to the RMS which messages were not just ACKed but delivered - and any messages after the first gap can be recovered (e.g. resent in a new sequence if it wants) without fear of them being processed twice by the RMD's app.
Actually, thinking about it more, perhaps some of the text should remain, like:
When a Sequence is closed and there are messages at the RM Destination
that are waiting for lower-numbered messages to arrive (such as the
case when InOrder delivery is being enforced) before they can be
processed by the RM Destination's application, the RM Destination
MUST NOT deliver those messages.
Just to ensure that the RMD does not interpret the Close() as the trigger to let all messages after the gap thru to the app.
thanks,
<JD> but again, because the semantics of Ack is just "on receipt" and not "on delivery", an honest RMD developer may decide to Ack these late messages, rendering the final Ack incorrect (or unstable, depending when it is requested...). Another way to avoid adding this text is to make the statement below more general, not limited to "new application messages":
"...can send a <wsrm:Close> element, in the body of a message, to the RM Destination to indicate that RM Destination MUST NOT accept any new application messages for the specified sequence."
Replace with:
"...can send a <wsrm:Close> element, in the body of a message, to the RM Destination to indicate that RM Destination MUST NOT accept any application messages for the specified sequence, other than those already received at the time the <wsrm:Close> element is interpreted by the RMD."
-jacques
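
One way to read the replacement wording is sketched below in Python. It is an interpretation for illustration only: the class and method names are hypothetical, and returning the string "SequenceClosed" merely stands in for generating the fault of that name.

```python
class SequenceState:
    """Toy model of an RMD's view of one sequence."""

    def __init__(self):
        self.received = set()
        self.closed = False

    def on_message(self, number):
        """Handle an application message carrying a Sequence header."""
        if self.closed and number not in self.received:
            # Arrived after the Close was processed: not accepted.
            return "SequenceClosed"
        self.received.add(number)
        return "accepted"

    def close(self):
        """Process <wsrm:Close>; the result is the basis for the final ack."""
        self.closed = True
        return sorted(self.received)


seq = SequenceState()
for n in (1, 2, 4):
    seq.on_message(n)
final = seq.close()
print(final)                 # what was received before the close
print(seq.on_message(3))     # late arrival is refused
print(seq.on_message(2))     # retransmission of an already-received message
```

Under this reading the set reported at close time cannot change afterwards, which is what keeps the final ack stable.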

-Doug

"Giovanni Boschi" <gboschi@sonicsoftware.com>

08/25/2005 08:48 PM

To	Doug Davis/Raleigh/IBM@IBMUS, "Jacques Durand" <JDurand@us.fujitsu.com>
cc	<ws-rx@lists.oasis-open.org>
Subject	RE: [ws-rx] i0019 - a formal proposal - take 2

If the RMD has already acked the out-of-order messages (and the spec at this point doesn't say it can't or shouldn't), and we then preclude the RMD from delivering them, then the final Ack is not accurate, which I thought was the original goal. Even if we leave it undefined, the RMD may choose not to deliver them, and the problem remains.

G.

From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 7:23 PM
To: Jacques Durand
Cc: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/25/2005 02:10:04 PM:

> When a Sequence is closed and there are messages at the RM Destination
> that are waiting for lower-numbered messages to arrive (such as the
> case when InOrder delivery is being enforced) before they can be
> processed by the RM Destination's application, the RM Destination
> MUST NOT deliver those messages and a SequenceClosed fault MUST
> be generated for each one.
> <JD> it is important to also say that it should not acknowledge them either.

If we change it so that it says nothing about those messages instead,
as Anish and Chris are suggesting, would that be ok with you?
So, basically, the semantics of undelivered messages would be undefined by
removing the above paragraph.
thanks,
-Doug
