OASIS Mailing List Archives

 



ws-rx message



Subject: RE: [ws-rx] i0019 - a formal proposal - take 2



Stefan,
 comments inline
thanks
-Doug



"Stefan Batres" <stefanba@microsoft.com>

08/29/2005 09:07 PM
To: Doug Davis/Raleigh/IBM@IBMUS, <ws-rx@lists.oasis-open.org>
cc:
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2





Doug,
 
I have some questions, more than comments. I’m having trouble imagining when one would use this mechanism given the issues this proposal addresses and I’d like to hear how you think about this. For example, i0019 says:
 
The RM Destination imperatively terminates a sequence due to one of these unrecoverable errors:
- wsrm:SequenceTerminated
- wsrm:MessageNumberRollover
- wsrm:LastMessageNumberExceeded
Then any pending non-acknowledged message will be lost for the sequence.
 
If an RMS receives either wsrm:MessageNumberRollover or wsrm:LastMessageNumberExceeded, it means it has a bug in its implementation of the protocol, no?
<dug> Not necessarily.  LastMessageNumberExceeded could be as a result of the RMS not knowing the RMD doesn't support the max unsigned long </dug>
 We’re not trying to help it recover from that are we?
<dug> Maybe, maybe not.  When or why an RMS uses CloseSequence is up to it to decide.  All we know is that it wants to shut things down and get an accurate ACK from the RMD.</dug>
W.R.T. wsrm:SequenceTerminated, it seems what we’re trying to do is define a way for the receiver to gracefully end the sequence – but we’re calling it a non-fatal fault. I think it might be clearer if we say that faults are fatal and leave entire sequences in doubt; and add to that consideration of this mechanism explicitly as a way to allow the destinations to initiate a graceful sequence termination.
<dug> I didn't change SequenceTerminated Fault to be non-fatal - so getting/generating that Fault does still end the sequence </dug>
 
i0028 says:
An RMS (or SA) may decide to stop using a sequence even though some messages were not received (not acked)….
 
Why would an RMS “decide” to stop using a sequence? Here is what I’ve heard so far:
<dug> w/o saying that I agree with some of these reasons I'll try to answer each one... </dug>

a)       Because it is going down for some reason (e.g. maintenance).
I don’t see this as a reason for ending an otherwise perfectly good sequence; if you are doing this you could certainly end the session as per the current spec – or if you have durability there should be no problem at all.
<dug> if the RMS is going down and can't wait for all of the sequence to be ACKd then it must get an accurate accounting of the sequence before it shuts down. W/o CloseSequence() how can it do that? </dug>

b)      Because it implements a message expiration scheme and some of the messages have expired.
There certainly is an issue with the gaps left on the sequence as per the current spec, but the mechanism to deal with this can’t be to end the sequence since ordering guarantees could be lost (e.g. some of the messages expired but it has many more messages to send).
<dug> here you're talking about the notion of filling-in gaps in a sequence.  Which may be a good thing for the TC to examine but I don't see it as being the same issue.  However, w/o a gap-filling-solution, if one message never gets ACKd and your DA is InOrder+ExactlyOnce, you're screwed.  :-) Unless you use CloseSequence() to get an accurate state of the Seq.  W/o CloseSequence() if you just send a TerminateSequence() there still could be a message or ack floating around the network that could impact the true state of the sequence.  Guess it depends on this 'message expiration' thingy. </dug>

c)       Because it has suffered some sort of partial state loss. For instance, an RMS multiplexing messages from several sources stored over a single sequence and one of those sources fails.
I see this as having the same problem as b. I see how Close/FinalAck enables the protocol to not doom the entire sequence (bad thing since many apps are using it). But ordering guarantees for those applications are lost.
<dug> To be honest, I dunno.  I don't know what it means for one of these sources to fail since in my head the message is already in the RM logic and implies any failure within the AS will not impact RM's job </dug>
 
Do you see this mechanism helping in other scenarios that I’m just not thinking about?
<dug> The case that I keep thinking about is one where the RMD is actually a cluster of machines and when a sequence gets created it has an affinity to a certain server in the cluster - meaning it processes all of the messages for that sequence.  If that server starts to have problems, and for some reason it just can't seem to process any new app messages then the RMS can close down the sequence and start up a new one. Hopefully, the new sequence will be directed to a different server in the cluster.  But even w/o the notion of the RMS trying to do some kind of recovery thru the use of a 2nd sequence (which might be controversial to some people), I still believe it is valuable for the RMS to be able to reliably obtain the state of an incomplete Sequence before a TerminateSequence is sent </dug>
 
--Stefan
 
 



From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Monday, August 29, 2005 5:07 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

 

Additional comments inline.

All - any additional comments?  I need to send out 'take 3' tomorrow.

thanks,

-Doug

Jacques Durand <JDurand@us.fujitsu.com>

08/26/2005 08:44 PM
To: "'Giovanni Boschi'" <gboschi@sonicsoftware.com>, Doug Davis/Raleigh/IBM@IBMUS, ws-rx@lists.oasis-open.org
cc:
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

 


   





 
Giovanni:
 
I believe there is more in what you say below than what is needed to resolve i0019 and i0028.

I am commenting on some of your points below - but I believe they can be largely dissociated from the current issues at hand, and be treated separately.

 
Regards,

Jacques


 



From: Giovanni Boschi [mailto:gboschi@sonicsoftware.com]
Sent: Friday, August 26, 2005 9:01 AM
To: Jacques Durand; Doug Davis; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

 

I don't see the current draft as directly specifying that acks are "on receipt", although clearly an implementation could take that approach, and it's probably the more intuitive one - but, specifically I think the current draft allows an RMD to defer acking until the messages are "in order" i.e. not acking those messages that are still sitting "behind gaps".
 
<JD> this is a very important point to clarify. What you are in effect suggesting, is an Ack on delivery (since once in order, they can be delivered.). But the spec is clear:
Acknowledgement: The communication from the RM Destination to the RM Source indicating the successful receipt of a message. (and in the messaging model, there is a clear distinction between Receipt and Delivery, see Fig 1)

<dug> +1, current spec is for Ack on receipt not delivery </dug>

 
 
There is a specific benefit to the "ack when deliverable" (note deliverable, not delivered) approach for low-resource situations (I can elaborate if needed, let me know), so I would hesitate to assume that ack-on-receipt is the model used by all implementations at all times.  

 
<JD> Join the club. I have been favoring "ack on delivery" from the start, but that seems to clash with the WS-RM model: the protocol would become involved in the RMD-AD delivery assurance. Please try to convince Chris...


<dug> I'm staying out of this one for now  :-)  </dug>

 
Now, of course, if my "ack when deliverable" approach is in use by the RMD, then the final ack will be accurate:  all the messages that have been acked are safely deliverable after a close.  It's the "ack immediately on receipt" approach that has that problem - but to be clear, I do not want the spec to impose an ack strategy, I think the freedom the spec gives the RMD on choosing an ack strategy is one of the coolest things in the current spec.
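
To make the two ack strategies discussed here concrete, the following is a minimal Python sketch, not anything defined by the spec. The function names `ack_on_receipt` and `ack_when_deliverable` are hypothetical, and the sketch assumes message numbers start at 1 and that acknowledgements are reported as inclusive (lower, upper) ranges.

```python
def to_ranges(numbers):
    """Collapse message numbers into inclusive (lower, upper) ranges."""
    ranges = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            # Extend the current range by one message.
            ranges[-1] = (ranges[-1][0], n)
        else:
            # A gap was found, so start a new range.
            ranges.append((n, n))
    return ranges


def ack_on_receipt(received):
    """Hypothetical strategy: acknowledge every message that has arrived."""
    return to_ranges(received)


def ack_when_deliverable(received):
    """Hypothetical strategy: acknowledge only the gap-free prefix from 1."""
    received = set(received)
    upper = 0
    while upper + 1 in received:
        upper += 1
    return [(1, upper)] if upper else []


# Messages 1, 2, 4 and 5 have arrived; message 3 is missing.
print(ack_on_receipt([1, 2, 4, 5]))        # [(1, 2), (4, 5)]
print(ack_when_deliverable([1, 2, 4, 5]))  # [(1, 2)]
```

With message 3 missing, the first strategy reports 4 and 5 as acknowledged even though an InOrder destination could not deliver them, while the second reports only what is deliverable.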

 
<JD> well... too much choice is not necessarily good here: an RMS must preferably know what Ack means to the other party. I prefer the spec to be clear one way or the other about this. I think it is now. Just not the way I prefer - because as drafted today, I maintain, it is OK to acknowledge a message that will never be delivered, and there is no provision for the RMS to know about this. But that is another issue.


<dug> I seem to recall the notion of having some sort of handshaking going on during the CreateSequence where the RMD would communicate the DA in-use back to the RMS.  I suppose the spec could also communicate the ACK strategy as well, if there was more than one choice.  Not really sure I'd want to offer more than one but it's something to think about  </dug>

 
I think the general way out of this may be the following:  If the original use case was "I want to close the sequence and have an accurate final ack so I know which ones to resend in a different sequence later", then it seems to me that this is really only viable for sequences that do not have InOrder requirements:  If I will send some of them in sequence S2 later there is no guarantee that they will be delivered in order with respect to the ones I sent in sequence S1 earlier, and I am going to break the InOrder requirement anyway.

 
<JD> I think if InOrder is required but not AtLeastOnce, that means we accept message loss - and therefore we would not have any qualms about not resending these in S2. Even if InOrder+AtLeastOnce is required, some gap may still be there when closing the sequence S1. But again, S2 can forget about the missing messages in S1: the DA is still satisfied if a delivery failure has been notified for the missing messages.


<dug> A while ago there was a discussion about how to handle the linking of sequences for cases where the MaxMsgNum was hit.  While there wasn't a formal decision by the TC, quite a few people said that that notion was something that should be done at a higher level.  I believe the notion of how to recover a sequence when it is closed prematurely fits into that category as well.  So, while I can see your point about there still being an issue of how to safely do some recovery, I think it's another issue.  This current proposal simply focuses on how the RMS can get an accurate accounting of the 'current' sequence when it is closed down early.  What it does with that info - if anything at all - is something else.  And personally, while I do agree there is some higher-level processing that can/should take place in some situations, I do think the RM protocol could help make that processing easier - but as I said, that's another issue. </dug>

 
The RMS knows from the final ack which messages the RMD "has"; if it knew the RMD<->AD DA, then it would know what to do:

- If the DA is InOrder, it knows that it cannot close and then restart a new sequence at all without violating the underlying ordering requirements
- If the DA is not InOrder then it can close and restart a new sequence later, and if so it should resend all messages not in the final ack.
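
The two cases above can be sketched in Python as follows. This is only an illustration under stated assumptions: the RMS is assumed to know the DA out of band (as noted in this thread, the protocol does not tell it), the final ack is assumed to arrive as inclusive (lower, upper) ranges, and the names `unacked` and `plan_after_close` are hypothetical.

```python
def unacked(highest_sent, final_ack_ranges):
    """Return message numbers 1..highest_sent not covered by the final ack."""
    acked = set()
    for lower, upper in final_ack_ranges:
        acked.update(range(lower, upper + 1))
    return [n for n in range(1, highest_sent + 1) if n not in acked]


def plan_after_close(highest_sent, final_ack_ranges, da_in_order):
    """Return (may_restart, messages_to_resend) for the closed sequence.

    da_in_order is an assumption: the RMS knows the RMD<->AD DA out of band.
    """
    if da_in_order:
        # Restarting in a new sequence would break the ordering guarantee.
        return (False, [])
    # Without InOrder, everything missing from the final ack can be resent.
    return (True, unacked(highest_sent, final_ack_ranges))


# Five messages were sent; the final ack covers 1-2 and 4-5.
print(plan_after_close(5, [(1, 2), (4, 5)], da_in_order=False))  # (True, [3])
print(plan_after_close(5, [(1, 2), (4, 5)], da_in_order=True))   # (False, [])
```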
 
<JD> These behaviors are somehow out of scope of the spec: there is no requirement on dealing with missing messages across sequences.  That is an optimization that can indeed rely on out-of-band knowledge of the DA.


<dug> yup - currently out of scope or that 'higher level' thing I mentioned </dug>

 
But, the RMS does not know the RMD-AD DA; I guess we could propose that the target endpoint publish its DA in its policy (or createSequence, whatever), and I personally think it would be a good thing even for unrelated reasons - but I suspect there could be a lot of opposition - You have to go back to a 2002 version of the member submission to find DA in the policy, and I think this was removed very much intentionally.  But maybe we could propose it and see?

 
A minor point on wording:  I think rather than "MUST not accept" we should say "MUST not deliver to the AD" as in the original text below - "accepting" is not something that we define anywhere and it could be misconstrued.  Not delivering is what matters.

 
<JD> Right for the loose terminology. But again, you are opening a can of worms: we do NOT want these messages to be acknowledged (not just "not delivered") as soon as the closing is effective. Maybe this would do:
"... RM Destination MUST NOT acknowledge nor deliver any received messages with a Sequence header for the specified sequence, other than those already received at the time the <wsrm:Close> element is processed by the RMD"
-Jacques

<dug> Well, it can still deliver old messages to the AD it just can't ACK new ones. For example, if msg 3 out of 5 is missing and a Close() comes in, the RMD can still deliver 1 and 2 to the AD (if it hasn't done so already).  It just can't deliver 4 and 5.   I think 'accept' is the right choice.</dug>
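
The "msg 3 out of 5" example can be traced with a small Python sketch. It is a toy model under assumptions, not the spec's behavior: the class `InOrderDestination` and its methods are hypothetical, the RMD is assumed to enforce InOrder delivery and to ack on receipt, and "accept" is modeled as simply refusing messages that arrive after the Close.

```python
class InOrderDestination:
    """Toy RMD that enforces InOrder delivery and acks on receipt."""

    def __init__(self):
        self.received = set()   # message numbers accepted so far
        self.delivered = []     # message numbers handed to the AD, in order
        self.closed = False

    def receive(self, number):
        """Accept a message unless the sequence is closed. Returns True if accepted."""
        if self.closed:
            return False
        self.received.add(number)
        return True

    def close(self):
        """Process a Close and return the message numbers the final ack would cover."""
        self.closed = True
        return sorted(self.received)

    def deliver_ready(self):
        """Deliver the gap-free run that follows the last delivered message."""
        next_number = len(self.delivered) + 1
        while next_number in self.received:
            self.delivered.append(next_number)
            next_number += 1
        return self.delivered


rmd = InOrderDestination()
for n in (1, 2, 4, 5):
    rmd.receive(n)

print(rmd.close())          # [1, 2, 4, 5]
print(rmd.receive(3))       # False: message 3 arrives after the Close
print(rmd.deliver_ready())  # [1, 2]: 4 and 5 stay behind the gap
```

In this model the final ack covers 4 and 5 even though they are never delivered, which is the gap between "acked" and "delivered" that the thread is debating.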

 
G.

 

 



From: Jacques Durand [mailto:JDurand@us.fujitsu.com]
Sent: Thursday, August 25, 2005 9:36 PM
To: 'Doug Davis'; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

 

Inline <JD>

 

 



From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 5:59 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2

 


When InOrder DA is used the RMS knows that all messages after the first gap were not delivered to the RMD's application - even if they were ACKed.

<JD> InOrder DA in itself does allow delivery of non-contiguous messages ("...it says nothing about duplications or omission..." Section 2, Core spec)

So, getting an ACK+Final guarantees to the RMS which messages were not just ACKed but delivered - and any messages after the first gap can be recovered (e.g. resent in a new sequence if it wants) without fear of them being processed twice by the RMD's app.
Actually, thinking about it more, perhaps some of the text should remain, like:

When a Sequence is closed and there are messages at the RM Destination
that are waiting for lower-numbered messages to arrive (such as the
case when InOrder delivery is being enforced) before they can be
processed by the RM Destination's application, the RM Destination
MUST NOT deliver those messages.

Just to ensure that the RMD does not interpret the Close() as the trigger to let all messages after the gap thru to the app.

thanks,

<JD> but again, because the semantics of Ack is just "on receipt" and not "on delivery", an honest RMD developer may decide to Ack these late messages, rendering the final Ack incorrect (or unstable, depending on when it is requested...). Another way to avoid adding this text is to make the statement below more general, not limited to "new application messages":

"...
can send a <wsrm:Close> element, in the body of a message, to the RM Destination to indicate that RM Destination MUST NOT accept any new application messages for the specified sequence."
Replace with:

"...
can send a <wsrm:Close> element, in the body of a message, to the RM Destination to indicate that RM Destination MUST NOT accept any application messages for the specified sequence, other than those already received at the time the <wsrm:Close> element is interpreted by the RMD."
-jacques


-Doug

"Giovanni Boschi" <gboschi@sonicsoftware.com>

08/25/2005 08:48 PM
To: Doug Davis/Raleigh/IBM@IBMUS, "Jacques Durand" <JDurand@us.fujitsu.com>
cc: <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2


 

 


   





If the RMD has already acked the out-of-order messages (and the spec at this point doesn't say it can't or shouldn't), and we then preclude the RMD from delivering them, then the final Ack is not accurate, which I thought was the original goal.  Even if we leave it undefined, the RMD may choose not to deliver them, and the problem remains.


G.


 



 





From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 7:23 PM
To: Jacques Durand
Cc: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] i0019 - a formal proposal - take 2



Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/25/2005 02:10:04 PM:


>  When a Sequence is closed and there are messages at the RM Destination
>  that are waiting for lower-numbered messages to arrive (such as the
>  case when InOrder delivery is being enforced) before they can be
>  processed by the RM Destination's application, the RM Destination
>  MUST NOT deliver those messages and a SequenceClosed fault MUST
>  be generated for each one.
> <JD> it is important to also say that it should not acknowledge them either.

If we change it so that it says nothing about those messages instead,

as Anish and Chris are suggesting, would that be ok with you?

So, basically, the semantics of undelivered messages would be undefined by

removing the above paragraph.

thanks,

-Doug


