ws-rx message

Subject: Re: [ws-rx] DA and Protocol: is the date over?

From: Dan Leshchiner <dleshc@tibco.com>
To: Christopher B Ferris <chrisfer@us.ibm.com>
Date: Fri, 19 Aug 2005 11:01:26 -0700

Chris,

i don't think your "elephant gun for a gnat" analogy is appropriate. timestamps usually are not useful for this case (AtMostOnce/InOrder with no retransmissions) because time can go backwards (at least once a year and upon administrative action). so, really, one needs sequence numbers. but, then, because of re-boots and devices with non-persistent clocks (e.g. printer), one needs sequence identifier to scope your sequence numbers. that, in turn, brings in sequence identifier establishment/tear-down handshake. by this time, you've got a very substantial chunk of the protocol.
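
As a rough illustration of the argument above (not something taken from the WS-RM spec), the Python sketch below contrasts a receiver that orders by timestamp with one that orders by message number scoped to a sequence identifier. The class names, the `accept` method and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TimestampFilter:
    """Accept a message only if its timestamp is newer than any seen so far."""
    highest: float = float("-inf")

    def accept(self, timestamp: float) -> bool:
        if timestamp <= self.highest:
            return False  # looks "old", so it is discarded
        self.highest = timestamp
        return True


@dataclass
class SequenceFilter:
    """Accept a message only if its number is new within its own sequence."""
    highest: dict = field(default_factory=dict)  # sequence id -> highest number

    def accept(self, sequence_id: str, number: int) -> bool:
        if number <= self.highest.get(sequence_id, 0):
            return False  # duplicate or out-of-order within this sequence
        self.highest[sequence_id] = number
        return True


# The sender's clock is set back between the second and third message.
by_time = TimestampFilter()
time_results = [by_time.accept(t) for t in (1000.0, 1001.0, 400.0, 401.0)]

# Same four messages; the sender opens a new sequence after the reset.
by_seq = SequenceFilter()
seq_results = [by_seq.accept(s, n) for s, n in
               (("seq-A", 1), ("seq-A", 2), ("seq-B", 1), ("seq-B", 2))]
```

In this sketch the timestamp receiver discards both messages sent after the clock went backwards, while the sequence-scoped receiver keeps them. The price is that the two ends must agree on a sequence identifier, which is the handshake mentioned above.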

in any case, what if RMS's limited resources allow it to hold for retransmissions only messages sent in past 30 seconds? even with SOAP over HTTP as a transport, if all that happened was that listen() backlog limit was exceeded on the server, retransmissions are possible. is this use case out of scope? if not, full WS-RM machinery is needed to support it. in other words, "reliable" in "reliable messaging" is in the eye of the beholder. what's very little or no reliability for some is a lot of reliability to others.
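
A minimal sketch of the limited-resources case described above, assuming a hypothetical `RetransmitBuffer` on the RMS side that only holds messages for a fixed window (30 seconds here). Nothing in it comes from the spec; the caller supplies the clock as `now` so the behaviour is easy to follow.

```python
from collections import OrderedDict


class RetransmitBuffer:
    """Holds sent messages for retransmission for at most hold_seconds."""

    def __init__(self, hold_seconds=30.0):
        self.hold_seconds = hold_seconds
        self._sent = OrderedDict()  # message number -> (send time, payload)

    def record(self, number, payload, now):
        """Remember a message at the moment it is first sent."""
        self._sent[number] = (now, payload)

    def acknowledge(self, number):
        """An acknowledged message no longer needs to be held."""
        self._sent.pop(number, None)

    def expire(self, now):
        """Drop messages older than the hold window; return their numbers."""
        dropped = [n for n, (sent_at, _) in self._sent.items()
                   if now - sent_at > self.hold_seconds]
        for n in dropped:
            del self._sent[n]
        return dropped

    def retransmittable(self, now):
        """Numbers of messages that can still be resent at time now."""
        self.expire(now)
        return list(self._sent)


buf = RetransmitBuffer(hold_seconds=30.0)
buf.record(1, "reading-1", now=0.0)
buf.record(2, "reading-2", now=20.0)
early = buf.retransmittable(now=25.0)   # both still held
late = buf.retransmittable(now=35.0)    # message 1 has aged out
```

Message 1 can be retransmitted only while it is inside the window. After that it is gone whether or not it was acknowledged, which is the "low retransmission availability" case.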

all of this said, i agree with you that DA is out-of-band as far as wire protocol is concerned. it's just that low/no retransmission availability use case is something to keep in mind.

thanks,
dan

Christopher B Ferris wrote:

Jacques,

There's no doubt that there might be use cases for AtMostOnce delivery of
messages *between* the RMS and RMD. However, as I recall, someone (Umit?)
said, either on the list or on IRC, that that isn't reliable messaging.

I would agree.

In fact, RM would be overkill, using an elephant gun to kill a gnat, 
especially if InOrder is not required. 
What possible purpose would an acknowledgement serve if the RMS didn't 
care that all of its 
messages were received, and if it were not doing retransmission of 
messages (because it didn't care 
one way or the other if the messages were all delivered)? Fire and forget 
is probably good enough 
for that use case. Ordered processing could be facilitated by inspecting 
timestamps and discarding
messages that had a timestamp value less than the highest value already 
processed.

Again, I think we need to keep in mind the separation of concerns that I 
mentioned in the F2F briefing.
RM is concerned only with ensuring that messages are transmitted 
successfully from the RMS to the
RMD. Period. While the source application may want that all of the 
messages be processed by the destination
application, there is nothing that the RMD can do to ensure that 
guarantee. The destination application
might be implemented as a separate, distributed component, and be taken 
off line (whether voluntarily
or not) never to return, despite the fact that there are unprocessed, yet 
acknowledged, messages in the 
RMD's message store. 

The point of RM is to ensure that all messages transmitted by the RMS are 
successfully *received* by 
the RMD. The RMS is responsible to retransmit unacknowledged messages. The 
RMD is responsible
for acknowledging receipt of each message so that the RMS can discontinue 
retransmission of 
successfully received messages.
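
The division of labour described above can be sketched as a small deterministic simulation. This is only an illustration under stated assumptions: the function name `deliver_reliably`, the `drops` set of (round, message number) losses, and the simplification that acknowledgements are never lost are all hypothetical.

```python
def deliver_reliably(messages, drops, max_rounds=10):
    """Resend unacknowledged messages until all are acknowledged.

    messages: dict of message number -> payload held by the RMS
    drops:    set of (round, number) pairs that are lost in transit
    Returns (messages received by the RMD, number of rounds used).
    """
    unacked = set(messages)   # RMS view: sent but not yet acknowledged
    received = {}             # RMD view: messages that actually arrived
    rounds = 0
    while unacked and rounds < max_rounds:
        rounds += 1
        for number in sorted(unacked):
            if (rounds, number) not in drops:
                received[number] = messages[number]
        # The RMD acknowledges everything it holds; the RMS stops
        # retransmitting those messages (acks assumed reliable here).
        unacked -= set(received)
    return received, rounds


outbox = {1: "a", 2: "b", 3: "c"}
# Message 2 is lost on its first two transmissions.
arrived, rounds_used = deliver_reliably(outbox, drops={(1, 2), (2, 2)})
```

Only message 2 is resent in the second and third rounds, because messages 1 and 3 were acknowledged after the first.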

That is the extent of the scope of the protocol. It does just that one
thing (recall Curly's Law [1]). Its purpose is to provide better QoS than
can be achieved using TCP/IP alone (which itself has reliability
characteristics built-in, but offers no guarantees about anything above
the TCP/IP layer). The same applies at the IP layer, and so on, down the
OSI seven layer cake.

If the application wants some assurance that the destination application 
received and processed
(or faulted) the message, then you need an application-level message for 
that (e.g. a business-level
ack).

The DA has relevance at the RMD as it specifies the QoS contract that the 
RMD offers the application
destination. In the case of AtMostOnce, it is saying "look, I'll do my 
best to ensure that you get all of the
messages, but I may have limited resources and you MAY lose some messages. 
You want a better
guarantee than that, try another RMD provider.".

[1] http://www.imdb.com/title/tt0101587/

Christopher Ferris
STSM, Emerging e-business Industry Architecture
email: chrisfer@us.ibm.com
blog: http://webpages.charter.net/chrisfer/blog.html
phone: +1 508 377 9295

Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/18/2005 03:08:31 PM:

Since Anish opened this Pandora box, I must say I share his confusion
about the rationale for separating the protocol mechanism from DA.
That could well be the mother of all issues that we still need to
discuss on WS-RM, surely deserving its own thread here (so I resubjected
the mail).
Chris may have some use cases or rationale for that, but we have not
seen the detail of them yet.

On my side, I can see the case of a monitoring device that only needs
AtMostOnce for sending its measures (maybe combined with InOrder), and
that cannot afford a resending mechanism that it does not need, nor does
it care about interpreting Acks. So the protocol would be affected by DA
here... (note that even with AtLeastOnce, the tuning of resending
parameters depends on a Policy assertion, and we could make the case that
such parameters can be seen as DA QoS parameters.)

Anish, you are the one who started this... 
Jacques 
-----Original Message----- 
From: Anish Karmarkar [mailto:Anish.Karmarkar@oracle.com] 
Sent: Thursday, August 18, 2005 11:15 AM 
To: Jacques Durand 
Cc: 'Christopher B Ferris'; ws-rx@lists.oasis-open.org 
Subject: Re: [ws-rx] NEW ISSUE, twin sister of i019 
Jacques Durand wrote:

Chris: inline <JD> 

A meta-comment for those who may worry about the amount of discussion
triggered by just one (or two) issues among... about 20 remaining issues?

I think that behind the issues discussed here is a fundamental 
discussion touching at the protocol model and meaning of delivery 
assurance. There are many aspects of the model behind WS-RM that were 
not apparent for all TC joiners, and I guess that explains the 
understanding process (and questioning) going on.

One of the problems I have had in the past (till Chris explained that 
the protocol is AtLeastOnce on the wire) was about Delivery Assurance 
and how/why it does/doesn't affect the protocol on the wire. The 
intention, as mentioned by Chris, that the protocol is AtLeastOnce on 
the wire is not apparent from reading the spec (at least to me). There 
are statements that contradict this or are misleading. For example, 
Section 2, 2nd para:
"WS-ReliableMessaging provides an interoperable protocol that a Reliable
Messaging (RM) Source and Reliable Messaging (RM) Destination use to
provide Application Source and Destination a guarantee that a message
that is sent will be delivered.  The guarantee is specified as a
delivery assurance.  The protocol supports the endpoints in providing
these delivery assurances.  It is the responsibility of the RM Source
and RM Destination to fulfill the delivery assurances, or raise an
error.  The protocol defined here allows endpoints to meet this
guarantee for the delivery assurances defined below."
At the very least there is an editorial issue here. 
-Anish 
--

Once that is being clarified and maybe refined, I certainly hope we will
not need to go through this level of discussion at each issue...

Jacques 


-----Original Message----- 
From: Christopher B Ferris [mailto:chrisfer@us.ibm.com] 
Sent: Friday, August 12, 2005 5:30 AM 
To: ws-rx@lists.oasis-open.org 
Subject: RE: [ws-rx] NEW ISSUE, twin sister of i019 

Jacques, 

Taking an approach of "what is not forbidden is allowed" in implementing a
spec will almost invariably lead to interoperability problems when applied
to areas of the spec that prescribe what to do.

<JD> I didn't pretend to make this my motto - but we all know that this
is a question that every developer faces from time to time... I guess
this is one of those areas where the keyword "SHALL NOT" can help a lot.

As to the point of using TerminateSequence on an incomplete Sequence, I
don't disagree that there is utility in providing a means for the RMS to
terminate a Sequence. That is what the SequenceTerminated fault is for.
Note that either endpoint may issue the SequenceTerminated fault:

Sequence Terminated
This fault is sent by either the RM Source or the RM Destination to
indicate that the endpoint that generates the fault has either encountered
an unrecoverable condition, or has detected a violation of the protocol
and as a consequence, has chosen to terminate the sequence.  The endpoint
that generates this fault should make every reasonable effort to notify
the corresponding endpoint of this decision.

This leads us back to issue i019 unless we raise and resolve an issue that
says roughly: is the RMS *required* to retransmit messages that are
unacknowledged.

<JD> mmmh, I never saw this as a key thing: retransmission - whether
required or not - is a limited effort that can ultimately fail for
whatever reason. Regardless of this effort, won't we inevitably face
situations where there are gaps in the sequence that we have to live
with till the end of the seq? But I think i019 stands regardless of this
(see later).


Note though that this changes the nature of the protocol significantly.
My personal take on it is that the answer to the previous question is
"yes" and that the only meaningful way of optimizing AtMostOnce DA with
regards to messages on the wire (if that is what we want) is to permit the
RMD to acknowledge messages that it has not received (e.g. pre-emptively
filling in gaps) but that the RMS is still required to retransmit
unacknowledged messages (just to keep the protocol simple).
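
The gap-filling idea can be sketched as follows. This is a hedged illustration only: the function names and the representation of acknowledgement ranges as (lower, upper) tuples are assumptions, not the spec's wire format.

```python
def ack_ranges(received):
    """Collapse received message numbers into (lower, upper) ranges."""
    ranges = []
    for number in sorted(set(received)):
        if ranges and number == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], number)  # extend the current range
        else:
            ranges.append((number, number))       # start a new range
    return ranges


def ack_ranges_filling_gaps(received):
    """AtMostOnce variant: acknowledge everything up to the highest number
    received, including messages that never arrived."""
    return [(1, max(received))] if received else []


arrived = {1, 2, 4, 5, 7}
strict = ack_ranges(arrived)               # gaps at 3 and 6 stay visible
filled = ack_ranges_filling_gaps(arrived)  # gaps are acknowledged away
```

With the second form the RMS sees nothing left to retransmit, so its behaviour on the wire does not need to change. The cost is that an acknowledgement no longer means the message was actually received.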

<JD> Chris: isn't that rather weird ??? I hope we don't need to stretch
the interpretation of acknowledgement (you seem to suggest that
acknowledgement may have a different meaning depending on the DA in
use). Note that i019 applies regardless of the DA in use: it only says
that the RMS has no way to get an accurate account of *actually*
received vs not-received messages at the time the Fault terminates the
sequence on RMD (because the latest SequenceAck obtained by RMS may show
unacknowledged messages for which we don't know what happened.) That can
still be of great importance especially in the case of ExactlyOnce:
knowing that a message was for sure never received by RMD, allows for
meaningful failure notice to SA on which it can act (and I think it
would also for AtMostOnce, since you are using ack mechanism there too).

As long as accuracy of final ack status is considered a valid
expectation, i019 is a valid issue IMO.   </JD>


That would effectively provide the protocol with the "forget before" that
is needed in order to allow for some messages to be dropped between the
RMS and RMD without changing (and IMO significantly over-complicating) the
nature of the protocol.

<JD> I think this is precisely the crux of this issue: do we want the
RMS to forget which messages were not received for the sequence when it
terminates? (regardless on how it terminates) that seems to be the first
question we need to address apparently.

Cheers, 
Jacques 

Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/11/2005 11:50:53 PM:

 > Chris:
 > Maybe I am a bit obtuse here but I just did not take that whole
 > statement itself in an exclusive way (notwithstanding my notion of
 > "complete" which was as good as in dictionary.reference.com)...
 > I guess the "what is not forbidden is allowed" perspective. So I think
 > some editorial tightening would help folks like me :-)
 > May I add, precisely because it is only about enabling RMD to
 > efficiently reclaim resources associated with the Sequence, I saw use
 > cases where using <TerminateSequence> may be as legitimate for an
 > incomplete sequence as for a complete one. So I guess that is these use
 > cases that need be discussed - (let me download that Gil doc...)
 > Thanks,
 > Jacques
 > 

 > -----Original Message----- 
 > From: Christopher B Ferris [mailto:chrisfer@us.ibm.com] 
 > Sent: Thursday, August 11, 2005 6:21 PM 
 > To: ws-rx@lists.oasis-open.org 
 > Subject: RE: [ws-rx] NEW ISSUE, twin sister of i019 
 > Jacques,
 > The spec says at line 569:
 > "After an RM Source receives the <SequenceAcknowledgement> acknowledging
 > the complete range of messages in a Sequence, it sends a
 > <TerminateSequence> element, in the body of a message to the RM
 > Destination to indicate that the Sequence is complete, and that it will
 > not be sending any further messages related to the Sequence. The RM
 > Destination can safely reclaim any resources associated with the
 > Sequence upon receipt of the <TerminateSequence> message."
 > The key word in the first sentence is "complete", wherein the definition
 > of "complete" in this context is the fourth one here:
 >         http://dictionary.reference.com/search?q=complete
 >         4. Absolute; total
 > I always thought that it was pretty unambiguous, but maybe I am mistaken
 > and it should instead read:
 > After an RM Source receives the <SequenceAcknowledgement> containing a
 > single <AcknowledgementRange> element with an @Upper valued at the
 > MessageNumber of the <LastMessage> in a Sequence and the @Lower valued
 > at "1".
 > That was certainly the intent.
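
The stricter wording suggested above amounts to a simple check, sketched here in Python. The function name and the representation of acknowledgement ranges as a list of (lower, upper) tuples are assumptions made for illustration only.

```python
def is_sequence_complete(ranges, last_message_number):
    """True only if a single acknowledgement range runs from 1 up to the
    message number of the last message in the sequence."""
    return len(ranges) == 1 and ranges[0] == (1, last_message_number)


complete = is_sequence_complete([(1, 5)], last_message_number=5)
with_gap = is_sequence_complete([(1, 2), (4, 5)], last_message_number=5)
short = is_sequence_complete([(1, 4)], last_message_number=5)
```

Under this reading, an RM Source would send <TerminateSequence> only when the check holds.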
 > The statement at 569 is followed up starting at line 580 with:
 > "/wsrm:TerminateSequence
 >         This element is sent by an RM Source after it has received the
 > final <SequenceAcknowledgement> covering the full range of a Sequence.
 > It indicates that the RM Destination can safely reclaim any resources
 > related to the identified Sequence. This element MUST NOT be sent as a
 > header block."
 > This reinforces what I claimed on the call, that the purpose of
 > TerminateSequence is to allow the RMD to reclaim any resources
 > associated with the Sequence. It means that the RMS is done, finit,
 > kaput with that Sequence.
 > I don't think that you should read into a spec, content which is simply
 > not there:
 > > I did NOT interpret it as:
 > >
 > > " an RM Source MUST NOT send <TerminateSequence> in other cases where
 > > it has not received full acknowledgement of the complete range of
 > > messages in a Sequence."
 > But, maybe that is what is needed:
 > "An RM Source MUST NOT send a <TerminateSequence> if it has any
 > expectation of ever receiving information from the RM Destination
 > about that Sequence after the <TerminateSequence> is transmitted."
 > Again, the purpose of the <TerminateSequence> is to enable the RMD to
 > efficiently reclaim any and all resources associated with
 > the Sequence. It isn't necessary that the RMD ever receive this message,
 > as the resources can still be reclaimed either at the Sequence
 > expiration time or following a duration of inactivity, etc. However, the
 > <TerminateSequence> message's purpose seems clear, to me at least.
 > Cheers, 
 > Christopher Ferris 
 > STSM, Emerging e-business Industry Architecture 
 > email: chrisfer@us.ibm.com 
 > blog: http://webpages.charter.net/chrisfer/blog.html 
 > phone: +1 508 377 9295 
 > Jacques Durand <JDurand@us.fujitsu.com> wrote on 08/11/2005 08:16:56 PM:

 > > Giovanni: 
 > > 
 > > I think you are right on the spot about the misunderstanding that we
 > > had in the conf call today.
 > > Indeed, I interpreted this statement of the spec as nothing more than
 > > the normal use for TerminateSequence, non-exclusive of other uses:
 > >
 > > "After an RM Source receives the <SequenceAcknowledgement>
 > > acknowledging the complete range of messages in a Sequence, it sends a
 > > <TerminateSequence> element, in the body of a message to the RM
 > > Destination to indicate that the Sequence is complete, and that it will
 > > not be sending any further messages related to the Sequence."
 > >
 > > I did NOT interpret it as:
 > >
 > > " an RM Source MUST NOT send <TerminateSequence> in other cases where
 > > it has not received full acknowledgement of the complete range of
 > > messages in a Sequence."
 > >
 > > Therefore we were not discussing on the same base of premises.
 > >
 > > Your speculation is right: I assumed that seq termination (an operation
 > > that is more meaningful to RMD than to RMS, given that the ending of a
 > > sequence has already been notified by LastMessage sending ) may be
 > > appropriate in some cases where not all messages have been acked. My
 > > recent rewording of the issue clarifies this a bit, but is still based
 > > on the same interpretation of TerminateSequence, so I may need to
 > > shelve it and submit an issue on TerminateSequence instead.
 > >
 > > Note: I believe your last paragraph below just illustrates the need for
 > > clearly stating the valid use cases as in Gil doc, so that we can sync
 > > up on these before even discussing the issues ...
 > > 
 > > Regards, 
 > > 
 > > Jacques 
 > > 
 > > -----Original Message----- 
 > > From: Giovanni Boschi [mailto:gboschi@sonicsoftware.com] 
 > > Sent: Thursday, August 11, 2005 3:47 PM 
 > > To: Jacques Durand; ws-rx@lists.oasis-open.org 
 > > Subject: RE: [ws-rx] NEW ISSUE, twin sister of i019 
 > > 
 > > From the issue justification, "The specification is too lax on the
 > > loophole that permits stray messages to "sneak-in" just before a
 > > termination, without any opportunity to be acknowledged"
 > >
 > > This is what the specification says:
 > >
 > > "After an RM Source receives the <SequenceAcknowledgement>
 > > acknowledging the complete range of messages in a Sequence, it sends a
 > > <TerminateSequence> element, in the body of a message to the RM
 > > Destination to indicate that the Sequence is complete, and that it will
 > > not be sending any further messages related to the Sequence."
 > >
 > > This, at least to me, says pretty clearly that a conformant RMS will
 > > not send TerminateSequence until all messages have been acknowledged,
 > > and that it will not send any new messages after sending
 > > TerminateSequence.
 > >
 > > To be sure, duplicates of messages previously sent (and acknowledged)
 > > may arrive at the RMD after TerminateSequence.  But these are
 > > duplicates, not unacknowledged messages.
 > >
 > > The specification has a definition of "normal termination" which
 > > requires that all messages be acknowledged, and therefore the
 > > "situation whereupon normal termination of a sequence some messages
 > > that were previously sent and never acknowledged..." is, by definition,
 > > impossible.  The "accuracy of acknowledgments upon normal sequence
 > > termination" is 100% perfect.
 > > 
 > > Now, during the call today, Jacques seemed to suggest that what was
 > > behind this is that a sender may need/want to terminate the sequence
 > > prior to all messages being acked; this may well have merit, or at the
 > > very least is worthy of discussion; the current spec clearly does not
 > > allow it, and it is well within the responsibility of the TC to
 > > consider such a change.
 > >
 > > But if the request is to change the definition of sequence termination,
 > > or maybe to provide an additional type of termination, the issue
 > > description should clearly say just that ("we should allow for normal
 > > termination prior to acknowledgement of all messages"); but nothing in
 > > the text of the issue suggests that we are looking for a change in
 > > definition of normal termination.
 > >
 > > I don't know whether procedure allows revising the text of the issue
 > > for clarity.  As it stands now both the description and justification
 > > below contain statements that appear to me to be factually incorrect,
 > > or at best highly misleading.  It should not be surprising that this
 > > generates long discussions in the confcall about whether to even accept
 > > it as an issue.
 > > 
 > > I will speculate that I may have an idea of what Jacques may be after:
 > > a sender may, for a variety of reasons which we could discuss (e.g. it
 > > is being shut down for maintenance longer than the sequence
 > > expiration), be forced to stop resending; if so, it would be nice to
 > > know which messages actually got delivered or didn't, so that it may
 > > send the undelivered ones again later in another sequence w/o
 > > duplicating them. AckRequested does not serve this purpose because,
 > > unless the sequence is actually terminated, there may be more messages
 > > out there in flight which will actually arrive at the RMD and be
 > > delivered.  But I'm just speculating, the issue doesn't say that.
 > > 
 > > Jacques, please clarify. 
 > > 
 > > Giovanni. 
 > > 
 > > ________________________________________ 
 > > From: Jacques Durand [mailto:JDurand@us.fujitsu.com] 
 > > Sent: Thursday, August 04, 2005 6:50 PM 
 > > To: ws-rx@lists.oasis-open.org 
 > > Subject: [ws-rx] NEW ISSUE, twin sister of i019 
 > > 
 > > I realize that we should probably discuss this new issue in
 > > conjunction with i019, i.e. before closing on i019.
 > > (it is stating a similar problem, but for normal termination cases.)
 > >
 > > Daniel:
 > > With the perspective of this new issue, I am leaning more toward your
 > > proposal to mark as "last" the final sequence status.
 > > 
 > > 
 > > Jacques 
 > > 
 > > 
 > > Title: Accuracy of acknowledgement status upon normal sequence 
 > > termination 
 > > 
 > > Description: The specification does not address the situation where,
 > > upon normal termination of a sequence, some messages that were
 > > previously sent and never acknowledged (for which RM Source had stopped
 > > any resending effort) have been received late by RM Destination, e.g.
 > > between the sending of the last SequenceAcknowledgement and before the
 > > reception of a TerminateSequence message. This is the twin sister of
 > > issue i019 which deals with a similar problem but in case of fault
 > > termination.
 > >
 > > Justification: Normal termination is actually a fairly common event
 > > (compared to sequence fault) and it is expected that sequences will be
 > > terminated even if they have missing messages.
 > > The specification is too lax on the loophole that permits stray
 > > messages to "sneak-in" just before a termination, without any
 > > opportunity to be acknowledged.
 > >
 > > Target: core
 > > Type: design
 > >
 > > Proposal: A final acknowledgement status could be sent back that
 > > reflects the exact state at termination time. That could be done by
 > > sending (or by making available for polling even after the sequence is
 > > terminated) a last SequenceAcknowledgement element, at the time the RM
 > > Destination terminates the sequence (either at reception of
 > > TerminateSequence, or due to timeout). Such a SequenceAcknowledgement
 > > element should have a "last" marker.
 > >
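
The proposal in the original issue can be sketched roughly as below. This is an illustration under assumptions rather than a design from the spec: the `SequenceAck` and `Destination` classes, their fields, and the tuple-of-ranges representation are all hypothetical.

```python
from dataclasses import dataclass


def to_ranges(numbers):
    """Collapse message numbers into a tuple of (lower, upper) ranges."""
    ranges = []
    for number in sorted(numbers):
        if ranges and number == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], number)
        else:
            ranges.append((number, number))
    return tuple(ranges)


@dataclass(frozen=True)
class SequenceAck:
    sequence_id: str
    ranges: tuple
    last: bool = False  # hypothetical "last" marker from the proposal


class Destination:
    """RM Destination that snapshots a final ack when it terminates."""

    def __init__(self, sequence_id):
        self.sequence_id = sequence_id
        self.received = set()
        self.final_ack = None  # kept available for polling after termination

    def receive(self, number):
        if self.final_ack is None:  # ignore anything after termination
            self.received.add(number)

    def acknowledge(self):
        return SequenceAck(self.sequence_id, to_ranges(self.received))

    def terminate(self):
        self.final_ack = SequenceAck(
            self.sequence_id, to_ranges(self.received), last=True)
        return self.final_ack


rmd = Destination("seq-1")
for n in (1, 2, 4):
    rmd.receive(n)
interim = rmd.acknowledge()  # last ack the RMS sees: message 3 missing
rmd.receive(3)               # message 3 arrives late, never acknowledged
final = rmd.terminate()      # final status reflects the late arrival
```

The interim acknowledgement shows message 3 as missing, while the final one marked "last" shows it as received. That is the discrepancy the issue is about.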

Follow-Ups:
- Re: [ws-rx] DA and Protocol: is the date over?
  - From: Rich Salz <rsalz@datapower.com>

References:
- Re: [ws-rx] DA and Protocol: is the date over?
  - From: Christopher B Ferris <chrisfer@us.ibm.com>