RE: [ws-rx] Issue i005

I would definitely agree with defining this as a hint, or a nudge ( “say no more…” ) or anything along those lines. I think it’s actually something stronger than that, it’s a “request” – I just think the RMS should have the latitude to ignore it or at least postpone it. There was at least some talk on the call about making this a “binding contract”, the word “immediately” was used a couple of times, and that is what I wanted to argue against.

My preference for not using RFC2119 language to draw these lines is largely philosophical, and therefore I can get over it if need be. But, to me at least, SHOULD tends to mean that if you don’t do it, you’re less conformant than if you do, and that’s the implication that I’d rather avoid. And to be clear, I’m not looking for implementations to be hardcoded to never resend on Nack, which I would agree would be weaker implementations. I’m looking to retain the option of the RMS’s considering the RMD’s “request”, but in the context of the configured RMS retransmit policy; for some customers the configured retransmit policy may need to take priority at a particular instant in time.
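
To make the "request, weighed against the configured retransmit policy" idea concrete, here is a minimal sketch. It is only an illustration, not anything taken from the spec; the function name and its three parameters are hypothetical. It shows a Nack pulling a resend earlier, but never earlier than the RMS policy permits.

```python
def resend_time_after_nack(now: float, next_scheduled: float, earliest_allowed: float) -> float:
    """Return the time at which the RMS would resend a Nacked message.

    All three parameters are hypothetical policy inputs, in seconds:
      now:              when the Nack arrived
      next_scheduled:   the resend time already set by the RMS retransmit timer
      earliest_allowed: the earliest time the configured policy permits a resend
    """
    # "As soon as possible": never before the policy allows it.
    as_soon_as_possible = max(now, earliest_allowed)
    # A Nack can pull the resend earlier, but it never pushes it later.
    return min(next_scheduled, as_soon_as_possible)


# Policy permits an immediate resend, so the request is honored right away.
print(resend_time_after_nack(now=10.0, next_scheduled=30.0, earliest_allowed=0.0))
# Policy holds sends until t=20 (say, an uplink window), so the request is postponed.
print(resend_time_after_nack(now=10.0, next_scheduled=30.0, earliest_allowed=20.0))
```

In both calls the RMS acts on the Nack; the only difference is when, which is the latitude argued for above.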

The point is well taken about the parallels to AckRequested: the issues are virtually identical, and in that case the spec says “MUST”. I guess I hadn’t thought about it, but the reason I haven’t had a problem with that is that I have read that requirement to imply “as soon as possible” – it doesn’t say that, but it also doesn’t say *when* I MUST send the ack (see the discussion on “anonymous AcksTo” for situations when it’s at least argued that you shouldn’t respond right away). So I’d even be open to using the word MUST for resending on Nack, as long as it is clear that it doesn’t necessarily mean “this instant”. I actually think the language in lines 405-412 of the current draft is appropriate for AckRequested, and I would be comfortable with similar language for Nack, even more so if we could add “as soon as possible”.

Thanks,

From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 8:36 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005

+1, LOL, another fan of "nudge" :-)
thanks,
-Doug

"Winkler, Steve" <steve.winkler@sap.com>

08/25/2005 08:23 PM

To	"Giovanni Boschi" <gboschi@sonicsoftware.com>, Doug Davis/Raleigh/IBM@IBMUS
cc	"Marc Goodner" <mgoodner@microsoft.com>, <ws-rx@lists.oasis-open.org>
Subject	RE: [ws-rx] Issue i005

Hi Giovanni,

I don't think we're all that far off from agreement, but I do have a few responses below. I'll keep gathering everyone's opinions and try to make a proposal that will meet everyone's needs (but not necessarily their wants;-).

Cheers,
Steve

From: Giovanni Boschi [mailto:gboschi@sonicsoftware.com]
Sent: Thursday, Aug 25, 2005 5:00 PM
To: Winkler, Steve; Doug Davis
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005

My interpretation of the spec as it stands is also that a Nack may be ignored, and I think that is a good thing.

Let me phrase it differently: what behavior can the RMD reliably expect to *observe* from the RMS as a result of the RMD sending the RMS a Nack? The answer is simple: Nothing. And no language or semantics we discussed today would change that - the reason is that the Nack itself is unreliable, and it may never make it to the source at all. And, of course, even if the Nack does make it to the RMS, the resend may not make it to the RMD. So, fundamentally, the RMD can hope for, but cannot expect, to receive a copy of the message in response to a Nack. Given that the RMD cannot depend on it, it can’t possibly be an interop issue, and I see no reason not to allow the RMS to ignore it, in effect behaving as if it never saw it, for its own reasons. And it may have some very good reasons, some of which I’ll mention below.

<sw>Acks also aren't guaranteed, but that doesn't make them any less useful. A nack is used by the RMD to indicate to the RMS that it still needs some message(s) in a sequence. There is the (potentially expensive) possibility that the RMD must delay delivery of messages until these messages have been received, and nudging the RMS to send these messages ahead of their previously scheduled retransmission time is something that I find very useful.</sw>

The question came up as to whether this is testable behavior, and I would maintain that it is not – at first it appears testable, if only in a contrived environment where the entire transport distance maintains order and does not lose messages – that is, only in an environment where the whole point of WSRM is moot. But then, of course, messages would not be lost or reordered, so why would the RMD be sending a Nack in the first place? Ignore that for a minute, and assume that the RMD did send a Nack, and did receive a message soon thereafter; how does it know that it was resent as a result of the Nack and not due to the regularly scheduled retransmit interval?

<sw>It doesn't. But the point of the nack is to indicate that the RMS should send a message as soon as possible. It can certainly wait until it is ready, whether this is based on a backoff algorithm or just when it has the necessary resources. The nack is a hint that the RMD is missing some messages, nothing more.</sw>

Regarding the intent to “encourage 'better' implementations that follow recommended optimization practices”, what are the grounds for assuming that resending immediately is a “better” optimization? What, after all, is being optimized? Is it latency? Transmission costs? Disk space at the RMD? What if the RMS knows that its next scheduled retransmit interval is 10ms away and would rather just wait? What if the RMS simply cannot retransmit because it needs to wait for its next satellite uplink window which is 10min away? What if the RMS is operating at the bandwidth threshold of its network SLA, and the premature retransmit will cost it billable dollars at overage rates? In my experience, there is no reason to assume that RMS and RMD, particularly if they belong to different organizations, are going to be interested in optimizing the same thing.

<sw>IMO, the purpose of the RM spec is to get a message from point A to point B and guarantee that that happens. Optimizing this is, indeed, in the eye of the beholder. An optimal solution from an RMS POV may be very different from that of an RMD point of view. We need to put in place the framework to make both happy though, which even means allowing the two parties to work together if they want to.</sw>

The proposal would hand over control of retransmit policy entirely to the RMD, in the hope that it would use it wisely - without any suggestion of what that means. Consider a half-duplex failure of the transport layer (e.g. a reverse proxy firewall is down, or a satellite-connected node has downlink but no uplink), so that the RMD can send to the RMS, but not the other way around. The RMS sends a few times and gets no Acks, and starts backing off exponentially; and then starts receiving regular, perhaps frequent Nacks from the RMD, which has no clue what is going on. As proposed, the RMS would have to cancel its backoff algorithm, and start retransmitting on every Nack, even after it had concluded that it should back off because previous resends had obviously not been successful. Why is this good? Or would we then raise an issue to clarify that, in this specific case, the RMS is allowed to ignore the Nacks?
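
The half-duplex scenario can be put into rough numbers. This is a toy model under stated assumptions, not a description of any real implementation: the function names, the one-second base interval, and the Nack every two seconds are all invented for illustration, and the "binding" case simply adds one resend per Nack on top of the timer-driven sends rather than modelling a backoff reset.

```python
def timer_sends(horizon: float, base: float) -> list[float]:
    """Times at which an RMS with exponential backoff sends one unacked message."""
    t, interval, sends = 0.0, base, []
    while t <= horizon:
        sends.append(t)
        t += interval
        interval *= 2  # double the wait after every unacknowledged send
    return sends


def total_sends(horizon: float, base: float, nack_times, binding: bool) -> int:
    """Count sends up to `horizon`; if `binding`, every Nack forces one more resend."""
    sends = timer_sends(horizon, base)
    if binding:
        sends += [t for t in nack_times if t <= horizon]
    return len(sends)


nacks = range(2, 61, 2)  # assumed: the RMD Nacks every 2 seconds for a minute
print(total_sends(60, 1.0, nacks, binding=False))  # Nack treated as a hint and ignored
print(total_sends(60, 1.0, nacks, binding=True))   # Nack treated as a binding contract
```

Every one of the extra sends in the binding case goes into a link the RMS has already concluded is not delivering.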

<sw>How? I don't see at all how the proposal would hand over control of retransmission to the RMD. The RMS still has its timed retransmission policy. The nacks are simply used as hints to the RMS that it would be appreciated if they sent message xyz sooner rather than later because the RMD is consuming resources while waiting for its arrival.</sw>

I do not believe that using the word “SHOULD” avoids the above issues – the definition of SHOULD in RFC 2119, but more generally the use of any RFC2119 terminology, clearly implies that the behavior is a “requirement” that should be met except perhaps for niche situations, and I disagree strongly with that implication. The Nack is currently a hint; not a noop, but a hint, and the choice whether to resend is entirely at the RMS’ discretion. If the issue is to clarify that fact, then the previous sentence, or anyone else’s version, seems like it would do. If the proposal is to make it a binding contract on the RMS, I would strongly oppose it, for all the above reasons.

<sw>Define a niche situation. That seems rather vague and open to interpretation to me. I agree that the nack is a hint, but a hint with well defined semantics that says that the RMD would be better off if it could get some of these messages sooner. In fact, the RMS would probably be better off too because there are messages in the sequence that it had created that are hanging around on the receiving side and not being processed. If it's in the best interest of the RMS to have those messages processed in a timely manner, then it had better well get the missing message over to the other side.</sw>

G.

From: Winkler, Steve [mailto:steve.winkler@sap.com]
Sent: Thursday, August 25, 2005 4:20 PM
To: Doug Davis
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005

Hi Doug,

As I mentioned, the actual wording can be left to the editors as long as the semantics have been agreed upon. The proposal was sent simply as a starting point for discussion. My interpretation of the spec as it stands is that a nack can be ignored. The proposal was to add words to state that messages should be retransmitted in response to a Nack, but given that nacks are intended only for performance optimization, I don't think that this behavior should be required. I do think, however, that we should encourage 'better' implementations that follow recommended optimization practices unless a specific implementation instance has sufficient reason not to.

Cheers,
Steve

From: Doug Davis [mailto:dug@us.ibm.com]
Sent: Thursday, Aug 25, 2005 12:59 PM
To: Winkler, Steve
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005

The wording seems a bit backwards to me. I think that retransmission upon receipt
of a NACK is required except when the message has been previously ACKd. The text
as stated implies that a NACK can be ignored even when the message has not been ACKd.
(Why do I feel like Bill The Cat :-)

I'd prefer something more like:

An RM Source MAY choose to not retransmit the message corresponding to
<Nack> element in cases where it previously received an acknowledgement for
that message.

Although, I also see no reason to not go one more step and say that the RMS
should always ignore the NACK in those cases too. But this isn't a bad compromise.
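
As a rough sketch of the stricter variant (always ignore a Nack for a message that has already been acknowledged), assuming the RMS keeps a set of acknowledged message numbers. The function name and data shapes are hypothetical, not taken from the spec:

```python
def messages_to_resend(nacked: list[int], acked: set[int]) -> list[int]:
    """Return the Nacked message numbers the RMS would still retransmit.

    Numbers already in `acked` are dropped, which covers the case where an
    Ack overtook the Nack on the wire.
    """
    return [n for n in nacked if n not in acked]


# The Ack for message 3 arrived before a Nack naming messages 3 and 5.
print(messages_to_resend([3, 5], {1, 2, 3}))
```

Under the MAY wording proposed above, an RMS would be free to skip this filter and resend message 3 anyway.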

thanks,
-Doug

"Winkler, Steve" <steve.winkler@sap.com>

08/25/2005 02:16 PM

To	"Marc Goodner" <mgoodner@microsoft.com>, <ws-rx@lists.oasis-open.org>
cc
Subject	RE: [ws-rx] Issue i005

Hi Marc,

The more compact wording is fine with me with the exception that you've
changed the SHOULD to a MAY. I personally like the stronger wording,
but understand that it is an optional performance enhancement. I think
the semantic difference between SHOULD and MAY should be decided by the
TC.

I also think that we may need to investigate the ability to advertise
the ability for destinations to trigger retransmissions with Nacks as a
policy. I'll look into this a little more and possibly raise a separate
issue to address this.

Cheers,
Steve

> -----Original Message-----
> From: Marc Goodner [mailto:mgoodner@microsoft.com]
> Sent: Thursday, Aug 25, 2005 9:58 AM
> To: Winkler, Steve; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] Issue i005
>
> Steve,
>
> I propose the following more succinct wording to be used at Section 3.2,
> after line 338 in place of what you have proposed below.
>
> "The RM Source MAY retransmit the message corresponding to the <Nack>
> element when it has not already received an acknowledgement for that
> message."
>
> -----Original Message-----
> From: Winkler, Steve [mailto:steve.winkler@sap.com]
> Sent: Thursday, August 18, 2005 2:11 PM
> To: ws-rx@lists.oasis-open.org
> Subject: [ws-rx] Issue i005
>
>
>
> Issue i005 was originally brought up at the F2F, not necessarily by me,
> but I was an active part of the discussion and therefore would like to
> continue that discussion on the list. I've included some background
> information to expound upon the description and I've followed with a
> concrete proposal that I hope the TC will consider as a starting point
> for resolving the issue.
>
> Cheers,
> Steve
>
>
> Background
>
> The asynchronous nature of acks, as well as line 336/337 of the spec
> indicate that 'The RM Destination MAY send a <SequenceAcknowledgement>
> header block at any point.' In certain cases, this could result in an
> Ack for a given message overtaking a Nack for the same message. The
> rationale given for the Nack is that the gap analysis can be performed
> on the RMD side resulting in performance enhancements (see lines
> 376-378). In the case that the Ack overtakes the Nack, the Nack could
> actually go against the spirit of the spec and result in performance
> degradation by triggering the retransmission of a message that has
> already been received by the RMD. It should be noted that receipt of
> this message by the RMD a second time is in no way an error, but simply
> an unnecessary inefficiency in the protocol for this edge case. Better
> implementations would have probably avoided this anyway.
>
> Several people during the discussion at the F2F mentioned that there may
> be some benefit in Nacking a message to trigger resending of a message
> that has already been received by the RMD. Whereas I can see that this
> may be true, it seems like this kind of functionality would need to
> happen at a layer above RM (i.e. it's out of scope for this spec).
> However, given that messages can be delivered multiple times anyway, it
> would not be the end of the world to retransmit a message that has
> already been delivered to the RMD. Therefore I would not want to
> preclude an implementation that wants to '(ab)use' this fact from being
> allowed to use the RM machinery already in place to achieve this.
>
> I also noticed while investigating this issue that there doesn't seem to
> be any explicit indication in the spec that a Nack message should
> trigger the resend of a message. I believe that this is implied, but
> since it's an optimization, it is also most likely not intended to be a
> requirement.
>
> Proposal: Add text to the spec to explicitly state what an RMS should do
> when it receives a Nack message and tighten the spec for the edge case
> described above in the following manner (wordsmithing can be done later,
> but you get the gist):
>
> In Section 3.2, after line 338 add something like this: 'If the RM
> Source should receive a <SequenceAcknowledgement> containing a Nack, the
> RM Source SHOULD retransmit the message corresponding to the Nack. After
> the notification of successful receipt of a given message by the RM
> Destination, the RM Source SHOULD NOT attempt to retransmit the message
> in the event that it receives a negative acknowledgement for it at a
> later point in time.'
>
>
> ------------------
> Steve Winkler
> SAP AG
>
