From: Doug Davis
[mailto:dug@us.ibm.com]
Sent: Thursday, August 25, 2005 8:36 PM
To: ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005
+1, LOL, another fan of "nudge" :-)
thanks,
-Doug
"Winkler, Steve"
<steve.winkler@sap.com>
08/25/2005 08:23 PM
To: "Giovanni Boschi" <gboschi@sonicsoftware.com>, Doug Davis/Raleigh/IBM@IBMUS
cc: "Marc Goodner" <mgoodner@microsoft.com>, <ws-rx@lists.oasis-open.org>
Subject: RE: [ws-rx] Issue i005
Hi Giovanni,
I don't think we're all that far off from agreement, but I do
have a few responses below. I'll keep gathering everyone's opinions and
try to make a proposal that will meet everyone's needs (but not necessarily
their wants;-).
Cheers,
Steve
From: Giovanni Boschi
[mailto:gboschi@sonicsoftware.com]
Sent: Thursday, Aug 25, 2005 5:00 PM
To: Winkler, Steve; Doug Davis
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005
My interpretation of the spec as it stands is also that a
Nack may be ignored, and I think that is a good thing.
Let me phrase it differently: what behavior can the RMD
reliably expect to *observe* from
the RMS as a result of the RMD sending the RMS a Nack? The answer is
simple: Nothing. And no language or semantics we discussed today would
change that - the reason is that the Nack itself is unreliable, and it may
never make it to the source at all. And, of course, even if the Nack does
make it to the RMS, the resend may not make it to the RMD. So,
fundamentally, the RMD can hope for, but cannot expect, to receive a copy of
the message in response to a Nack. Given that the RMD cannot depend on
it, it can’t possibly be an interop issue, and I see no reason not to
allow the RMS to ignore it, in effect behaving as if it never saw it, for its
own reasons. And it may have some very good reasons, some of which
I’ll mention below.
<sw>Acks also aren't guaranteed, but that doesn't make
them any less useful. A nack is used by the RMD to indicate to the RMS
that it still needs some message(s) in a sequence. There is the
(potentially expensive) possibility that the RMD must delay delivery of
messages until these messages have been received, and nudging the RMS to send
these messages ahead of their previously scheduled retransmission time is
something that I find very useful.</sw>
The question came up as to whether this is testable behavior,
and I would maintain that it is not – at first it appears testable, if
only in a contrived environment where the entire transport distance maintains
order and does not lose messages – that is, only in an environment where
the whole point of WSRM is moot. But then, of course, messages would not
be lost or reordered, so why would the RMD be sending a Nack in the first
place? Ignore that for a minute, and assume that the RMD did send a Nack,
and did receive a message soon thereafter; how does it know that it was
resent as a result of the Nack and not due to the regularly scheduled
retransmit interval?
<sw>It doesn't. But the point of the nack is to
indicate that the RMS should send a message as soon as possible. It can
certainly wait until it is ready, whether this is based on a backoff algorithm
or just when it has the necessary resources. The nack is a hint that the
RMD is missing some messages, nothing more.</sw>
Regarding the intent to “encourage 'better' implementations that follow recommended
optimization practices”, what are the
grounds for assuming that resending immediately is a “better”
optimization? What, after all, is being optimized? Is it latency?
Transmission costs? Disk space at the RMD? What if the RMS
knows that its next scheduled retransmit interval is 10ms away and would rather
just wait? What if the RMS simply cannot retransmit because it needs to
wait for its next satellite uplink window which is 10min away? What if
the RMS is operating at the bandwidth threshold of its network SLA, and the premature retransmit will cost it billable
dollars at overage rates? In my experience, there is no reason to assume
that RMS and RMD, particularly if they belong to different organizations, are
going to be interested in optimizing the same thing.
<sw>IMO, the purpose of the RM spec is to get a message
from point A to point B and guarantee that that happens. Optimizing this
is, indeed, in the eye of the beholder. An optimal solution from an RMS
POV may be very different from that of an RMD point of view. We need to
put in place the framework to make both happy though, which even means allowing
the two parties to work together if they want to.</sw>
The proposal would hand over control of retransmit policy
entirely to the RMD, in the hope that it would use it wisely - without any
suggestion of what that means. Consider a half-duplex failure of the
transport layer (e.g. a reverse proxy firewall is down, or a satellite connected
node has downlink but no uplink?), and RMD can send to RMS, but not the other
way around. The RMS sends a few times and gets no Acks, and starts
backing off exponentially; and then starts receiving regular, perhaps frequent
Nacks from the RMD which has no clue what is going on. As proposed, the
RMS would have to cancel its backoff algorithm, and start retransmitting on
every Nack, even after it had concluded that it should back off because
previous resends had obviously not been successful. Why is this good?
Or would we then raise an issue to clarify that, in this specific case,
the RMS is allowed to ignore the Nacks?
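To make the half-duplex scenario concrete, here is a minimal Python sketch of an RMS that keeps its exponential backoff and treats a Nack purely as a hint. Everything in it (the class name BackoffState, the intervals, and the nack_cutoff threshold) is a hypothetical illustration, not something taken from the spec or from any proposal in this thread.

```python
from dataclasses import dataclass


@dataclass
class BackoffState:
    """Hypothetical per-message retransmit state kept by an RM Source."""
    base_interval: float = 1.0   # seconds before the first retransmit (assumed)
    max_interval: float = 60.0   # cap on the backoff interval (assumed)
    failures: int = 0            # consecutive sends that drew no Ack
    nack_cutoff: int = 3         # after this many failures, Nacks are ignored (assumed policy)

    def interval(self) -> float:
        """Current wait before the next scheduled retransmission."""
        return min(self.base_interval * (2 ** self.failures), self.max_interval)

    def on_unacked_send(self) -> None:
        """A send went unacknowledged: back off further."""
        self.failures += 1

    def on_ack(self) -> None:
        """An Ack arrived, so the path works again: reset the backoff."""
        self.failures = 0

    def should_resend_on_nack(self) -> bool:
        """Treat the Nack as a hint: resend early only while the RMS
        still believes its sends are getting through."""
        return self.failures < self.nack_cutoff
```

With this policy, an RMS whose last three sends went unacknowledged keeps waiting out its backoff interval no matter how many Nacks arrive, and resumes honoring Nacks once an Ack shows that the forward path is working again.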
<sw>How? I don't see at all how the proposal
would hand over control of retransmission to the RMD. The RMS still has
its timed retransmission policy. The nacks are simply used as hints to
the RMS that it would be appreciated if they sent message xyz sooner rather
than later because the RMD is consuming resources while waiting for its
arrival.</sw>
I do not believe that using the word “SHOULD”
avoids the above issues – the definition of SHOULD in RFC 2119, but more
generally the use of any RFC2119 terminology, clearly implies that the behavior
is a “requirement” that should be met except perhaps for niche
situations, and I disagree strongly with that implication. The Nack is
currently a hint; not a noop, but a hint, and the choice whether to resend is
entirely at the RMS’ discretion. If the issue is to clarify that
fact, then the previous sentence, or anyone else’s version, seems like it
would do. If the proposal is to make it a binding contract on the RMS, I
would strongly oppose it, for all the above reasons.
<sw>Define a niche situation. That seems rather
vague and open to interpretation to me. I agree that the nack is a hint,
but a hint with well defined semantics that says that the RMD would be better
off if it could get some of these messages sooner. In fact, the RMS would
probably be better off too because there are messages in the sequence that it
had created that are hanging around on the receiving side and not being
processed. If it's in the best interest of the RMS to have those messages
processed in a timely manner, then it had better well get the missing message
over to the other side.</sw>
G.
From: Winkler, Steve
[mailto:steve.winkler@sap.com]
Sent: Thursday, August 25, 2005 4:20 PM
To: Doug Davis
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005
Hi Doug,
As I mentioned, the actual wording can be left to the editors
as long as the semantics have been agreed upon. The proposal was sent
simply as a starting point for discussion. My interpretation of the spec
as it stands is that a nack can be ignored. The proposal was to add words
to state that messages should be retransmitted in response to a Nack, but given
that nacks are intended only for performance optimization, I don't think that
this behavior should be required. I do think, however, that we should
encourage 'better' implementations that follow recommended optimization
practices unless a specific implementation instance has sufficient reason not
to.
Cheers,
Steve
From: Doug Davis
[mailto:dug@us.ibm.com]
Sent: Thursday, Aug 25, 2005 12:59 PM
To: Winkler, Steve
Cc: Marc Goodner; ws-rx@lists.oasis-open.org
Subject: RE: [ws-rx] Issue i005
The wording seems a bit backwards to me. I think that retransmission upon receipt
of a NACK is required except when the message has been previously ACKd. The text
as stated implies that a NACK can be ignored even when the message has not been ACKd.
(Why do I feel like Bill The Cat :-)
I'd prefer something more like:
An RM Source MAY choose to not retransmit the message corresponding to a
<Nack> element in cases where it previously received an acknowledgement for
that message.
Although, I also see no reason to not go one more step and say that the RMS
should always ignore the NACK in those cases too. But this isn't a bad
compromise.
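As a rough illustration of the behavior this wording describes, the Python sketch below has an RM Source skip any Nacked message it has already seen acknowledged. The class RMSource and its method names are hypothetical, and whether the remaining messages are resent under MAY, SHOULD, or a hard requirement is exactly the open question in this thread; the sketch simply returns them as candidates.

```python
class RMSource:
    """Hypothetical RM Source that tracks which messages are still unacknowledged."""

    def __init__(self):
        # message number -> payload, for every message not yet acknowledged
        self.unacked = {}

    def send(self, number, payload):
        """Record the message as outstanding; the actual transport is out of scope here."""
        self.unacked[number] = payload

    def on_ack(self, numbers):
        """Forget messages the RM Destination has acknowledged."""
        for n in numbers:
            self.unacked.pop(n, None)

    def on_nack(self, numbers):
        """Return (number, payload) pairs that are candidates for retransmission.

        A Nack for a message that was already acknowledged is ignored,
        which covers the case where an Ack and a Nack arrive out of order.
        """
        return [(n, self.unacked[n]) for n in numbers if n in self.unacked]
```

For example, after sending messages 1 and 2 and receiving an Ack for 1, a Nack naming both 1 and 2 yields only message 2 as a retransmission candidate.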
thanks,
-Doug
"Winkler, Steve"
<steve.winkler@sap.com>
08/25/2005 02:16 PM
To: "Marc Goodner" <mgoodner@microsoft.com>, <ws-rx@lists.oasis-open.org>
cc:
Subject: RE: [ws-rx] Issue i005
Hi Marc,
The more compact wording is fine with me with the exception that you've
changed the SHOULD to a MAY. I personally like the stronger wording,
but understand that it is an optional performance enhancement. I think
the semantic difference between SHOULD and MAY should be decided by the
TC.
I also think that we may need to investigate the ability to advertise
the ability for destinations to trigger retransmissions with Nacks as a
policy. I'll look into this a little more and possibly raise a separate
issue to address this.
Cheers,
Steve
> -----Original Message-----
> From: Marc Goodner [mailto:mgoodner@microsoft.com]
> Sent: Thursday, Aug 25, 2005 9:58 AM
> To: Winkler, Steve; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] Issue i005
>
> Steve,
>
> I propose the following more succinct wording to be used at
> Section 3.2,
> after line 338 in place of what you have proposed below.
>
> "The RM Source MAY retransmit the message corresponding to the <Nack>
> element when it has not already received an acknowledgement for that
> message."
>
> -----Original Message-----
> From: Winkler, Steve [mailto:steve.winkler@sap.com]
> Sent: Thursday, August 18, 2005 2:11 PM
> To: ws-rx@lists.oasis-open.org
> Subject: [ws-rx] Issue i005
>
>
>
> Issue i005 was originally brought up at the F2F, not
> necessarily by me,
> but I was an active part of the discussion and therefore would like to
> continue that discussion on the list. I've included some background
> information to expound upon the description and I've followed with a
> concrete proposal that I hope the TC will consider as a starting point
> for resolving the issue.
>
> Cheers,
> Steve
>
>
> Background
>
> The asynchronous nature of acks, as well as line 336/337 of the spec
> indicate that 'The RM Destination MAY send a <SequenceAcknowledgement>
> header block at any point.' In certain cases, this could result in an
> Ack for a given message overtaking a Nack for the same message. The
> rationale given for the Nack is that the gap analysis can be performed
> on the RMD side resulting in performance enhancements (see lines
> 376-378). In the case that the Nack overtakes an Ack, the Nack could
> actually go against the spirit of the spec and result in performance
> degradation by triggering the retransmission of a message that has
> already been received by the RMD. It should be noted that receipt of
> this message by the RMD a second time is in no way an error,
> but simply
> an unnecessary inefficiency in the protocol for this edge
> case. Better
> implementations would have probably avoided this anyway.
>
> Several people during the discussion at the F2F mentioned
> that there may
> be some benefit in Nacking a message to trigger resending of a message
> that has already been received by the RMD. Whereas I can see
> that this
> may be true, it seems like this kind of functionality would need to
> happen at a layer above RM (i.e. it's out of scope for this spec).
> However, given that messages can be delivered multiple times
> anyway, it
> would not be the end of the world to retransmit a message that has
> already been delivered to the RMD. Therefore I would not want to
> preclude an implementation that wants to '(ab)use' this fact
> from being
> allowed to use the RM machinery already in place to achieve this.
>
> I also noticed while investigating this issue that there
> doesn't seem to
> be any explicit indication in the spec that a Nack message should
> trigger the resend of a message. I believe that this is implied, but
> since it's an optimization, it is also most likely not
> intended to be a
> requirement.
>
> Proposal: Add text to the spec to explicitly state what an
> RMS should do
> when it receives a Nack message and tighten the spec for the edge case
> described above in the following manner (wordsmithing can be
> done later,
> but you get the gist):
>
> In Section 3.2, after line 338 add something like this: 'If the RM
> Source should receive a <SequenceAcknowledgement> containing a Nack, the
> RM Source SHOULD retransmit the message corresponding to the Nack. After
> the notification of successful receipt of a given message by the RM
> Destination, the RM Source SHOULD NOT attempt to retransmit the message
> in the event that it receives a negative acknowledgement for it at a
> later point in time.'
>
>
> ------------------
> Steve Winkler
> SAP AG
>