Subject: RE: [ws-rx] Issue i022, RM Assertions
Vikas,

Are you optimistic or pessimistic? If you are optimistic, and you expect that
messages will usually be received and acknowledgements not lost, then you might
consider re-transmissions to be part of error recovery. Examples of optimistic
systems might include non-blocking crossbar connected multiprocessors, or even
what we have come to expect from normal high speed wired network connectivity.
Examples of pessimistic systems include most radio based communications. A
pessimistic system might operate with a high re-try ratio. It is also true that
the optimal parameters may change during the lifetime of a connection.

The problem is that nobody knows, in the general case, the nature of the stuff
between sender and receiver. This implies that what is needed is a mechanism
that can cover the range of possibilities. As far as I know, no static
parameter set can cover these eventualities. That is why I propose a mechanism
similar to the one used in RFC 1323. In that mechanism, the parameters are
learned through a moving average based on actual measured response times.
Other systems, such as my cross-bar interconnected multiprocessors, would
possibly use a hardware assisted mechanism.

Thanks
-bob

From: Vikas Deolaliker
[mailto:vikas@sonoasystems.com]

Bob,

I realize it is not like the flow control done in the lower layers of
transports. But nonetheless it is a control of flow, because when you implement
it, it affects the exchange of messages between RMS and RMD. And when it goes
wrong, you compromise reliable exchange, which is the purpose of this spec.

Vikas

From: Bob Freund [mailto:

Vikas,

I don’t think that these parameters
have much to do with traditional flow control, just the re-try behaviors of
each end. The delay times and intervals are not interoperability concerns as
much as they are path and performance optimizations.

Flow control, if I understand you correctly, is something like the SDLC
mechanism of RR/RNR, whereby the sender could ask if the receiver was ready to
receive a message or not. This tends to be part of much lower level protocols
involving physical transport. The spec assumes that there is some sort of
transport underneath that has the responsibility of managing what one might
call (to coin a term) the link layer or transport layer. This protocol depends
on retry to achieve reliability, but the timing characteristics have only two
effects: swamping the channel if too short, and slowing error recovery and thus
performance if too long.

Even with negotiated algorithms, it is not predictable what the error rates on
the channels might be. Oftentimes, errors are bursty and what works very well
under normal circumstances will fail during a burst. This is what we applied in
the discussion leading to RFC 1323.

No, I believe that the parameters BaseRetransmission, ExponentialBackoff and
AcknowledgementInterval all need to be removed, but a discussion of
retransmission should be added to the base spec; otherwise, we don’t have a
reliability protocol.

Thanks
-bob

From: Vikas Deolaliker
[mailto:vikas@sonoasystems.com]

Bob,

I agree with you in so far as implementers will implement flow control in ways
suitable for them. Ideally, what is needed is a mechanism for the RMS and RMD
to negotiate and agree upon a flow control algorithm. Part of this negotiation
would entail exchange of schema related to parameters necessary to follow the
algorithm. Should such a mechanism be created by this WG, it should then be
part of the core reliable messaging protocol and not in the assertions model.

Vikas

From: Bob Freund
[mailto:

Retransmission parameters as well as
algorithms are problematic for the following reasons:

1) The characteristics of the path from source to destination are often
   unknown and often are time-variant.
2) Retransmissions, if too frequent, cause flooding and potentially
   catastrophic degradation if the path is near saturation.
3) The path may consist of not only transmission means, but also
   intermediaries with attendant processing delays.
4) Exponential backoff may be implemented many ways; there is more than one
   algorithm and they have different parameters.
5) Backoff algorithm selection may be implementation specific; what is good
   for cell phones may not be good for cluster interconnected nodes.
6) I have found no theoretical modeling available of the case of web services
   cum intermediaries.
7) Most published data concerning the behavior of backoff algorithms examine
   fairly simple network segment related saturation and do not address client
   or server saturation, let alone intermediary saturation.
8) Exponential backoff algorithms need a recovery mechanism for those
   situations where there is a high standard deviation of delay.
9) TCP/IP experience has shown that efficiencies are improved with an adaptive
   mechanism as described in TCP Extensions for High Performance (see RFC 1323
   RTTM).

Proposal:

Clearly a backoff mechanism is required; however, implementation specific needs
are not served well by the selection of any specific algorithm for all
potential implementations of this specification. It is recommended that
implementers utilizing IP based transmission media consider the mechanism
described in RFC 1323. Delete all re-transmission parameters as described in
the specification, since they are unnecessary and unhelpful should the
implementer use an algorithm with a different set of controls.

Thanks
-bob

From: Vikas Deolaliker
[mailto:vikas@sonoasystems.com]

Description: (revised)

The RM policy assertions, specifically the InActivityTimeout,
BaseRetransmissionInterval and ExponentialBackoff parameters, need to be more
finely specified. The following are the areas which need finer specification:

a) Default Value for InActivityTimeout, BaseRetransmissionInterval and
   ExponentialBackoff:
   There needs to be a default set for these parameters. Currently the
   specification says “If omitted, there is no implied value.” Since these
   parameters dictate the delivery of the message, an implementation is going
   to assume a default anyway. Not specifying one will lead implementations to
   assume different default values and cause unwanted timeouts.

b) Definition of InActivity:
   There needs to be a discussion of the definition of inactivity. If the RMS
   sends a sequence to the RMD and is waiting for a response which is delayed
   for whatever reason, is that inactivity on the link between RMS and RMD
   counted towards InActivityTimeout? If yes, then it is entirely possible
   that while waiting for a sequence response, the RMS could time out due to
   InActivity.

c) Applicability of InActivityTimeout:
   It needs to be specified to which end this parameter is applicable. It
   seems like the sequence creator starts the timer for InActivityTimeout. If
   the intention is that this timer exists on both ends of a sender and
   receiver engaged in an RM sequence, we need to define a method for
   synchronization of the timer value of this parameter between them. For
   example, a KeepAlive message would need to be defined for keeping the
   sequence alive.

d) Corner Case Handling:
   There needs to be a discussion of the corner case when the
   BaseRetransmissionInterval exceeds InActivityTimeout. This can happen when
   the RMD is indisposed and ExponentialBackoff drives up the value of
   BaseRetransmissionInterval. In this case my retransmission is scheduled
   later than the timeout that I need to abide by. What state does the RMS
   enter in this situation?

e) BaseRetransmissionInterval Needs an Upper Bound:
   If an RMD is offline for an extended period of time, one can expect the
   BaseRetransmissionInterval to be exponentially backed off, i.e. become
   large enough to be not meaningful anymore. Having an upper bound on this
   parameter will enable the RMS to stop retransmitting and report a fault.

Proposal: (revised)

1) InActivityTimeout
and BaseRetransmissionInterval can be merged into one, i.e.
BaseRetransmissionTimeout. Having just one counter on the RMS and RMD will
reduce the run-time resources (much simpler state machine) required to
implement RM-Assertions and avoid confusion (unknown states in the state
machine) caused by two timeouts. Having separate timeouts for sequence and
retransmission may not be necessary, as activity on the RM link is
transmission/retransmission. I believe having one timeout, i.e.
BaseRetransmissionTimeout, does not change the behavior of the system. Once
this timeout occurs the sequence has to time out, as the implication of the
timeout is that the destination is either congested or offline.

2) If InActivityTimeout has to be there as a parameter, we need to fully
specify it with mechanisms for synchronization and keepalive. In addition, we
need to discuss how the corner cases and other conflicts that occur when one
has two timeouts (as discussed in a-e above) are handled.

Vikas
Sonoa Systems, Inc.
(408) 748-1730 x100
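
For illustration only, the adaptive approach discussed in this thread (an
interval learned from a moving average of measured response times, with
exponential backoff that is bounded above) might be sketched in Python as
follows. The class name AdaptiveRetransmitTimer, its methods, the smoothing
constants (the customary TCP retransmission timer values) and the interval
bounds are all assumptions made for this sketch; none of them are defined by
the WS-RM specification or proposed by either author.

```python
class AdaptiveRetransmitTimer:
    """Hypothetical helper: learns a retransmission interval from measured
    response times instead of relying on static policy parameters."""

    ALPHA = 0.125  # weight of a new sample in the smoothed response time
    BETA = 0.25    # weight of a new sample in the deviation estimate
    K = 4          # how many deviations of headroom to add

    def __init__(self, initial_interval=1.0, min_interval=0.2, max_interval=60.0):
        # All three defaults (in seconds) are assumed values for illustration.
        self.srtt = None      # smoothed response time, unknown until first ack
        self.rttvar = None    # smoothed deviation of the response time
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.interval = initial_interval

    def _clamp(self, value):
        # Keep the interval inside [min_interval, max_interval].
        return max(self.min_interval, min(self.max_interval, value))

    def on_ack(self, measured_rtt):
        """Fold one measured response time (seconds) into the moving averages.
        The sample should come from an unambiguous send/acknowledgement pair."""
        if self.srtt is None:
            self.srtt = measured_rtt
            self.rttvar = measured_rtt / 2
        else:
            # Update the deviation first, using the previous smoothed value.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - measured_rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * measured_rtt
        self.interval = self._clamp(self.srtt + self.K * self.rttvar)

    def on_timeout(self):
        """No acknowledgement arrived in time: double the interval, but never
        beyond the upper bound."""
        self.interval = self._clamp(self.interval * 2)

    def at_upper_bound(self):
        """True once backoff has hit the cap, a point at which a sender could
        stop retransmitting and report a fault."""
        return self.interval >= self.max_interval


timer = AdaptiveRetransmitTimer()
for sample in (0.10, 0.12, 0.11, 0.50):  # made-up response times in seconds
    timer.on_ack(sample)
timer.on_timeout()  # one lost acknowledgement doubles the learned interval
```

The cap plays the role of the upper bound asked for in item e): once
at_upper_bound() is true, further backoff has no effect, and what the sender
does next (for example, faulting the sequence) is left to the implementation.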