Subject: RE: [ws-rx] Issue i022, RM Assertions
It's easy to pick some values for which the protocol won't work:

InactivityTimeout = {no value}
BaseRetransmissionInterval = {no value}

Since "no value" does not imply any default, it's possible that the RMS
will settle on a retransmission interval of 5 seconds while the RMD
decides to use an inactivity timeout of 3 seconds. The protocol will now
work only if there are no lost messages. On the first lost message the
RMS will wait 5 seconds before retransmitting, by which time the RMD
will have terminated the sequence due to inactivity.

- g

> -----Original Message-----
> From: Tom Rutt [mailto:tom@coastin.com]
> Sent: Thursday, October 20, 2005 3:23 PM
> To: vikas@sonoasystems.com
> Cc: 'Bob Freund'; ws-rx@lists.oasis-open.org
> Subject: Re: [ws-rx] Issue i022, RM Assertions
>
> Vikas Deolaliker wrote:
>
> My comments are inline:
>
> > Bob,
> >
> > We are all pessimistic; that is why we are trying to add a layer of
> > reliability on top of TCP.
>
> TCP is under an HTTP request/response. However, if the TCP connection
> goes down before the response is received, HTTP has no way to recover.
> This is where WS Reliable Messaging comes into play.
>
> > BTW, we are in agreement but it looks like it is a violent one.
> >
> > I agree with you that these parameters are not static and so the
> > assertion mechanism is the wrong way to introduce them into any
> > reliable exchange system. Where they should be introduced is in the
> > mechanisms of the system which are dynamic. So ideally this should be
> > part of the protocol.
>
> The protocol works regardless of the parameter values.
>
> As long as the RMS re-transmits until it gets an ack response for a
> message, the protocol will work.
>
> Tom Rutt
>
> > If we agree to the above, again I agree with you that the two ends
> > cannot declare these parameters statically but must continuously
> > adjust them based on traffic pattern.
> > RFC 1323 is a good starting point, but the mechanisms that we borrow
> > from it should be part of the core protocol.
> >
> > So I guess I am agreeing with you on everything but I get the feeling
> > people think our views are divergent.
> >
> > Vikas
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > *Sent:* Thursday, October 20, 2005 1:42 PM
> > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> >
> > Vikas,
> >
> > Are you optimistic or pessimistic?
> >
> > If you are optimistic, and you expect that messages will usually be
> > received and acknowledgements not lost, then you might consider that
> > re-transmissions are part of error recovery.
> >
> > Examples of optimistic systems might include non-blocking crossbar
> > connected multiprocessors, or even what we have come to expect via
> > normal high speed wired network connectivity; examples of pessimistic
> > systems include most radio based communications.
> >
> > A pessimistic system might operate with a high re-try ratio.
> >
> > It is also true that the optimal parameters may change during the
> > duration of a connection.
> >
> > The problem is that nobody knows, in the general case, what is the
> > nature of the stuff between sender and receiver. This implies that
> > what is needed is a mechanism that can cover the range of
> > possibilities.
> >
> > As far as I know, no static parameter set can cover these
> > eventualities. That is why I propose that a mechanism similar to that
> > which is used in RFC 1323 is indicated. In that mechanism, the
> > parameters are learned through a moving average mechanism based on
> > actual measured response times.
> >
> > Other systems, such as my cross-bar interconnected multiprocessors,
> > would possibly use a hardware assisted mechanism.
> >
> > Thanks
> >
> > -bob
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > *Sent:* Thursday, October 20, 2005 4:13 PM
> > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> >
> > Bob,
> >
> > I realize it is not like the flow control done in the lower layers of
> > transports.
> >
> > But nonetheless it is a control of flow, because when you implement
> > it, it affects the exchange of messages between RMS and RMD. And when
> > it goes wrong, you compromise reliable exchange, which is the purpose
> > of this spec.
> >
> > Vikas
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > *Sent:* Thursday, October 20, 2005 12:13 PM
> > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> >
> > Vikas,
> >
> > I don't think that these parameters have much to do with traditional
> > flow control, just the re-try behaviors of each end.
> >
> > The delay times and intervals are not interoperability concerns as
> > much as they are path and performance optimizations.
> >
> > Flow control, if I understand you correctly, is something like the
> > SDLC mechanism of RR/RNR whereby the sender could ask if the receiver
> > was ready to receive a message or not. This tends to be part of much
> > lower level protocols involving physical transport. The spec assumes
> > that there is some sort of transport underneath that has the
> > responsibility of managing what one might call (to coin a term) the
> > link layer or transport layer.
> >
> > This protocol depends on retry to achieve reliability, but the timing
> > characteristics only have the impact of swamping the channel if too
> > short and reducing error recovery and thus performance if too long.
> >
> > Even with negotiated algorithms, it is not predictable what the error
> > rates on the channels might be. Oftentimes, errors are bursty and what
> > works very well under normal circumstances will fail during a burst.
> > This is what we applied in the discussion leading to RFC 1323.
> >
> > No, I believe that the parameters BaseRetransmission,
> > ExponentialBackoff and AcknowledgementInterval all need to be removed,
> > but a discussion of retransmission should be added to the base spec;
> > otherwise, we don't have a reliability protocol.
> >
> > Thanks
> >
> > -bob
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > *Sent:* Thursday, October 20, 2005 1:58 PM
> > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> >
> > Bob,
> >
> > I agree with you insofar as implementers will implement flow control
> > in ways suitable for them. Ideally, what is needed is a mechanism for
> > the RMS and RMD to negotiate and agree upon a flow control algorithm.
> > Part of this negotiation would entail exchange of schema related to
> > the parameters necessary to follow the algorithm. Should such a
> > mechanism be created by this WG, it should then be part of the core
> > reliable messaging protocol and not in the assertions model.
> >
> > Vikas
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > *Sent:* Thursday, September 22, 2005 7:26 AM
> > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> >
> > Retransmission parameters as well as algorithms are problematic for
> > the following reasons:
> >
> > 1) The characteristics of the path from source to destination are
> > often unknown and often are time-variant.
> >
> > 2) Retransmissions, if too frequent, cause flooding and potential
> > catastrophic degradation if the path is near saturation.
> >
> > 3) The path may consist of not only transmission means, but also
> > intermediaries with attendant processing delays.
> >
> > 4) Exponential backoff may be implemented many ways; there is more
> > than one algorithm and they have different parameters.
> >
> > 5) Backoff algorithm selection may be implementation specific; what is
> > good for cell phones may not be good for cluster interconnected nodes.
> >
> > 6) I have found no theoretical modeling available of the case of web
> > services cum intermediaries.
> >
> > 7) Most published data concerning the behavior of backoff algorithms
> > examine fairly simple network segment related saturation and do not
> > address client or server saturation, let alone intermediary
> > saturation.
> >
> > 8) Exponential backoff algorithms need a recovery mechanism for those
> > situations where there is a high standard deviation of delay.
> >
> > 9) TCP/IP experience has shown that efficiencies are improved with an
> > adaptive mechanism as described in TCP Extensions for High Performance
> > (see RFC 1323 RTTM).
> >
> > Proposal:
> >
> > Clearly a backoff mechanism is required; however, implementation
> > specific needs are not served well by the selection of any specific
> > algorithm for all potential implementations of this specification. It
> > is recommended that implementers utilizing IP based transmission media
> > consider the mechanism described in RFC 1323. Delete all
> > re-transmission parameters as described in the specification since
> > they are unnecessary and unhelpful should the implementer use an
> > algorithm with a different set of controls.
> >
> > Thanks
> >
> > -bob
> >
> > ------------------------------------------------------------------------
> >
> > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > *Sent:* Thursday, August 04, 2005 9:53 AM
> > *To:* ws-rx@lists.oasis-open.org
> > *Subject:* [ws-rx] Issue i022, RM Assertions
> >
> > Description:
> >
> > (revised)
> >
> > The RM policy assertions, specifically the InActivityTimeout,
> > BaseRetransmissionInterval and ExponentialBackoff parameters, need to
> > be more finely specified.
> >
> > The following are the areas which need finer specification:
> >
> > a) Default Value for InActivityTimeout, BaseRetransmissionInterval and
> > ExponentialBackoff:
> >
> > There needs to be a default set for these parameters. Currently the
> > specification says "If omitted, there is no implied value." Since
> > these parameters dictate the delivery of the message, an
> > implementation is going to assume a default anyway. Not specifying
> > this will make implementations assume different default values and
> > cause unwanted timeouts.
> >
> > b) Definition of InActivity
> >
> > There needs to be a discussion of the definition of inactivity. If the
> > RMS sends a sequence to the RMD and is waiting for the response, which
> > is delayed for whatever reason, is that inactivity on the link between
> > RMS and RMD counted towards InActivityTimeout? If yes, then it is
> > entirely possible that while waiting for a sequence response, the RMS
> > could time out due to InActivity.
> >
> > c) Applicability of InActivityTimeout:
> >
> > It needs to be specified to which end this parameter is applicable. It
> > seems like the sequence creator starts the timer for
> > InActivityTimeout. If the intention is that this timer exists on both
> > ends of a sender and receiver engaged in an RM sequence, we need to
> > define a method for synchronization of the timer value of this
> > parameter between them.
> > For example, a KeepAlive message would need to be defined for keeping
> > the sequence alive.
> >
> > d) Corner Case Handling:
> >
> > There needs to be a discussion of the corner case when the
> > BaseRetransmissionInterval exceeds InActivityTimeout. This can happen
> > when the RMD is indisposed and ExponentialBackoff drives up the value
> > of BaseRetransmissionInterval. In this case my retransmission is
> > scheduled later than the timeout that I need to abide by. What state
> > does the RMS enter in this situation?
> >
> > e) BaseRetransmissionInterval Needs an Upper Bound:
> >
> > If an RMD is offline for an extended period of time, one can expect
> > the BaseRetransmissionInterval to be exponentially backed off, i.e.
> > become large enough to be not meaningful anymore. Having an upper
> > bound on this parameter will enable the RMS to stop retransmitting and
> > report a fault.
> >
> > Proposal:
> >
> > (revised)
> >
> > 1) InActivityTimeout and BaseRetransmissionInterval can be merged into
> > one, i.e. BaseRetransmissionTimeout. Having just one counter on the
> > RMS and RMD will reduce the run-time resources (much simpler state
> > machine) required to implement RM-Assertions and avoid confusion
> > (unknown states in state machine) caused by two timeouts. Having a
> > separate timeout for sequence and retransmission may not be necessary,
> > as activity on the RM link is transmission/retransmission. I believe
> > one timeout, i.e. BaseRetransmissionTimeout, does not change the
> > behavior of the system. Once this timeout occurs the sequence has to
> > time out, as the implication of the timeout is that the destination is
> > either congested or offline.
> >
> > 2) If InActivityTimeout has to be there as a parameter, we need to
> > fully specify it with mechanisms for synchronization and keepalive.
> > In addition, we need to discuss how the corner cases and other
> > conflicts that occur when one has two timeouts (as discussed in a-e
> > above) are handled.
> >
> > Vikas
> >
> > Sonoa Systems, Inc.
> >
> > 3900 Freedom Circle, Suite #101
> >
> > Santa Clara, CA 95054
> >
> > (408) 748-1730 x100
>
> --
> ----------------------------------------------------
> Tom Rutt email: tom@coastin.com; trutt@us.fujitsu.com
> Tel: +1 732 801 5744 Fax: +1 732 774 5133
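
The mismatch described at the top of this message (an RMS retransmission
interval of 5 seconds against an RMD inactivity timeout of 3 seconds) can
be sketched as a small timeline model. This is only an illustration under
stated assumptions, not anything defined by the specification: the names
`Outcome` and `simulate_lost_message` are hypothetical, network latency is
ignored, the RMS retransmits at a fixed interval with no backoff, and the
RMD's inactivity timer is assumed to start at the moment the first (lost)
transmission is sent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    delivered: bool                 # did a (re)transmission reach the RMD in time?
    delivered_at: Optional[float]   # seconds after the first send, if delivered
    terminated_at: Optional[float]  # seconds at which the RMD gave up, if it did


def simulate_lost_message(retransmission_interval: float,
                          inactivity_timeout: float,
                          losses: int = 1) -> Outcome:
    """Hypothetical model of one message whose first `losses` attempts are lost.

    Assumptions (not from the spec): zero network latency, a fixed
    retransmission interval, and an RMD inactivity timer that starts at t=0.
    """
    # Attempt k (0-based) is sent at k * retransmission_interval, so the
    # first attempt that is not lost arrives at this time.
    arrival = losses * retransmission_interval

    # If the RMD's inactivity timer fires first (or at the same instant,
    # which this sketch treats as a timeout), the sequence is terminated.
    if arrival >= inactivity_timeout:
        return Outcome(delivered=False, delivered_at=None,
                       terminated_at=inactivity_timeout)
    return Outcome(delivered=True, delivered_at=arrival, terminated_at=None)


if __name__ == "__main__":
    # The values from the message: RMS picks 5 s, RMD picks 3 s.
    print(simulate_lost_message(retransmission_interval=5, inactivity_timeout=3))
    # A compatible pair: the retransmission lands before the timer fires.
    print(simulate_lost_message(retransmission_interval=2, inactivity_timeout=3))
```

With no lost messages (`losses=0`) the model delivers at t=0 for any pair
of values, which matches the observation that the protocol then works only
as long as nothing is lost.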