
Subject: RE: [ws-rx] Issue i022, RM Assertions



If you would allow me to tease this issue out a little further ...

In the scenario described below, I am not sure it is correct to say
that the protocol has not worked. It seems to have worked in the sense
that it did not enter an indeterminate state. Sure, it could have
worked better with optimized parameter values, but that is not the
same as a failure of the protocol.
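
A minimal sketch of the scenario quoted below, assuming zero network
latency and no other traffic on the sequence (both are simplifying
assumptions, and the function name is hypothetical). It only shows that
the outcome is determinate either way:

```python
def outcome_after_one_loss(retransmission_interval: float,
                           inactivity_timeout: float) -> str:
    """Outcome when a single message is lost at t=0 (times in seconds).

    Simplifying assumptions (not from the spec): no network latency, and
    the retransmission is the only traffic that could reset the RMD's
    inactivity timer.
    """
    # The RMD last saw activity at t=0. The RMS retransmits the lost
    # message at t=retransmission_interval. If the RMD's timer expires
    # first (or at the same instant), the sequence is already gone.
    if retransmission_interval < inactivity_timeout:
        return "recovered"
    return "sequence terminated"


print(outcome_after_one_loss(5.0, 3.0))
print(outcome_after_one_loss(2.0, 3.0))
```

With the 5-second and 3-second values from the quoted example, this
reports a terminated sequence rather than an undefined state.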

One might argue that the RMD and RMS can learn and dynamically adapt
their parameter values to optimize the protocol behavior. For instance,
the RMS in the following scenario might get a sequence-terminated
message from the RMD and, assuming the reason for termination (expiration
of the inactivity timeout) is also conveyed, the RMS can deduce a better
value for its retransmission interval parameter for the next sequence,
and this cycle may continue until things settle. I don't
dispute that this is feasible, but the question I have is whether
such dynamic adjustment is a good idea for a high-level reliable
messaging protocol. Could we not arrive at a middle-ground
solution that would meet a large number of use cases? In that regard,
I believe that it is sufficient to have the RMD specify its
InactivityTimeout and AcknowledgementInterval parameters. With this, the RMS
can easily infer appropriate values for its internal parameters
(which need not be conveyed to the other side, so we don't have
to spec them). Just my 2 cents ...
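
One way such an inference could look, as a rough sketch. The rule (fit a
few retransmission attempts inside the advertised InactivityTimeout, but
never retransmit sooner than the AcknowledgementInterval), the use of
seconds, and all names are illustrative assumptions, not anything taken
from the spec:

```python
def choose_retransmission_interval(inactivity_timeout: float,
                                   ack_interval: float,
                                   attempts: int = 3) -> float:
    """Pick an RMS retransmission interval from RMD-advertised values.

    Hypothetical rule, not from the spec. All times are in seconds.
    """
    if inactivity_timeout <= 0 or ack_interval < 0 or attempts < 1:
        raise ValueError("timeouts must be positive and attempts >= 1")
    # Aim to fit `attempts` retransmissions inside the inactivity timeout.
    preferred = inactivity_timeout / attempts
    # Do not retransmit before the RMD has had a chance to acknowledge.
    interval = max(preferred, ack_interval)
    if interval >= inactivity_timeout:
        # Any retransmission would arrive after the RMD has given up.
        raise ValueError("no interval keeps the sequence alive")
    return interval


print(choose_retransmission_interval(3.0, 0.5))
```

The point of the sketch is only that the RMS side needs nothing beyond
the two advertised RMD values to stay inside the inactivity window.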

Thanks,
Sanjay

> -----Original Message-----
> From: Gilbert Pilz [mailto:Gilbert.Pilz@bea.com] 
> Sent: Tuesday, Oct 25, 2005 20:44 PM
> To: tom@coastin.com; vikas@sonoasystems.com
> Cc: Bob Freund; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] Issue i022, RM Assertions
> 
> It's easy to pick some values for which the protocol won't work:
> 
> InactivityTimeout = {no value}
> BaseRetransmissionInterval = {no value}
> 
> Since "no value" does not imply any default, it's possible that the RMS
> will settle on a retransmission interval of 5 seconds while the RMD
> decides to use an inactivity timeout of 3 seconds. The protocol will now
> work only if there are no lost messages. On the first lost message the
> RMS will wait 5 seconds before retransmitting, by which time the RMD will
> have terminated the sequence due to inactivity.
> 
> - g
> 
> > -----Original Message-----
> > From: Tom Rutt [mailto:tom@coastin.com] 
> > Sent: Thursday, October 20, 2005 3:23 PM
> > To: vikas@sonoasystems.com
> > Cc: 'Bob Freund'; ws-rx@lists.oasis-open.org
> > Subject: Re: [ws-rx] Issue i022, RM Assertions
> > 
> > Vikas Deolaliker wrote:
> > 
> > My comments are inline:
> > 
> > > Bob,
> > >
> > > We are all pessimistic; that is why we are trying to add a layer of
> > > reliability on top of TCP.
> > >
> > TCP is under an HTTP request-response. However, if the TCP
> > connection goes down before the response is received, HTTP
> > has no way to recover. This is where WS-ReliableMessaging
> > comes into play.
> > 
> > > BTW, we are in agreement but looks like it is a violent one.
> > >
> > > I agree with you that these parameters are not static, and so an
> > > assertion mechanism is the wrong way to introduce them into any
> > > reliable exchange system. They should instead be introduced in the
> > > mechanisms of the system that are dynamic. So ideally this should be
> > > part of the protocol.
> > >
> > The protocol works regardless of the parameter values.
> > 
> > As long as the RMS retransmits until it gets an ack response
> > for a message, the protocol will work.
> > 
> > Tom Rutt
> > 
> > > If we agree to the above, again I agree with you that the two ends
> > > cannot declare these parameters statically but must continuously
> > > adjust them based on traffic pattern. RFC 1323 is a good starting
> > > point, but the mechanisms that we borrow from it should be part of
> > > the core protocol.
> > >
> > > So I guess I am agreeing with you on everything, but I get the
> > > feeling people think our views are divergent.
> > >
> > > Vikas
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > *Sent:* Thursday, October 20, 2005 1:42 PM
> > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > >
> > > Vikas,
> > >
> > > Are you optimistic or pessimistic?
> > >
> > > If you are optimistic, and you expect that messages will usually be
> > > received and acknowledgements not lost, then you might consider that
> > > re-transmissions are part of error recovery.
> > >
> > > Examples of optimistic systems might include non-blocking crossbar
> > > connected multiprocessors, or even what we have come to expect from
> > > normal high-speed wired network connectivity. Examples of pessimistic
> > > systems include most radio-based communications.
> > >
> > > A pessimistic system might operate with a high re-try ratio.
> > >
> > > It is also true that the optimal parameters may change during the 
> > > duration of a connection.
> > >
> > > The problem is that nobody knows, in the general case, the nature of
> > > the stuff between sender and receiver. This implies that what is
> > > needed is a mechanism that can cover the range of possibilities.
> > >
> > > As far as I know, no static parameter set can cover these
> > > eventualities. That is why I propose that a mechanism similar to that
> > > used in RFC 1323 is indicated. In that mechanism, the parameters are
> > > learned through a moving average based on actual measured response
> > > times.
> > >
> > > Other systems, such as my crossbar-interconnected multiprocessors,
> > > would possibly use a hardware-assisted mechanism.
> > >
> > > Thanks
> > >
> > > -bob
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > *Sent:* Thursday, October 20, 2005 4:13 PM
> > > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > >
> > > Bob,
> > >
> > > I realize it is not like the flow control done in the lower layers
> > > of transports.
> > >
> > > But nonetheless it is a control of flow, because when you implement
> > > it, it affects the exchange of messages between RMS and RMD. And when
> > > it goes wrong, you compromise reliable exchange, which is the purpose
> > > of this spec.
> > >
> > > Vikas
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > *Sent:* Thursday, October 20, 2005 12:13 PM
> > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > >
> > > Vikas,
> > >
> > > I don't think that these parameters have much to do with traditional
> > > flow control, just the re-try behaviors of each end.
> > >
> > > The delay times and intervals are not interoperability concerns as
> > > much as they are path and performance optimizations.
> > >
> > > Flow control, if I understand you correctly, is something like the
> > > SDLC mechanism of RR/RNR, whereby the sender could ask whether the
> > > receiver was ready to receive a message or not. This tends to be part
> > > of much lower-level protocols involving physical transport. The spec
> > > assumes that there is some sort of transport underneath that has the
> > > responsibility of managing what one might call (to coin a term) the
> > > link layer or transport layer.
> > >
> > > This protocol depends on retry to achieve reliability, but the timing
> > > characteristics only have the impact of swamping the channel if too
> > > short and reducing error recovery, and thus performance, if too long.
> > >
> > > Even with negotiated algorithms, it is not predictable what the error
> > > rates on the channels might be. Oftentimes, errors are bursty, and
> > > what works very well under normal circumstances will fail during a
> > > burst. This is what we applied in the discussion leading to RFC 1323.
> > >
> > > No, I believe that the parameters BaseRetransmission,
> > > ExponentialBackoff and AcknowledgementInterval all need to be
> > > removed, but a discussion of retransmission should be added to the
> > > base spec; otherwise, we don't have a reliability protocol.
> > >
> > > Thanks
> > >
> > > -bob
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > *Sent:* Thursday, October 20, 2005 1:58 PM
> > > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > >
> > > Bob,
> > >
> > > I agree with you insofar as implementers will implement flow control
> > > in ways suitable for them. Ideally, what is needed is a mechanism for
> > > the RMS and RMD to negotiate and agree upon a flow control algorithm.
> > > Part of this negotiation would entail an exchange of schema related
> > > to the parameters necessary to follow the algorithm. Should such a
> > > mechanism be created by this WG, it should then be part of the core
> > > reliable messaging protocol and not in the assertions model.
> > >
> > > Vikas
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > *Sent:* Thursday, September 22, 2005 7:26 AM
> > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > >
> > > Retransmission parameters as well as algorithms are problematic for
> > > the following reasons:
> > >
> > > 1) The characteristics of the path from source to destination are
> > > often unknown and often are time-variant.
> > >
> > > 2) Retransmissions, if too frequent, cause flooding and potential
> > > catastrophic degradation if the path is near saturation.
> > >
> > > 3) The path may consist not only of transmission means, but also of
> > > intermediaries with attendant processing delays.
> > >
> > > 4) Exponential backoff may be implemented many ways; there is more
> > > than one algorithm and they have different parameters.
> > >
> > > 5) Backoff algorithm selection may be implementation specific; what
> > > is good for cell phones may not be good for cluster-interconnected
> > > nodes.
> > >
> > > 6) I have found no theoretical modeling available of the case of web
> > > services cum intermediaries.
> > >
> > > 7) Most published data concerning the behavior of backoff algorithms
> > > examine fairly simple network-segment-related saturation and do not
> > > address client or server saturation, let alone intermediary
> > > saturation.
> > >
> > > 8) Exponential backoff algorithms need a recovery mechanism for those
> > > situations where there is a high standard deviation of delay.
> > >
> > > 9) TCP/IP experience has shown that efficiencies are improved with an
> > > adaptive mechanism as described in TCP Extensions for High
> > > Performance (see RFC 1323 RTTM).
> > >
> > > Proposal:
> > >
> > > Clearly a backoff mechanism is required; however, implementation
> > > specific needs are not served well by the selection of any specific
> > > algorithm for all potential implementations of this specification. It
> > > is recommended that implementers utilizing IP-based transmission media
> > > consider the mechanism described in RFC 1323. Delete all
> > > re-transmission parameters as described in the specification, since
> > > they are unnecessary and unhelpful should the implementer use an
> > > algorithm with a different set of controls.
> > >
> > > Thanks
> > >
> > > -bob
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > *Sent:* Thursday, August 04, 2005 9:53 AM
> > > *To:* ws-rx@lists.oasis-open.org
> > > *Subject:* [ws-rx] Issue i022, RM Assertions
> > >
> > > Description:
> > >
> > > (revised)
> > >
> > > The RM policy assertions, specifically the InActivityTimeout,
> > > BaseRetransmissionInterval and ExponentialBackoff parameters, need to
> > > be more finely specified.
> > >
> > > The following are the areas which need finer specification
> > >
> > > a) Default Value for InActivityTimeout, BaseRetransmissionInterval and
> > > ExponentialBackoff:
> > >
> > > There needs to be a default set for these parameters. Currently the
> > > specification says "If omitted, there is no implied value." Since
> > > these parameters dictate the delivery of the message, an
> > > implementation is going to assume a default anyway. Not specifying
> > > one will lead different implementations to assume different default
> > > values and cause unwanted timeouts.
> > >
> > > b) Definition of InActivity
> > >
> > > There needs to be a discussion of the definition of inactivity. If
> > > the RMS sends a sequence to the RMD and is waiting for a response
> > > which is delayed for whatever reason, is that inactivity on the link
> > > between RMS and RMD counted towards InActivityTimeout? If yes, then
> > > it is entirely possible that while waiting for a sequence response,
> > > the RMS could time out due to inactivity.
> > >
> > > c) Applicability of InActivityTimeout:
> > >
> > > It needs to be specified to which end this parameter is applicable.
> > > It seems like the sequence creator starts the timer for
> > > InActivityTimeout. If the intention is that this timer exists on both
> > > ends of a sender and receiver engaged in an RM sequence, we need to
> > > define a method for synchronizing the timer value of this parameter
> > > between them. For example, a KeepAlive message would need to be
> > > defined for keeping the sequence alive.
> > >
> > > d) Corner Case Handling:
> > >
> > > There needs to be a discussion of the corner case when the
> > > BaseRetransmissionInterval exceeds InActivityTimeout. This can happen
> > > when the RMD is indisposed and ExponentialBackoff drives up the value
> > > of BaseRetransmissionInterval. In this case my retransmission is
> > > scheduled later than the timeout that I need to abide by. What state
> > > does the RMS enter in this situation?
> > >
> > > e) BaseRetransmissionInterval Needs an Upper Bound:
> > >
> > > If an RMD is offline for an extended period of time, one can expect
> > > the BaseRetransmissionInterval to be exponentially backed off, i.e.
> > > become large enough to be not meaningful anymore. Having an upper
> > > bound on this parameter will enable the RMS to stop retransmitting
> > > and report a fault.
> > >
> > > Proposal:
> > >
> > > (revised)
> > >
> > > 1) InActivityTimeout and BaseRetransmissionInterval can be merged into
> > > one, i.e. BaseRetransmissionTimeout. Having just one counter on the
> > > RMS and RMD will reduce the run-time resources (much simpler state
> > > machine) required to implement RM-Assertions and avoid confusion
> > > (unknown states in the state machine) caused by two timeouts. Having
> > > separate timeouts for sequence and retransmission may not be
> > > necessary, as activity on the RM link is transmission/retransmission.
> > > I believe one timeout, i.e. BaseRetransmissionTimeout, does not change
> > > the behavior of the system. Once this timeout occurs the sequence has
> > > to time out, as the implication of the timeout is that the
> > > destination is either congested or offline.
> > >
> > > 2) If InActivityTimeout has to be there as a parameter, we need to
> > > fully specify it with mechanisms for synchronization and keepalive. In
> > > addition, we need to discuss how the corner cases and other conflicts
> > > that occur when one has two timeouts (as discussed in a-e above) are
> > > handled.
> > >
> > > Vikas
> > >
> > > Sonoa Systems, Inc.
> > >
> > > 3900 Freedom Circle, Suite #101
> > >
> > > Santa Clara, CA 95054
> > >
> > > (408) 748-1730 x100
> > >
> > 
> > 
> > --
> > ----------------------------------------------------
> > Tom Rutt	email: tom@coastin.com; trutt@us.fujitsu.com
> > Tel: +1 732 801 5744          Fax: +1 732 774 5133
> > 
> > 
> > 
> 

