ws-rx message

Subject: RE: [ws-rx] Issue i022, RM Assertions
From: "Gilbert Pilz" <Gilbert.Pilz@bea.com>
To: "Patil, Sanjay" <sanjay.patil@sap.com>, <tom@coastin.com>, <vikas@sonoasystems.com>
Date: Wed, 26 Oct 2005 11:45:16 -0700
I agree. I was always a bit weirded out by the idea of the WSDL (which I
consider to be primarily a description of the *service*) specifying
aspects of the client's behavior, like the retransmission interval.

- g

> -----Original Message-----
> From: Patil, Sanjay [mailto:sanjay.patil@sap.com] 
> Sent: Wednesday, October 26, 2005 9:22 AM
> To: Gilbert Pilz; tom@coastin.com; vikas@sonoasystems.com
> Cc: Bob Freund; ws-rx@lists.oasis-open.org
> Subject: RE: [ws-rx] Issue i022, RM Assertions
> 
> 
> If you would allow me to tease this issue a little further ...
> 
> In the scenario described below, I am not sure if it is 
> correct to say that the protocol has not worked. It seems to 
> have worked in the sense that it did not enter into an 
> indeterminate state. Sure, it could have worked better by 
> optimizing the parameter values, but that is not the same as 
> the failure of the protocol.
> 
> One might argue that the RMD and RMS can learn and adapt 
> dynamically their parameter values to optimize the protocol 
> behavior. For instance, the RMS in the following scenario 
> might get a sequence terminated message from the RMD and, 
> assuming the reason for termination (expiration of the 
> inactivity timeout) is also conveyed, the RMS can deduce a 
> better value for its retransmission interval parameter for 
> the next sequence, and this cycle may continue until things 
> settle. I don't dispute that this is 
> feasible, but the question I have is whether such dynamic 
> adjustment is a good idea for a high-level reliable messaging 
> protocol. Could we not arrive at a middle-ground 
> solution that would meet a large number of use cases? In 
> that regard, I believe that it is sufficient to have RMD 
> specify its InactivityTimeout and AcknowledgementInterval 
> parameters. With this, RMS can easily infer the appropriate 
> values for its internal parameters (which are not needed to 
> be conveyed to the other side, so we don't have to spec 
> them). Just my 2 cents ...
> 
> Thanks,
> Sanjay
> 
> > -----Original Message-----
> > From: Gilbert Pilz [mailto:Gilbert.Pilz@bea.com]
> > Sent: Tuesday, Oct 25, 2005 20:44 PM
> > To: tom@coastin.com; vikas@sonoasystems.com
> > Cc: Bob Freund; ws-rx@lists.oasis-open.org
> > Subject: RE: [ws-rx] Issue i022, RM Assertions
> > 
> > It's easy to pick some values for which the protocol won't work:
> > 
> > InactivityTimeout = {no value}
> > BaseRetransmissionInterval = {no value}
> > 
> > Since "no value" does not imply any default, it's possible that the RMS
> > will settle on a retransmission interval of 5 seconds while the RMD
> > decides to use an inactivity timeout of 3 seconds. The protocol will
> > now work only if there are no lost messages. On the first lost message
> > the RMS will wait 5 seconds before retransmitting, by which time the
> > RMD will have terminated the sequence due to inactivity.
> > 
> > - g
> > 
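The failure described above can be sketched in a few lines of Python. This
is only an illustration, not anything taken from the WS-RM specification:
the function name sequence_survives_loss and its parameters are
hypothetical, and it assumes the RMD measures inactivity from the last
message it received (t = 0) and that the RMS retransmits the lost message
exactly one retransmission interval later.

```python
def sequence_survives_loss(retransmission_interval: float,
                           inactivity_timeout: float) -> bool:
    """Return True if one lost message can be retransmitted before the
    RMD terminates the sequence for inactivity.

    Assumptions (illustrative only): the RMD last heard from the RMS at
    t = 0, the next message is lost, and the RMS retransmits it once at
    t = retransmission_interval.
    """
    retransmit_at = retransmission_interval
    rmd_terminates_at = inactivity_timeout
    # The retransmission only helps if it arrives before the RMD gives up.
    return retransmit_at < rmd_terminates_at


# The values from the scenario: 5 second retransmission, 3 second timeout.
print(sequence_survives_loss(5.0, 3.0))
# A retransmission interval shorter than the timeout avoids the problem.
print(sequence_survives_loss(2.0, 3.0))
```

With the values from the scenario the function returns False; any
retransmission interval below the inactivity timeout returns True.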
> > > -----Original Message-----
> > > From: Tom Rutt [mailto:tom@coastin.com]
> > > Sent: Thursday, October 20, 2005 3:23 PM
> > > To: vikas@sonoasystems.com
> > > Cc: 'Bob Freund'; ws-rx@lists.oasis-open.org
> > > Subject: Re: [ws-rx] Issue i022, RM Assertions
> > > 
> > > Vikas Deolaliker wrote:
> > > 
> > > My comments are inline:
> > > 
> > > > Bob,
> > > >
> > > > > We are all pessimistic; that is why we are trying to add a layer of
> > > > > reliability on top of TCP.
> > > >
> > > > TCP is under an HTTP request-response. However, if the TCP 
> > > > connection goes down before the response is received, HTTP has no 
> > > > way to recover. This is where WS Reliable Messaging comes into play.
> > > 
> > > > > BTW, we are in agreement, but it looks like it is a violent one.
> > > >
> > > > > I agree with you that these parameters are not static, and so the
> > > > > assertion mechanism is the wrong way to introduce them into any
> > > > > reliable exchange system. Where they should be introduced is in the
> > > > > mechanisms of the system which are dynamic. So ideally this should
> > > > > be part of the protocol.
> > > >
> > > The protocol works regardless of the parameter values.
> > > 
> > > > As long as the RMS re-transmits until it gets an ack response for a 
> > > > message, the protocol will work.
> > > 
> > > Tom Rutt
> > > 
> > > > > If we agree to the above, again I agree with you that the two ends
> > > > > cannot declare these parameters statically but must continuously
> > > > > adjust them based on traffic pattern. RFC 1323 is a good starting
> > > > > point, but the mechanisms that we borrow from it should be part of
> > > > > the core protocol.
> > > > >
> > > > > So I guess I am agreeing with you on everything, but I get the
> > > > > feeling people think our views are divergent.
> > > >
> > > > Vikas
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > > *Sent:* Thursday, October 20, 2005 1:42 PM
> > > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > > >
> > > > Vikas,
> > > >
> > > > Are you optimistic or pessimistic?
> > > >
> > > > > If you are optimistic, and you expect that messages will usually be
> > > > > received and acknowledgements not lost, then you might consider that
> > > > > re-transmissions are part of error recovery.
> > > > >
> > > > > Examples of optimistic systems might include non-blocking crossbar
> > > > > connected multiprocessors, or even what we have come to expect via
> > > > > normal high speed wired network connectivity. Examples of
> > > > > pessimistic systems include most radio based communications.
> > > > >
> > > > > A pessimistic system might operate with a high re-try ratio.
> > > > >
> > > > > It is also true that the optimal parameters may change during the
> > > > > duration of a connection.
> > > > >
> > > > > The problem is that nobody knows, in the general case, what is the
> > > > > nature of the stuff between sender and receiver. This implies that
> > > > > what is needed is a mechanism that can cover the range of
> > > > > possibilities.
> > > > >
> > > > > As far as I know, no static parameter set can cover these
> > > > > eventualities. That is why I propose that a mechanism similar to that
> > > > > which is used in RFC 1323 is indicated. In that mechanism, the
> > > > > parameters are learned through a moving average mechanism based on
> > > > > actual measured response times.
> > > > >
> > > > > Other systems, such as my cross-bar interconnected multiprocessors,
> > > > > would possibly use a hardware assisted mechanism.
> > > >
> > > > Thanks
> > > >
> > > > -bob
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > > *Sent:* Thursday, October 20, 2005 4:13 PM
> > > > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > > >
> > > > Bob,
> > > >
> > > > > I realize it is not like the flow control done in the lower layers
> > > > > of transports.
> > > > >
> > > > > But nonetheless it is a control of flow, because when you implement
> > > > > it, it affects the exchange of messages between RMS and RMD. And when
> > > > > it goes wrong, you compromise reliable exchange, which is the purpose
> > > > > of this spec.
> > > >
> > > > Vikas
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > > *Sent:* Thursday, October 20, 2005 12:13 PM
> > > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > > >
> > > > Vikas,
> > > >
> > > > > I don't think that these parameters have much to do with traditional
> > > > > flow control, just the re-try behaviors of each end.
> > > > >
> > > > > The delay times and intervals are not interoperability concerns as
> > > > > much as they are path and performance optimizations.
> > > > >
> > > > > Flow control, if I understand you correctly, is something like the
> > > > > SDLC mechanism of RR/RNR whereby the sender could ask if the receiver
> > > > > was ready to receive a message or not. This tends to be part of much
> > > > > lower level protocols involving physical transport. The spec assumes
> > > > > that there is some sort of transport underneath that has the
> > > > > responsibility of managing what one might call (to coin a term) the
> > > > > link layer or transport layer.
> > > > >
> > > > > This protocol depends on retry to achieve reliability, but the timing
> > > > > characteristics only have the impact of swamping the channel if too
> > > > > short and reducing error recovery and thus performance if too long.
> > > > >
> > > > > Even with negotiated algorithms, it is not predictable what the error
> > > > > rates on the channels might be. Oftentimes, errors are bursty and what
> > > > > works very well under normal circumstances will fail during a burst.
> > > > > This is what we applied in the discussion leading to RFC 1323.
> > > > >
> > > > > No, I believe that the parameters BaseRetransmission,
> > > > > ExponentialBackoff and AcknowledgementInterval all need to be removed,
> > > > > but a discussion of retransmission should be added to the base spec,
> > > > > otherwise, we don't have a reliability protocol.
> > > >
> > > > Thanks
> > > >
> > > > -bob
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > > *Sent:* Thursday, October 20, 2005 1:58 PM
> > > > *To:* Bob Freund; ws-rx@lists.oasis-open.org
> > > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > > >
> > > > Bob,
> > > >
> > > > > I agree with you in so far as implementers will implement flow control
> > > > > in ways suitable for them. Ideally, what is needed is a mechanism for
> > > > > the RMS and RMD to negotiate and agree upon a flow control algorithm.
> > > > > Part of this negotiation would entail exchange of schema related to
> > > > > the parameters necessary to follow the algorithm. Should such a
> > > > > mechanism be created by this WG, it should then be part of the core
> > > > > reliable messaging protocol and not in the assertions model.
> > > >
> > > > Vikas
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Bob Freund [mailto:bob.freund@hitachisoftware.com]
> > > > *Sent:* Thursday, September 22, 2005 7:26 AM
> > > > *To:* vikas@sonoasystems.com; ws-rx@lists.oasis-open.org
> > > > *Subject:* RE: [ws-rx] Issue i022, RM Assertions
> > > >
> > > > > Retransmission parameters as well as algorithms are problematic for
> > > > > the following reasons:
> > > > >
> > > > > 1) The characteristics of the path from source to destination are
> > > > > often unknown and often are time-variant.
> > > > >
> > > > > 2) Retransmissions, if too frequent, cause flooding and potential
> > > > > catastrophic degradation if the path is near saturation.
> > > > >
> > > > > 3) The path may consist of not only transmission means, but also
> > > > > intermediaries with attendant processing delays.
> > > > >
> > > > > 4) Exponential backoff may be implemented many ways; there is more
> > > > > than one algorithm and they have different parameters.
> > > > >
> > > > > 5) Backoff algorithm selection may be implementation specific; what is
> > > > > good for cell phones may not be good for cluster interconnected nodes.
> > > > >
> > > > > 6) I have found no theoretical modeling available of the case of web
> > > > > services cum intermediaries.
> > > > >
> > > > > 7) Most published data concerning the behavior of backoff algorithms
> > > > > examine fairly simple network segment related saturation and do not
> > > > > address client, server, let alone intermediary saturation.
> > > > >
> > > > > 8) Exponential backoff algorithms need a recovery mechanism for those
> > > > > situations where there is a high standard deviation of delay.
> > > > >
> > > > > 9) TCP/IP experience has shown that efficiencies are improved with an
> > > > > adaptive mechanism as described in TCP Extensions for High Performance
> > > > > (see RFC 1323 RTTM).
> > > > >
> > > > > Proposal:
> > > > >
> > > > > Clearly a backoff mechanism is required; however, implementation
> > > > > specific needs are not served well by the selection of any specific
> > > > > algorithm for all potential implementations of this specification. It
> > > > > is recommended that implementers utilizing IP based transmission media
> > > > > consider the mechanism described in RFC 1323. Delete all
> > > > > re-transmission parameters as described in the specification since
> > > > > they are unnecessary and unhelpful should the implementer use an
> > > > > algorithm with a different set of controls.
> > > >
> > > > Thanks
> > > >
> > > > -bob
> > > >
> > > > 
> > > 
> > 
> ----------------------------------------------------------------------
> > > > --
> > > >
> > > > *From:* Vikas Deolaliker [mailto:vikas@sonoasystems.com]
> > > > *Sent:* Thursday, August 04, 2005 9:53 AM
> > > > *To:* ws-rx@lists.oasis-open.org
> > > > *Subject:* [ws-rx] Issue i022, RM Assertions
> > > >
> > > > Description:
> > > >
> > > > (revised)
> > > >
> > > > > The RM policy assertions, specifically the InActivityTimeout,
> > > > > BaseRetransmissionInterval and ExponentialBackoff parameters, need to
> > > > > be more finely specified.
> > > > >
> > > > > The following are the areas which need finer specification:
> > > > >
> > > > > a) Default Value for InActivityTimeout, BaseRetransmissionInterval and
> > > > > ExponentialBackoff:
> > > > >
> > > > > There needs to be a default set for these parameters. Currently the
> > > > > specification says "If omitted, there is no implied value." Since
> > > > > these parameters dictate the delivery of the message, an
> > > > > implementation is going to assume a default anyway. Not specifying
> > > > > this will make implementations assume different default values and
> > > > > cause unwanted timeouts.
> > > > >
> > > > > b) Definition of InActivity:
> > > > >
> > > > > There needs to be a discussion of the definition of inactivity. If the
> > > > > RMS sends a sequence to the RMD and is waiting for the response, which
> > > > > is delayed for whatever reason, is that inactivity on the link between
> > > > > RMS and RMD counted towards InActivityTimeout? If yes, then it is
> > > > > entirely possible that while waiting for a sequence response, the RMS
> > > > > could time out due to InActivity.
> > > > >
> > > > > c) Applicability of InActivityTimeout:
> > > > >
> > > > > It needs to be specified to which end this parameter is applicable. It
> > > > > seems like the sequence creator starts the timer for
> > > > > InActivityTimeout. If the intention is that this timer exists on both
> > > > > ends of a sender and receiver engaged in an RM sequence, we need to
> > > > > define a method for synchronization of the timer value of this
> > > > > parameter between them. For example, a KeepAlive message would need to
> > > > > be defined for keeping the sequence alive.
> > > > >
> > > > > d) Corner Case Handling:
> > > > >
> > > > > There needs to be a discussion of the corner case when the
> > > > > BaseRetransmissionInterval exceeds InActivityTimeout. This can happen
> > > > > when the RMD is indisposed and ExponentialBackoff drives up the value
> > > > > of BaseRetransmissionInterval. In this case my retransmission is
> > > > > scheduled later than the timeout that I need to abide by. What state
> > > > > does the RMS enter in this situation?
> > > > >
> > > > > e) BaseRetransmissionInterval Needs an Upper Bound:
> > > > >
> > > > > If an RMD is offline for an extended period of time, one can expect
> > > > > the BaseRetransmissionInterval to be exponentially backed off, i.e.
> > > > > become large enough to be not meaningful anymore. Having an upper
> > > > > bound on this parameter will enable the RMS to stop retransmitting
> > > > > and report a fault.
> > > >
> > > > Proposal:
> > > >
> > > > (revised)
> > > >
> > > > > 1) InActivityTimeout and BaseRetransmissionInterval can be merged into
> > > > > one, i.e. BaseRetransmissionTimeout. Having just one counter on the
> > > > > RMS and RMD will reduce the run-time resources (much simpler state
> > > > > machine) required to implement RM-Assertions and avoid confusion
> > > > > (unknown states in state machine) caused by two timeouts. Having a
> > > > > separate timeout for sequence and retransmission may not be necessary
> > > > > as activity on the RM link is transmission/retransmission. I believe
> > > > > one timeout, i.e. BaseRetransmissionTimeout, does not change the
> > > > > behavior of the system. Once this timeout occurs the sequence has to
> > > > > time out, as the implication of the timeout is that the destination
> > > > > is either congested or offline.
> > > > >
> > > > > 2) If InActivityTimeout has to be there as a parameter, we need to
> > > > > fully specify it with mechanisms for synchronization and keepalive. In
> > > > > addition, we need to discuss how the corner cases and other conflicts
> > > > > that occur when one has two timeouts (as discussed in a-e above) are
> > > > > handled.
> > > >
> > > > Vikas
> > > >
> > > > Sonoa Systems, Inc.
> > > >
> > > > 3900 Freedom Circle, Suite #101
> > > >
> > > > Santa Clara, CA 95054
> > > >
> > > > (408) 748-1730 x100
> > > >
> > > 
> > > 
> > > --
> > > ----------------------------------------------------
> > > Tom Rutt	email: tom@coastin.com; trutt@us.fujitsu.com
> > > Tel: +1 732 801 5744          Fax: +1 732 774 5133
> > > 
> > > 
> > > 
> > 
>
Follow-Ups:
- Re: [ws-rx] Issue i022, RM Assertions
  - From: Anish Karmarkar <Anish.Karmarkar@oracle.com>