Subject: Re: Guaranteed duplicate elimination vs. upper bound on delays
Comments below.

Regards,
Marty

*************************************************************************************
Martin W. Sachs
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287; IBM tie line 863-7287
Notes address: Martin W Sachs/Watson/IBM
Internet address: mwsachs @ us.ibm.com
*************************************************************************************

Dan Weinreb <dlw@exceloncorp.com> on 08/15/2001 01:28:24 AM

Please respond to "Dan Weinreb" <dlw@exceloncorp.com>

To: Martin W Sachs/Watson/IBM@IBMUS
cc: ebxml-msg@lists.oasis-open.org
Subject: Re: Guaranteed duplicate elimination vs. upper bound on delays

   Date: Mon, 13 Aug 2001 16:38:12 -0400
   From: Martin W Sachs <mwsachs@us.ibm.com>

   As I recall, there is a time to live associated with each IP packet,
   which helps TCP manage these things. I agree that a message service
   time to live would help by killing the really long-delayed messages.

MWS: I agree with the paragraphs above and below. But the basic idea is the same. Get rid of the stuff that has been hanging around too long before it causes trouble.

Actually I think that in real life, the time-to-live field in the IP packet isn't really used as a measure of real time, but as a hop count, decremented by each router, mainly in order to get rid of looping packets that can arise during unusual circumstances, such as when network traffic is proceeding while the routers are changing their configurations/tables/etc.

The story on how TCP does this, unfortunately, appears to be complicated. See http://www.lcg.org/sock-faq/detail.php3?id=13, and also RFC 1337 and especially the Appendix to RFC 1185 (search for "The scheme finally adopted for TCP combines features of both these proposals. TCP uses three mechanisms:"). The key thing seems to be the TIME_WAIT state.
The following quotation is from the sock-faq, from Richard Stevens, who, as they say, "wrote the book" on TCP (several books actually):

   The reason that the duration of the TIME_WAIT state is 2*MSL is
   that the maximum amount of time a packet can wander around a
   network is assumed to be MSL seconds. The factor of 2 is for the
   round-trip. The recommended value for MSL is 120 seconds, but
   Berkeley-derived implementations normally use 30 seconds
   instead. This means a TIME_WAIT delay between 1 and 4
   minutes. Solaris 2.x does indeed use the recommended MSL of 120
   seconds.

As far as I can tell the 120 seconds is arbitrary and has nothing to do with the IP time-to-live feature.

Unfortunately for us, we are not dealing with IP routers but potentially with store-and-forward mailers, which might accept and store a message, and then suffer a head crash requiring spare parts that might not arrive for weeks, especially if the poor high-tech company is on credit hold and the MIS guy is on vacation, and then it might come back up and finally forward the message a month later.

MWS: That means that if the application cares, it also needs a time to live. That should be a parameter of the BPSS spec if it isn't already.

Including time-to-live (persist duration) helps by allowing the MSH to clear out earlier the junk that arrives too late for MSH duplicate detection, but I'm beginning to think that if that's its main purpose, persistDuration must be supplied by the application.

-- Dan
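
To make the idea in this thread concrete, here is a minimal sketch of how a per-message time to live lets a receiver bound its duplicate-detection memory. It is not taken from the ebXML Message Service specification: the names Message, DuplicateFilter, expires_at and accept are hypothetical, and the sketch assumes sender and receiver clocks agree closely enough for an absolute expiry time to be meaningful.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    expires_at: float  # hypothetical absolute time to live, in seconds


@dataclass
class DuplicateFilter:
    # message_id -> time after which the ID may safely be forgotten
    seen: dict = field(default_factory=dict)

    def purge(self, now: float) -> None:
        # Forget IDs of messages that can no longer be accepted anyway.
        for mid in [m for m, t in self.seen.items() if now >= t]:
            del self.seen[mid]

    def accept(self, msg: Message, now: float) -> str:
        self.purge(now)
        if now >= msg.expires_at:
            # A copy delayed past its time to live is dropped outright,
            # so its ID does not need to be remembered any longer.
            return "expired"
        if msg.message_id in self.seen:
            return "duplicate"
        # Remember the ID only until the message itself expires.
        self.seen[msg.message_id] = msg.expires_at
        return "delivered"
```

A copy that a store-and-forward hop releases a month late falls into the "expired" branch, which is why the ID table can be purged without reopening the door to duplicates. Without an expiry on the message, the receiver would have to remember every ID indefinitely to give the same guarantee.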