ebxml-msg message

Subject: Re: T2 Retry with Delivery Receipt
From: christopher ferris <chris.ferris@Sun.COM>
To: Martin W Sachs <mwsachs@us.ibm.com>
Date: Thu, 20 Sep 2001 10:54:44 -0400
+1

TCP, barring a MITM attack, ensures that there is lossless
transmission of the message between the TCP endpoints,
or the message is not "received".

Cheers,

Chris

Martin W Sachs wrote:
> 
> You are still postulating that there are transmission errors that won't be
> caught by either TCP or the underlying physical transport.  You need to
> make a convincing case that they can happen.
> 
> Regards,
> Marty
> 
> *************************************************************************************
> 
> Martin W. Sachs
> IBM T. J. Watson Research Center
> P. O. B. 704
> Yorktown Hts, NY 10598
> 914-784-7287;  IBM tie line 863-7287
> Notes address:  Martin W Sachs/Watson/IBM
> Internet address:  mwsachs @ us.ibm.com
> *************************************************************************************
> 
> "David Fischer" <david@drummondgroup.com> on 09/19/2001 10:43:42 PM
> 
> To:   Martin W Sachs/Watson/IBM@IBMUS
> cc:   "Dan Weinreb" <dlw@exceloncorp.com>, <ebxml-msg@lists.oasis-open.org>
> Subject:  RE: T2 Retry with Delivery Receipt
> 
> Actually NRR is exactly for making sure that the message arrived intact,
> either as a protection from transmission failures or from security breaches
> (e.g. man-in-the-middle attack).  It might be pretty bad if I ordered 2
> items and there was a transmission failure and it got changed to 1,000,002
> (actually it would be more binary than that).  The signature assures the To
> Party that it did not change and NRR assures back to the From Party that it
> did not change (round trip).
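
A minimal sketch of the round-trip check described above, assuming a
bare SHA-256 digest stands in for the signed receipt.  The function
names and the sample order are made up for illustration.  A real NRR
would carry the digest inside a digital signature; an unsigned digest
like this one gives neither non-repudiation nor protection against a
man-in-the-middle.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Return the SHA-256 hex digest of the payload bytes."""
    return hashlib.sha256(payload).hexdigest()


def make_receipt(received_payload: bytes) -> dict:
    # To Party: report the digest of what actually arrived.
    return {"digest": digest(received_payload)}


def receipt_matches(sent_payload: bytes, receipt: dict) -> bool:
    # From Party: compare the returned digest with the digest of what was sent.
    return receipt["digest"] == digest(sent_payload)


# Hypothetical order payloads: the original and a copy altered in transit.
sent = b"<Order><Quantity>2</Quantity></Order>"
altered = b"<Order><Quantity>1000002</Quantity></Order>"

print(receipt_matches(sent, make_receipt(sent)))     # True
print(receipt_matches(sent, make_receipt(altered)))  # False
```
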
> 
> David Fischer
> Drummond Group.
> 
> -----Original Message-----
> From: Martin W Sachs [mailto:mwsachs@us.ibm.com]
> Sent: Wednesday, September 19, 2001 9:26 PM
> To: David Fischer
> Cc: Dan Weinreb; ebxml-msg@lists.oasis-open.org
> Subject: RE: T2 Retry with Delivery Receipt
> 
> NRR is a non-repudiation function.  It is not intended as a transmission
> error detector.  Someone will have to make a convincing case that TCP is
> not sufficient for detecting transmission errors.
> 
> Regards,
> Marty
> 
> David Fischer <david@drummondgroup.com> on 09/19/2001 10:46:21 AM
> 
> To:   Dan Weinreb <dlw@exceloncorp.com>
> cc:   ebxml-msg@lists.oasis-open.org
> Subject:  RE: T2 Retry with Delivery Receipt
> 
> Very good.  I agree with most of it.
> 
> One comment about check-sums.  We already have a transmission
> error-catching mechanism called NRR.
> 
> On the whole, I think this is very good.  The point is that there are some
> scenarios which would require a retry.  But I prefer to phrase the question
> differently -- why would an IM *ever* stop a retry from the end?  It is not
> the job of an IM to tell the ends what they may or may not send.  Since it is
> an easy thing to differentiate between an IM retry and an end retry, it's not
> "why would we" but rather "why wouldn't we"?
> 
> This may all be moot since built-in problems with multi-hop (like not
> allowing end-to-end retries) could force IMs out of the picture.  I'd rather
> not, but . . . ?
> 
> Regards,
> 
> David Fischer
> Drummond Group.
> 
> -----Original Message-----
> From: Dan Weinreb [mailto:dlw@exceloncorp.com]
> Sent: Wednesday, September 19, 2001 9:16 AM
> To: david@drummondgroup.com
> Cc: Chris.Ferris@sun.com; ebxml-msg@lists.oasis-open.org
> Subject: Re: T2 Retry with Delivery Receipt
> 
>    Date: Tue, 18 Sep 2001 15:34:58 -0500
>    From: David Fischer <david@drummondgroup.com>
> 
> <rhetoric-mode>
> 
>    "I don't want to" is not a valid reason.  "It's too complicated" is almost
>    as bad (how hard is it to concatenate two strings?).  We can allow retries,
>    Chris just doesn't want to.  Why?
> 
> The reason is "It wouldn't do any good".
> 
> If the reason the message didn't get through is that the (unreliable)
> transport layer dropped it, the regular ("hop-to-hop") retry mechanism
> exists to deal with that problem.  There is no need to impose a second
> retry mechanism on top of the first one: or, if there is, then there
> is also a need for a third and fourth layer and so on.
> 
> You said:
> 
>    <df>retries do not guarantee success and never will.  The question is what
>    to do when those failures occur.</df>
> 
> But what are you saying we should do?  You seem to be saying that we
> should retry some more.
> 
> </rhetoric-mode>
> 
> OK, OK, you're not really saying that.  And I don't really believe
> that they don't do any good under any scenarios.  I think the case for
> end-to-end retry should be made by clearly stating the scenarios where
> end-to-end retry adds value that hop-to-hop retry does not.
> 
> Let's consider why retrying the *same* message (same message ID, same
> digital signature, same contents, just as you say, everything the same
> except certain fields that are specific to the hop-to-hop layer of
> communication) *ever* does *any* good.  If it failed the first time,
> why won't it just keep on failing and failing?  I can see two
> categories of reason:
> 
> (1) There are *random* *transient* failures that happen often enough
> to worry about.  Simply trying again has a good chance of succeeding.
> 
> (2) Something in the external environment changes before the retry.
> I think that's what you had in mind when you said "it might be manual"
> and "It might be now or after a fix."
> 
> The "unreliable IM" is an example of (1) that isn't handled by
> hop-to-hop retry and would be handled by automatic, right-now
> end-to-end retry.  It's still not clear that a convincing use
> case for this has been presented.
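
To make category (1) concrete, here is a minimal sketch of a bounded
retry loop that resends the *same* message until a transient failure
clears.  The names (TransientError, send_with_retries, make_flaky_hop)
are invented for illustration and come from no specification.  A real
MSH would also wait for some retry interval between attempts, which is
left out here.

```python
from typing import Callable


class TransientError(Exception):
    """A failure that may go away if the send is simply tried again."""


def send_with_retries(send: Callable[[bytes], None],
                      message: bytes, retries: int) -> int:
    """Send once, then retry up to `retries` more times on transient failure.

    Returns the number of attempts used; re-raises after the last failure.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(message)
            return attempts
        except TransientError:
            if attempts > retries:
                raise


def make_flaky_hop(failures_before_success: int) -> Callable[[bytes], None]:
    """Build a fake hop that drops the first N sends, then accepts."""
    remaining = [failures_before_success]

    def send(message: bytes) -> None:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError("hop dropped the message")

    return send


# Two dropped sends, then success on the third attempt.
print(send_with_retries(make_flaky_hop(2), b"same message", retries=3))  # 3
```

Whether this loop runs hop-to-hop or end-to-end is exactly the open
question; the loop itself looks the same at either level.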
> 
> What are the scenarios in which (2) provides the justification for the
> retry?  David F, you presented some "example use cases", but some of
> them aren't what we need as scenarios, because they are effects rather
> than causes, e.g. "a DFN sent" or "an Error Message sent".  What I
> think of as a "scenario" has to explain why they were sent: what
> actually went wrong in the first place?
> 
> So let me try some scenarios.  I think scenarios break down into two
> categories: those in which the From party gets some kind of negative
> reply, and those where the From party times out.
> 
> Suppose I send a purchase order to Staples and I digitally sign it
> with a private key, and in the ds:keyInfo I send a certificate with
> the corresponding public key, but unfortunately this certificate
> expired a few days ago.  The To Party sees that the certificate has
> expired, so the digital signature is no good, so it rejects the
> message.  Automatic retries are clearly pointless.  The From people
> could transmit a new certificate out-of-band to the To people and tell
> them to force their MSH to use the new certificate on the existing
> message, but this seems kind of implausible for various reasons.  Or
> the From side could obtain a new certificate, and then send the
> message with the new certificate.  But then it's not the same message,
> as defined above.  Should it have the same messageID?  (I don't have
> an answer to this.)
> 
> Suppose Staples changes its address.  I sent a purchase order to
> Staples, and the CPA says to use HTTP to www.staples.com, and upon
> trying that I get an HTTP 404 (no such URL), or even a DNS error
> ("there's no such host name as www.staples.com").  Automatic retries
> do no good.  But if administrators at the From host install a new CPA,
> then retrying the exact same message could succeed.
> 
> Suppose Staples's MSH machine has run out of disk space and rejects
> the incoming message.  Automatic retries could solve this, by simply
> retrying until ordinary work frees up disk space, or the
> administrators at Staples add a new disk.  On the other hand, the
> hop-to-hop retry mechanism could do that just as well.  But this
> brings up a question as to when retries ought to time out.  You could
> say that knowing when to really give up sometimes requires manual
> intervention or special knowledge; no simple time-interval value in a
> CPA can substitute for intelligent ways of deciding how long to retry.
> You might posit that a retry mechanism operating at the end-to-end
> level is better positioned to allow this kind of intelligence to be
> brought to bear than a hop-to-hop retry mechanism.
> 
> Related scenario: Staples installs a new release of its MSH software,
> the new release has bugs that cause it to wrongly reject messages; we
> retry after Staples goes back to the old release or installs a fix.
> Similarly, an administrator at Staples messes with the configuration
> settings so that our messages are wrongly rejected, etc.
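
The negative-reply scenarios above can be summed up as a lookup from
what went wrong to what kind of retry, if any, could help.  This is
only a sketch of that reading of the scenarios: the enum names are
invented, and the disk-space case is filed under automatic retry even
though it could equally be left to the hop-to-hop mechanism.

```python
from enum import Enum


class Failure(Enum):
    EXPIRED_CERTIFICATE = "certificate in ds:keyInfo has expired"
    ENDPOINT_NOT_FOUND = "HTTP 404 or DNS error for the CPA address"
    RECEIVER_DISK_FULL = "receiving MSH is out of disk space"
    RECEIVER_BROKEN = "receiving MSH has a bug or bad configuration"


class Action(Enum):
    RETRY_AUTOMATICALLY = "keep retrying the same message"
    RETRY_AFTER_FIX = "retry the same message once the environment changes"
    SEND_CHANGED_MESSAGE = "the same message cannot succeed as it stands"


# One possible reading of the scenarios; not a normative table.
ACTIONS = {
    Failure.EXPIRED_CERTIFICATE: Action.SEND_CHANGED_MESSAGE,
    Failure.ENDPOINT_NOT_FOUND: Action.RETRY_AFTER_FIX,   # after a new CPA
    Failure.RECEIVER_DISK_FULL: Action.RETRY_AUTOMATICALLY,
    Failure.RECEIVER_BROKEN: Action.RETRY_AFTER_FIX,      # after a rollback
}


def action_for(failure: Failure) -> Action:
    """Look up which kind of retry, if any, could help for this failure."""
    return ACTIONS[failure]


for failure in Failure:
    print(f"{failure.value}: {action_for(failure).value}")
```
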
> 
> (The From MSH might have some kind of fancy features allowing
> administrators finer control over retry.  There might be commands like
> "stop retransmitting this message but keep it in the MSH so that we
> can commence retransmitting later".  None of this would be part of the
> normative protocol specification.  David F, I get the impression that
> you have in mind something like this.)
> 
> Then there are timeout scenarios, e.g. what you called "lack of DR".
> Chris said "If the DR is sent reliably, then its absence is
> significant cause for concern."  I agree, but we still have to figure
> out how to react if a DR does not appear after a "reasonable timeout".
> What scenarios might produce this?  Actually, we don't really need a
> "scenario" as such.  Reliable messaging still allows for the
> possibility that the sender still (after any given time interval) does
> not know whether the message has actually been delivered yet.  So a DR
> can take longer than any "reasonable timeout" even if there has been
> no failure.  If the From side wants to learn whether the message was
> ever received, it can either just keep waiting, or it can send a
> message, which might be exactly the same as the original message, or
> might be a Message Status Request.
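
A small sketch of why resending exactly the same message after a DR
timeout can be harmless, assuming the receiver remembers message IDs
and hands each one to the application at most once.  The Receiver
class, the "DR:" string, and the status values are placeholders made
up for this example, not the wire format of any specification.

```python
class Receiver:
    """Toy receiving MSH that eliminates duplicates by message ID."""

    def __init__(self) -> None:
        self.delivered = {}   # message ID -> payload given to the application
        self.deliveries = 0   # how many times the application was called

    def receive(self, message_id: str, payload: bytes) -> str:
        # Deliver a given message ID at most once, but acknowledge every copy
        # so that a lost receipt can be replaced by resending.
        if message_id not in self.delivered:
            self.delivered[message_id] = payload
            self.deliveries += 1
        return f"DR:{message_id}"

    def status(self, message_id: str) -> str:
        # Stand-in for a status query; the return values are placeholders.
        return "received" if message_id in self.delivered else "unknown"


receiver = Receiver()
print(receiver.status("msg-1"))             # unknown
receiver.receive("msg-1", b"order")         # first copy; suppose its DR is lost
print(receiver.receive("msg-1", b"order"))  # DR:msg-1
print(receiver.deliveries)                  # 1
print(receiver.status("msg-1"))             # received
```

Either route, the resend or the status query, tells the From side that
the message arrived without the order being processed twice.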
> 
> You mentioned "XML text corruption in transit".  If we are really
> concerned about data corruption that's not caught by the TCP checksum,
> then we really need to add an error-correcting-code as part of our own
> protocol.  If we don't add one, then we're clearly operating under the
> assumption that the transport layer can be trusted to never deliver
> corrupted data.  (Our failure model for the transport layer is that
> it's "unreliable" in the sense that it can drop messages, but it
> always detects data corruption and discards such messages, so it never
> delivers us corrupted bits.)
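
For a sense of what the TCP checksum does and does not catch, here is
a sketch of the 16-bit ones' complement sum in the style of RFC 1071,
applied to the payload only (real TCP also covers the header and a
pseudo-header, and the link layer usually adds its own CRC).  Because
it is a plain sum of 16-bit words, swapping two aligned words leaves
it unchanged, whereas a digest of our own such as SHA-256 differs.
The sample byte strings are invented for the example.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                           # pad to whole 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Same 16-bit words, but "02" and "17" have traded places.
original = b"qty=0002;id=0017"
swapped = b"qty=0017;id=0002"

same_sum = internet_checksum(original) == internet_checksum(swapped)
same_digest = (hashlib.sha256(original).digest()
               == hashlib.sha256(swapped).digest())

print(same_sum)     # True: the checksum cannot tell them apart
print(same_digest)  # False: the digest can
```

This says nothing about how often such corruption happens in practice,
which is the case that still has to be made.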
> 
> -- Dan
> 
> ----------------------------------------------------------------
> To subscribe or unsubscribe from this elist use the subscription
> manager: <http://lists.oasis-open.org/ob/adm.pl>