Subject: RE: [ebxml-cppa] Re: Sachs 10/14/2002: [ebxml-msg] Reliable Messaging
Dear Marty, Monica and others,

I agree that reliable messaging problems need to be addressed in messaging if at all possible - but then users of messaging should be aware of any 'quirks' and may need to address those in the protocols that use messaging, if they are deemed to be important.

May I also gently point out that a transaction protocol such as BTP can be used to circumvent this problem for users. BTP does not rely on reliable messaging. It does have sufficient complexity and includes all the procedures, including logging (for protocol purposes), to provide for reliable processing of messages (not just reliable delivery), including recovery after system failures such as the one postulated below. If the protocol does fail (and all do under some extreme conditions), then it has a good go at signalling the failure to process the message correctly.

Best Regards
Tony

A M Fletcher
Cohesions 1.0 (TM)
Business transaction management software for application coordination
Choreology Ltd., 13 Austin Friars, London EC2N 2JX UK
Tel: +44 (0) 20 76701787 Mobile: +44 (0) 7801 948219
tony.fletcher@choreology.com <mailto:tony.fletcher@choreology.com> (Home: amfletcher@iee.org)

-----Original Message-----
From: Martin W Sachs [mailto:mwsachs@us.ibm.com]
Sent: 14 October 2002 18:22
To: ebxml-cppa@lists.oasis-open.org
Subject: [ebxml-cppa] Re: Sachs 10/14/2002: [ebxml-msg] Reliable Messaging

FYI

*************************************************************************************
Martin W. Sachs
IBM T. J. Watson Research Center
P. O. B. 704
Yorktown Hts, NY 10598
914-784-7287; IBM tie line 863-7287
Notes address: Martin W Sachs/Watson/IBM
Internet address: mwsachs @ us.ibm.com
*************************************************************************************

----- Forwarded by Martin W Sachs/Watson/IBM on 10/14/2002 01:21 PM -----

From: Martin W Sachs/Watson/IBM@IBMUS
To: "Monica Martin" <mmartin@certivo.net>
cc: "Monica Martin" <mmartin@certivo.net>
Date: 10/14/2002 01:21 PM
Subject: Re: Sachs 10/14/2002: [ebxml-msg] Reliable Messaging (Document link: Martin W. Sachs)

Monica,

If we had a "BSI" specification, the reliable messaging recovery protocol could be specified there, since the reliable messaging recovery protocol can be viewed as belonging to the layer above the MSH. Given that we do not have a BSI specification and the reliable messaging protocol is already in the MSG specification, adding the additional function to handle system failures is best done in the MSG specification.

If by "Business Collaboration Protocol" you mean BPSS, I guess you could consider putting the reliable messaging recovery protocol there, but then you run into the reality that use of BPSS is recommended but not required. The reliable messaging protocol should be complete in the MSG specification so that it can be used with whatever higher level protocols are present.

Regards,
Marty

From: "Monica Martin" <mmartin@certivo.net>
To: Martin W Sachs/Watson/IBM@IBMUS
cc: "Monica Martin" <mmartin@certivo.net>
Date: 10/14/2002 12:17 PM
Subject: Sachs 10/14/2002: [ebxml-msg] Reliable Messaging

Marty,

Does this happen at the application transport level (ebMS) or the higher level business process (and possibly human-to-system)? This may be where a discussion comes in about Business Collaboration Protocol, which could handle the state at the BSI/BSV, transaction and collaboration levels. It is not syntax specific, and appears to lie above the application transport of ebMS. Thoughts?

Monica

-----Original Message-----
From: Martin W Sachs [mailto:mwsachs@us.ibm.com]
Sent: Monday, October 14, 2002 9:46 AM
To: ebxml-msg@lists.oasis-open.org
Subject: [ebxml-msg] reliable messaging

It has been pointed out to me that ebXML reliable messaging is not reliable under system failure. At least one person who mentioned it considers ebXML messaging to be broken as a result.

Here is a scenario: Party A sends a message reliably to Party B. Party B's MSH receives and persists the message. Party B's MSH attempts to send the reliable-messaging acknowledgment, but Party B's system goes down before the acknowledgment gets on the wire. Party A exhausts its retries and concludes that the message was not delivered. Party B eventually comes up and the destination application processes the persisted message as prescribed in the MSG specification.

Parties A and B are now out of sync with respect to that transaction and do not know they are out of sync. Party A believes that the transaction failed. Party B has in fact processed the message that it received from Party A. Reliable messaging has failed to deliver on its promise.

The solution to this problem is not trivial and the MSG team needs to give it a lot of thought. At a minimum, the following are needed in the spec:

1. Both parties to the message exchange MUST persist enough state to allow recovery and getting back in sync. Specific state variables must be prescribed. They are at least those variables needed to restore the state of the transaction and conversation after system recovery, such as the conversation ID, CPA Id, service, action, and perhaps other parts of the message header.

2. Timeouts and retries, as prescribed in the MSG spec, are not sufficient to cover system failures, since the failure could last a very long time. Instead, if the party that sent the message doesn't receive a reply in a reasonable time, it must be able to send a status query to the other party and keep requesting status periodically until it receives a response. The status query protocol must be defined in the MSG specification. If the appropriate state information is persisted at both ends, when Party B comes up, it will receive and respond properly to the status query. The timeouts could be retained in the spec, but their main use would be to signal the "attached human" to make a phone call.

The MSG team should consider this a work item for version 3. Should the team not wish to solve this problem, at the very least, a caveat should be added to the MSG specification that messaging reliability under conditions of system failure is outside the scope of the MSG team.

Regards,
Marty

----------------------------------------------------------------
To subscribe or unsubscribe from this elist use the subscription manager: <http://lists.oasis-open.org/ob/adm.pl>
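
The failure scenario in the original message, together with the two proposed remedies (persisted state and a status query), can be sketched roughly as follows. This is only an illustrative simulation under stated assumptions, not anything taken from the MSG specification: the names PersistedState, Party, send_reliably and run_scenario, the status values "SENT", "RECEIVED" and "PROCESSED", and the in-memory dictionary standing in for durable storage are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class PersistedState:
    """State both parties persist (item 1). Field names are illustrative."""
    message_id: str
    conversation_id: str
    cpa_id: str
    service: str
    action: str
    status: str  # hypothetical values: "SENT", "RECEIVED", "PROCESSED"


class Party:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        # Stands in for durable storage that survives a crash (assumption).
        self.store: Dict[str, PersistedState] = {}

    def receive(self, msg: PersistedState, crash_before_ack: bool = False) -> bool:
        """Persist the message; return True only if the ack got on the wire."""
        if not self.up:
            return False
        self.store[msg.message_id] = replace(msg, status="RECEIVED")
        if crash_before_ack:
            self.up = False  # system goes down before the ack is sent
            return False
        return True

    def recover(self) -> None:
        """Come back up and process whatever was persisted before the crash."""
        self.up = True
        for state in self.store.values():
            if state.status == "RECEIVED":
                state.status = "PROCESSED"

    def status_query(self, message_id: str) -> Optional[str]:
        """Answer a status query (item 2); None models 'no response'."""
        if not self.up:
            return None
        state = self.store.get(message_id)
        return state.status if state else "UNKNOWN"


def send_reliably(sender: Party, receiver: Party, msg: PersistedState,
                  retries: int = 3, crash_before_ack: bool = False) -> bool:
    """Retry-based delivery; returns True if an acknowledgment was received."""
    sender.store[msg.message_id] = msg
    for attempt in range(retries):
        crash = crash_before_ack and attempt == 0
        if receiver.receive(msg, crash_before_ack=crash):
            return True
    return False  # retries exhausted: sender concludes delivery failed


def run_scenario():
    a, b = Party("A"), Party("B")
    msg = PersistedState("msg-1", "conv-1", "cpa-1", "OrderService", "NewOrder", "SENT")
    a_believes_failed = not send_reliably(a, b, msg, crash_before_ack=True)
    query_while_down = b.status_query("msg-1")   # B is still down
    b.recover()                                  # B processes the persisted message
    query_after_recovery = b.status_query("msg-1")
    return a_believes_failed, query_while_down, query_after_recovery
```

In this sketch Party A's retries alone leave it believing the message failed, while a status query repeated after Party B recovers reveals that the message was in fact processed, which is the resynchronization step the second item asks for.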