
ebxml-msg message


Subject: Re: [ebxml-msg] reliable messaging


ebMS *does* indeed provide such a status query. Granted, its required use in the
failure mode you describe is not specified (it easily could be). I do not believe that
the protocol is necessarily broken in this regard; however, it could certainly be reinforced
and made clearer.

I should also point out that no matter how hard one tries, it is impossible to close the
loop entirely. If B never recovers, then A and B are permanently and irreconcilably
out of sync with respect to their shared understanding of the state of the exchange.

Further comments below.


Christopher Ferris
Architect, Emerging e-business Industry Architecture
email: chrisfer@us.ibm.com
phone: +1 508 234 3624

Martin W Sachs/Watson/IBM@IBMUS wrote on 10/14/2002 11:46:01 AM:

> It has been pointed out to me that ebXML reliable messaging is not reliable
> under system failure.  At least one person who mentioned it considers ebXML
> messaging to be broken as a result.  Here is a scenario:
> Party A sends a message reliably to Party B.
> Party B's MSH receives and persists the message.
> Party B's MSH attempts to send the reliable-messaging acknowledgment but
> Party B's system goes down before the acknowledgment gets on the wire.
> Party A exhausts its retries and concludes that the message was not
> delivered.
> Party B eventually comes up and the destination application processes the
> persisted message as prescribed in the MSG specification.
> Parties A and B are now out of sync with respect to that transaction and do
> not know they are out of sync. Party A believes that the transaction
> failed. Party B has in fact processed the message that it received from
> Party A. Reliable messaging has failed to deliver on its promise.
> The solution to this problem is not trivial and the MSG team needs to give
> it a lot of thought.  At a minimum, the following are needed in the spec:
> 1.  Both parties to the message exchange MUST persist enough state to allow
> recovery and getting back in sync. Specific state variables must be

This is already prescribed in the spec.

> prescribed.  They are at least those variables needed to restore the state
> of the transaction and conversation after system recovery, such as the
> conversation ID, CPA Id, service, action, and perhaps other parts of the
> message header.
> 2. Timeouts and retries, as prescribed in the MSG spec, are not sufficient
> to cover system failures since the failure could last a very long time.
> Instead, if the party that sent the message doesn't receive a reply in a
> reasonable time, it must be able to send a status query to the other party
> and keep requesting status periodically until it receives a response.  The
> status query protocol must be defined in the MSG specification. If the

The protocol is defined; see section 7.

> appropriate state information is persisted at both ends, when party B comes
> up, it will receive and respond properly to the status query.  The timeouts
> could be retained in the spec but their main use would be to signal the
> "attached human" to make a phone call.

That is always an option:)
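
To make the failure mode and the status-query recovery concrete, here is a minimal
in-memory sketch. It is not ebMS wire format and not taken from the spec: the class
ReceiverMSH, the functions send_reliably and reconcile, and the status strings are all
hypothetical names chosen for illustration. It assumes only what the scenario states:
B persists the message, fails before the acknowledgment gets on the wire, and can
answer a status query once it is back up.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiverMSH:
    """Hypothetical receiving MSH that persists a message before acknowledging it."""
    up: bool = True
    persisted: set = field(default_factory=set)  # stands in for durable storage

    def deliver(self, message_id: str) -> bool:
        """Persist the message; return True only if an acknowledgment was sent."""
        if not self.up:
            return False
        self.persisted.add(message_id)
        # Simulate the scenario: the system goes down after persisting,
        # before the acknowledgment gets on the wire.
        self.up = False
        return False

    def status(self, message_id: str):
        """Answer a status query, or return None while the system is down."""
        if not self.up:
            return None
        # Illustrative status strings, not a statement of the spec's values.
        return "Received" if message_id in self.persisted else "NotRecognized"


def send_reliably(receiver: ReceiverMSH, message_id: str, retries: int = 3) -> str:
    """Send with retries; the outcome is 'in-doubt' (not 'failed') if no ack arrives."""
    for _ in range(retries):
        if receiver.deliver(message_id):
            return "acknowledged"
    return "in-doubt"


def reconcile(receiver: ReceiverMSH, message_id: str, max_polls: int) -> str:
    """Poll with status queries until the receiver answers or polls run out."""
    for _ in range(max_polls):
        answer = receiver.status(message_id)
        if answer is not None:
            return answer
    return "in-doubt"  # still unknown: time to signal the attached human


if __name__ == "__main__":
    b = ReceiverMSH()
    print(send_reliably(b, "msg-1"))   # retries exhausted while B is down
    print(reconcile(b, "msg-1", 5))    # B still down, nothing learned
    b.up = True                        # B recovers with its persisted state
    print(reconcile(b, "msg-1", 5))    # A learns the message did arrive
```

The design point is that exhausting retries leaves A in doubt rather than certain of
failure; only a status response resolves that. If B never comes back, reconcile never
returns anything but "in-doubt", which is the case where the loop cannot be closed.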

> The MSG team should consider this a work item for version 3. Should the
> team not wish to solve this problem, at the very least, a caveat should be
> added to the MSG specification that messaging reliability under conditions
> of system failure is outside the scope of the MSG team.

Again, I believe that many of your concerns are already addressed. There is no
doubt in my mind that the relevant parts of the spec could be reinforced to make
this abundantly clear to the reader.

> Regards,
> Marty
> *************************************************************************************
> Martin W. Sachs
> IBM T. J. Watson Research Center
> P. O. B. 704
> Yorktown Hts, NY 10598
> 914-784-7287;  IBM tie line 863-7287
> Notes address:  Martin W Sachs/Watson/IBM
> Internet address:  mwsachs @ us.ibm.com
> *************************************************************************************
