RE: [bt-spec] Issue 15 - Negative reply to BEGIN


Subject: RE: [bt-spec] Issue 15 - Negative reply to BEGIN

From: "Cho, Pyounguk" <PCho@iona.com>

To: 'Peter Furniss' <peter.furniss@choreology.com>,Sanjay Dalal <sanjay@bea.com>,"WEBBER,JIM (HP-UnitedKingdom,ex1)" <jim_webber@hp.com>,bt-spec@lists.oasis-open.org

Date: Fri, 25 Jan 2002 11:16:10 -0800


-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@choreology.com]
Sent: Friday, January 25, 2002 10:16 AM
To: Sanjay Dalal; WEBBER,JIM (HP-UnitedKingdom,ex1); bt-spec@lists.oasis-open.org
Subject: RE: [bt-spec] Issue 15 - Negative reply to BEGIN

I agree with Sanjay on the importance of boxcarring. But in sorting out the error cases, we should distinguish the semantically significant (= "related", &-notation) and the semantically insignificant (= "bundled", +-notation). Both are important and useful, but the error cases are different.

A related group would normally need only a single FAULT - as in the case that started this, which is BEGIN & CONTEXT, with a stale CONTEXT. A bundle would potentially get interesting mixtures.
[Cho, Pyounguk] I agree. Getting a single FAULT for related messages should make it as easy to keep track of what has been going on as it is for a single message. This is not the case for bundled messages, and I guess it might be useful to introduce a mechanism to force the order in which each message is processed. This would make FAULT message generation agnostic to the type of boxcarring, related or bundled.
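The difference in fault behaviour between the two groupings can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Fault` class, the `handle` callback (which returns a fault type name, or `None` on success) and both function names are hypothetical and do not come from the BTP specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Fault:
    fault_type: str
    index: Optional[int]  # position of the offending message; None = whole group


def process_related(messages: List[str],
                    handle: Callable[[str], Optional[str]]) -> List[Fault]:
    """Related group (&-notation): the first failure yields one FAULT for the group."""
    for msg in messages:
        error = handle(msg)
        if error is not None:
            return [Fault(error, None)]
    return []


def process_bundled(messages: List[str],
                    handle: Callable[[str], Optional[str]]) -> List[Fault]:
    """Bundle (+-notation): every message is handled on its own, so faults can mix."""
    faults = []
    for i, msg in enumerate(messages):
        error = handle(msg)
        if error is not None:
            faults.append(Fault(error, i))
    return faults
```

In this sketch a related group never produces more than one FAULT, while a bundle can produce one per message, which is where the "interesting mixtures" would come from.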

It would certainly be foolish to discard the bundling ability just because there might be more comprehensive ways of debugging it when it went wrong. Fault cases mean someone's code or configuration didn't work. The things caught by btp faults aren't "404 page not found" type things - they nearly all occur when it was working before, and the communications still work but something is messed up. Sounds like a case for the debugger. Rare in production, common in development.

We can undoubtedly improve the fault returns, but we shouldn't go overboard on it.

Peter

------------------------------------------
Peter Furniss
Technical Director, Choreology Ltd
web: http://www.choreology.com
email: peter.furniss@choreology.com
phone: +44 20 7670 1679
direct: +44 20 7670 1783
mobile: 07951 536168
13 Austin Friars, London EC2N 2JX

-----Original Message-----
From: Sanjay Dalal [mailto:sanjay@bea.com]
Sent: 25 January 2002 18:00
To: WEBBER,JIM (HP-UnitedKingdom,ex1); bt-spec@lists.oasis-open.org
Subject: RE: [bt-spec] Issue 15 - Negative reply to BEGIN

Jim,

I don't think it is a good idea to think of dropping boxcarring. I believe one-shot is one of the plus points for BTP. In a distributed system, the more work we can do with fewer I/Os, the better. It will also help implementers garner support from customers for using BTP. In the case of inter-enterprise communication using protocols such as HTTPS, opening a connection and performing the SSL handshake are very expensive. Although HTTP 1.1 specifies "persistent" connections, not all servers in the world implement them, and those that do, do so with some limitations.

I think we should spend some time establishing guidelines for handling failures in compound messages. We don't have to cover every possible failure case; we can leave the rest up to implementers. In my opinion, the advantages far outweigh the disadvantages in this case.

sanjay

-----Original Message-----
From: WEBBER,JIM (HP-UnitedKingdom,ex1) [mailto:jim_webber@hp.com]
Sent: Friday, January 25, 2002 9:35 AM
To: Sanjay Dalal; BTP Specification List (bt-spec@lists.oasis-open.org)
Subject: RE: [bt-spec] Issue 15 - Negative reply to BEGIN

Sanjay:

Mark L and I just had a conversation along these lines. We now think for interoperability reasons that either we invest some significant effort in getting the semantics of boxcarring properly formed (especially w.r.t. the fault cases), or else we drop it entirely and leave any possible transport optimisations to the network.

At this point I personally could jump either way. Boxcarring will cause us work if we decide to make a go of it, but on the up-side the Messages message can be used to improve message throughput in a SOAP server (by parallelising the message processing) which is cool.

Jim
--
Dr. Jim Webber
Hewlett-Packard Arjuna Lab
http://www.arjuna.com

-----Original Message-----
From: Sanjay Dalal [mailto:sanjay@bea.com]
Sent: 25 January 2002 17:28
To: Mark Little; Tony Fletcher; BT - spec
Subject: RE: [bt-spec] Issue 15 - Negative reply to BEGIN

>I suppose one could use a qualifier on FAULT to point to the message / parameter that caused the problem.

>>Since we can potentially compound many different messages and not just BEGIN/CONTEXT, I would suggest that we need some general mechanism that allows users to determine exactly where the error occurred.

I agree on having a general mechanism to determine where and what exactly happened in case of faults with compound messages. A reference mechanism will be needed especially for FAULTs in compound messages.

Also, do we describe what should happen if a fault occurs for the 2nd message in a compound message made up of a group of 3 related messages? Should the actions of the whole group be reversed, or only those of the failed message? Compound messages are processed in order. That means, if the 2nd message fails, the 3rd message is never processed. However, what about the effects of the 1st message? Should they be reversed? This is even more difficult if an application message is present in that group. Is the effect of a compound message group atomic? I think not. So, it seems failures in compound messages deserve a paragraph or two in the "Failure Recovery" section to address all such questions.
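The in-order, non-atomic behaviour described above could look something like the following minimal sketch. The function name, the report layout and the boolean `handle` callback are assumptions made for illustration; nothing here is taken from the specification, and no reversal of earlier messages is attempted.

```python
from typing import Callable, Dict, List


def process_in_order(messages: List[str],
                     handle: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Process messages in order and stop at the first failure.

    Messages before the failure stay applied (no rollback), the failing
    message is recorded, and everything after it is never processed.
    """
    report: Dict[str, List[str]] = {"applied": [], "failed": [], "skipped": []}
    for position, msg in enumerate(messages):
        if handle(msg):
            report["applied"].append(msg)
        else:
            report["failed"].append(msg)
            report["skipped"] = list(messages[position + 1:])
            break
    return report
```

With three messages where the 2nd fails, the report shows the 1st as applied, the 2nd as failed and the 3rd as skipped, which is exactly the state the "Failure Recovery" text would need to say something about.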

sanjay

-----Original Message-----
From: Mark Little [mailto:mark_little@hp.com]
Sent: Friday, January 25, 2002 4:54 AM
To: Tony Fletcher; BT - spec
Subject: Re: [bt-spec] Issue 15 - Negative reply to BEGIN

>I guess that the spec is not very heavy at present on error reporting - it takes a relatively light, but probably adequate, touch.

I think as far as error reporting is concerned "adequate" is not good enough in a distributed system. It's hard enough to track down problems locally without having to factor in remoteness.

>I think that what you are highlighting is actually a problem when you have more than one BTP message travelling together - which one does a FAULT message in reply refer to? I suppose one could use a qualifier on FAULT to point to the message / parameter that caused the problem.

>With regards to BEGIN, would you like to propose some text for us to consider?

Since we can potentially compound many different messages and not just BEGIN/CONTEXT, I would suggest that we need some general mechanism that allows users to determine exactly where the error occurred. However, specifically for BEGIN I'd suggest adding text along the lines of "If BEGIN is accompanied by a CONTEXT then the additional FAULT of WrongState may be returned."
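One possible shape for such a mechanism, sketched under loud assumptions: the `Message` and `Fault` classes, the `refers_to` qualifier fields, the `live_contexts` set and the `check_begin` function are all hypothetical names for illustration only. Only the fault name WrongState and the BEGIN/CONTEXT pairing come from the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Message:
    kind: str                         # e.g. "BEGIN" or "CONTEXT"
    context_id: Optional[str] = None  # only meaningful for CONTEXT


@dataclass(frozen=True)
class Fault:
    fault_type: str
    refers_to: int        # hypothetical qualifier: index of the offending message
    refers_to_kind: str   # hypothetical qualifier: kind of the offending message


def check_begin(compound: List[Message], live_contexts: Set[str]) -> Optional[Fault]:
    """Return a WrongState fault if a BEGIN travels with a stale CONTEXT."""
    if not any(m.kind == "BEGIN" for m in compound):
        return None
    for index, msg in enumerate(compound):
        if msg.kind == "CONTEXT" and msg.context_id not in live_contexts:
            return Fault("WrongState", index, msg.kind)
    return None
```

Here the receiver of the FAULT can tell from the qualifier that the CONTEXT, not the BEGIN itself, was the problem. The same index-based reference would work for any other compound message.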

Mark.

----------------------------------------------
Dr. Mark Little, Distinguished Engineer,
Transactions Architect, HP Arjuna Labs
Email: mark_little@hp.com
Phone: +44 191 2606216
Fax : +44 191 2606250