I agree with Sanjay on the importance of boxcarring. But in sorting out
the error cases, we should distinguish the semantically significant (=
"related", &-notation) from the semantically insignificant (= "bundled",
+-notation). Both are important and useful, but the error cases are
different.
A related group would normally need only a single FAULT, as in the case that
started this, which is BEGIN & CONTEXT with a stale CONTEXT. A bundle could
potentially produce interesting mixtures of faults.
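
The difference between the two error cases can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything defined by the spec: the function names, the handler signature, the message names M1..M3 and the fault name "OtherFault" are all hypothetical. A related group reports one FAULT for the whole group, while a bundle reports a FAULT per failing message.

```python
from typing import Callable, List, Optional

# Hypothetical handler type: returns a fault name, or None on success.
Handler = Callable[[str], Optional[str]]


def faults_for_related(messages: List[str], handler: Handler) -> List[str]:
    """Related group (&-notation): one FAULT covers the whole group."""
    for name in messages:
        fault = handler(name)
        if fault is not None:
            group = " & ".join(messages)
            return [f"FAULT({fault}) for {group}"]
    return []


def faults_for_bundle(messages: List[str], handler: Handler) -> List[str]:
    """Bundle (+-notation): messages are independent, so faults can mix."""
    faults = []
    for name in messages:
        fault = handler(name)
        if fault is not None:
            faults.append(f"FAULT({fault}) for {name}")
    return faults


# Illustrative failures only; "M1".."M3" and "OtherFault" are made up.
failing = {"CONTEXT": "WrongState", "M1": "WrongState", "M3": "OtherFault"}

related_faults = faults_for_related(["BEGIN", "CONTEXT"], failing.get)
bundle_faults = faults_for_bundle(["M1", "M2", "M3"], failing.get)
print(related_faults)
print(bundle_faults)
```

In this sketch the stale CONTEXT yields a single fault for BEGIN & CONTEXT, whereas the bundle yields two unrelated faults that the receiver has to match back to individual messages.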
It would certainly be foolish to discard the bundling ability just because there
might be more comprehensive ways of debugging it when it goes wrong. Fault cases
mean someone's code or configuration didn't work. The things caught by BTP
faults aren't "404 page does not exist" type things: they nearly all occur when
it was working before and the communications still work, but something is messed
up. Sounds like a case for the debugger. Rare in production, common in
development.
We can undoubtedly improve the fault returns, but we shouldn't go overboard on
it.
Peter
------------------------------------------
Peter Furniss
Technical Director, Choreology Ltd
web: http://www.choreology.com
email: peter.furniss@choreology.com
phone: +44 20 7670 1679
direct: +44 20 7670 1783
mobile: 07951 536168
13 Austin Friars, London EC2N 2JX
Jim,
I don't think it is a good idea to think of dropping boxcarring. I believe
one-shot is one of the plus points for BTP. In a distributed system, the more
work we can do with fewer I/Os, the better. It will also help
implementers garner support from customers for using BTP. In the case of
inter-enterprise communication using protocols such as HTTPS, opening a
connection and performing the SSL handshake is very expensive. Although HTTP
1.1 specifies "persistent" connections, not all
servers in the world implement them, and those that do, do so with some
limitations.
I think we should spend some time establishing guidelines for handling
failures in compound messages. We don't have to cover all possible
failure cases; we can leave those up to implementers. In my opinion, the
advantages far outweigh the disadvantages in this case.
sanjay
Sanjay:
Mark L and I just had a conversation along these lines. We now think
for interoperability reasons that either we invest some significant effort
in getting the semantics of boxcarring properly formed (especially w.r.t.
the fault cases), or else we drop it entirely and leave any possible
transport optimisations to the network.
At this point I personally could jump either way. Boxcarring will cause us work
if we decide to make a go of it, but on the up-side the Messages message can
be used to improve message throughput in a SOAP server (by parallelising the
message processing), which is cool.
> I suppose one could use a qualifier on FAULT to point to the
> message / parameter that caused the problem.
>> Since we can potentially compound many different messages and not just
>> BEGIN/CONTEXT, I would suggest that we need some general mechanism that
>> allows users to determine exactly where the error occurred.
I agree on having a general mechanism to
determine where and what exactly happened in case of faults with compound
messages. A reference mechanism will be needed especially for FAULTs in
compound messages.
Also, do we describe what should happen if a fault occurs for the 2nd
message of a compound message carrying a group of 3 related messages?
Should the actions of the whole group be reversed, or only those of the
failed message? Compound messages are processed in order. That means that
if the 2nd message fails, the 3rd message is never processed. However, what
about the effects of the 1st message? Should they be
reversed? This is even more difficult if an application message is present
in that group. Is the effect of a compound message group atomic? I think
not. So it seems failures in compound messages deserve a paragraph
or two in the "Failure Recovery" section to address all such
questions.
sanjay
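
To make the questions above concrete, here is a minimal sketch of in-order processing with a FAULT that points back at the failing message. Everything in it is an assumption for illustration: the `Fault` fields, `MessageError`, `process_compound` and the message names are hypothetical and not taken from the BTP spec. It deliberately does not reverse the first message, so the open question about atomicity is visible in the result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Fault:
    # Hypothetical FAULT carrying a qualifier that locates the failure.
    fault_type: str
    failed_index: int    # position of the failing message in the compound
    failed_message: str  # name of the failing message


class MessageError(Exception):
    """Raised by a handler when a single message cannot be processed."""

    def __init__(self, fault_type: str):
        super().__init__(fault_type)
        self.fault_type = fault_type


def process_compound(
    messages: List[str], handler: Callable[[str], None]
) -> Tuple[List[str], Optional[Fault]]:
    """Process messages in order and stop at the first failure.

    Returns the names of messages whose effects were applied, plus a
    Fault describing the failing message (or None if all succeeded).
    """
    applied: List[str] = []
    for index, name in enumerate(messages):
        try:
            handler(name)
        except MessageError as err:
            # Later messages are never processed; earlier ones stay applied.
            return applied, Fault(err.fault_type, index, name)
        applied.append(name)
    return applied, None


def fail_second(name: str) -> None:
    # Illustrative handler: only the message named "SECOND" fails.
    if name == "SECOND":
        raise MessageError("WrongState")


applied, fault = process_compound(["FIRST", "SECOND", "THIRD"], fail_second)
print(applied, fault)
```

Here the first message remains applied, the third is never attempted, and the fault identifies the second message by position and name. Whether the first message's effects should then be reversed is exactly the policy question the "Failure Recovery" text would need to answer.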
I guess that the spec is not very heavy at
present on error reporting - it takes a relatively light, but probably
adequate, touch.
I think as far as error reporting is concerned
"adequate" is not good enough in a distributed system. It's hard enough
to track down problems locally without having to factor in
remoteness.
I think that what you are highlighting
is actually a problem when you have more than one BTP message
travelling together - which one does a FAULT message in reply refer
to? I suppose one could use a qualifier on FAULT to point to the
message / parameter that caused the problem.
With regards to BEGIN, would you like to
propose some text for us to consider?
Since we can potentially compound many
different messages and not just BEGIN/CONTEXT, I would suggest that we
need some general mechanism that allows users to determine exactly where
the error occurred. However, specifically for BEGIN I'd suggest adding
text along the lines of "If BEGIN is accompanied by a CONTEXT then the
additional FAULT of WrongState may be returned."
Mark.
----------------------------------------------
Dr. Mark Little,
Distinguished Engineer, Transactions Architect, HP Arjuna Labs
Email: mark_little@hp.com
Phone: +44 191 2606216
Fax: +44 191 2606250