I agree with Sanjay on the importance of boxcarring. But in sorting out
the error cases, we should distinguish the semantically significant
(= "related", &-notation) and the semantically insignificant (= "bundled",
+-notation). Both are important and useful, but the error cases are
different.
A related group would normally need only a single
FAULT - as in the case that started this, which is BEGIN & CONTEXT, with a
stale CONTEXT. A bundle would potentially get interesting mixtures.
[Cho, Pyounguk] I agree. Getting a single FAULT
in the case of related messages should make it as easy to keep track of
what's been going on as in the case of a single message. This is not the case
for bundled messages, and I guess it might be useful to
introduce a mechanism to force the order in which each message is
processed. This would make FAULT message generation agnostic w.r.t. the
type of boxcarring, related or bundled.
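The distinction between the two notations can be sketched roughly as follows. This is only an illustration, not text or behaviour taken from the BTP specification: the function names `process_related` and `process_bundled`, the handler signature, and the message names in the demo handler are all invented for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical handler: returns a fault type name, or None if the message succeeded.
Handler = Callable[[str], Optional[str]]


def process_related(messages: List[str], handler: Handler) -> Optional[Tuple[int, str]]:
    """Related group (&-notation): stop at the first fault and report
    one (position, fault type) pair for the whole group."""
    for index, msg in enumerate(messages):
        fault = handler(msg)
        if fault is not None:
            return (index, fault)
    return None


def process_bundled(messages: List[str], handler: Handler) -> List[Optional[str]]:
    """Bundle (+-notation): the messages are independent, so each one gets
    its own outcome and the reply can be a mixture of successes and faults."""
    return [handler(msg) for msg in messages]


def demo_handler(msg: str) -> Optional[str]:
    # Assumption for the demo only: a stale CONTEXT is the one thing that faults.
    return "WrongState" if msg == "CONTEXT(stale)" else None
```

With this sketch, `process_related(["BEGIN", "CONTEXT(stale)"], demo_handler)` yields a single fault for the group, while a bundle containing the same stale CONTEXT yields one entry per message.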
It would certainly be foolish to discard the bundling ability just because there
might be more comprehensive ways of debugging it when it goes wrong. Fault
cases mean someone's code or configuration didn't work. The things caught
by BTP faults aren't "404 page does not exist" type things - they nearly all occur
when it was working before, and the communications still work but
something is messed up. Sounds like a case for the debugger. Rare in
production, common in development.
We can undoubtedly improve the fault returns, but we shouldn't go overboard on
it.
Peter
------------------------------------------
Peter Furniss
Technical Director, Choreology Ltd
web: http://www.choreology.com
email: peter.furniss@choreology.com
phone: +44 20 7670 1679
direct: +44 20 7670 1783
mobile: 07951 536168
13 Austin Friars, London EC2N 2JX
Jim,
I don't think it is a good idea to think of dropping boxcarring. I believe
one-shot is one of the plus points for BTP. In a distributed system, the more
work we can do with fewer I/Os, the better. It will also
help implementers garner support from customers for using BTP. In the case
of inter-enterprise communication using protocols such as HTTPS, opening
a connection and performing the SSL handshake is very expensive. Although HTTP
1.1 specifies "persistent" connections, not all
servers in the world implement them, and those that do, do so with some
limitations.
I think we should spend some time establishing guidelines for handling
failures in compound messages. We don't have to cover all possible
failure cases; we can leave those up to implementers. In my opinion, the
advantages far outweigh the disadvantages in this case.
sanjay
Sanjay:
Mark L and I just had a conversation along these lines. We now
think for interoperability reasons that either we invest some significant
effort in getting the semantics of boxcarring properly formed (especially
w.r.t. the fault cases), or else we drop it entirely and leave any
possible transport optimisations to the network.
At this point I personally could jump either way. Boxcarring will
cause us work if we decide to make a go of it, but on the up-side the
Messages message can be used to improve message throughput in a SOAP
server (by parallelising the message processing) which is
cool.
>I suppose one could use a qualifier on FAULT to point to the
>message / parameter that caused the problem.
>>Since we can potentially compound many different messages and not just
>>BEGIN/CONTEXT, I would suggest that we need some general mechanism that
>>allows users to determine exactly where the error occurred.
I agree on having a general mechanism to
determine where and what exactly happened in case of faults with
compound messages. A reference mechanism will be needed especially for
FAULTs in compound messages.
Also, do we describe what should
happen if a fault occurs for the 2nd message of a compound message
containing a group of 3 related messages? Should the actions of the whole
group be reversed, or only those of the failed message? Compound messages
are processed in order. That means, if the 2nd message fails, the 3rd message
is never processed. However, what about the effects of the 1st message?
Should they be reversed? This is even more difficult if an application
message is present in that group. Is the effect of a compound message
group atomic? I think not. So, it seems failures in compound
messages deserve a paragraph or two in the "Failure Recovery"
section to address all such questions.
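To make the scenario concrete, here is a minimal sketch that assumes only the in-order processing described above. The `GroupOutcome` type and `process_in_order` function are hypothetical names, and the sketch deliberately does not decide whether the already-applied messages should be reversed; it just records which positions were applied, which one failed, and which were never processed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class GroupOutcome:
    applied: List[int] = field(default_factory=list)   # positions whose effects took place
    failed: Optional[int] = None                        # position of the message that faulted
    skipped: List[int] = field(default_factory=list)   # positions never processed


def process_in_order(messages: List[str], succeeds: Callable[[str], bool]) -> GroupOutcome:
    """Process messages strictly in order; nothing after the first failure is processed."""
    outcome = GroupOutcome()
    for position, msg in enumerate(messages):
        if outcome.failed is not None:
            outcome.skipped.append(position)
        elif succeeds(msg):
            outcome.applied.append(position)
        else:
            outcome.failed = position
    return outcome
```

For a three-message group whose 2nd message fails, the outcome lists position 0 as applied, position 1 as failed and position 2 as skipped. What to do about position 0 is exactly the open question.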
sanjay
I guess that the spec is not very heavy at
present on error reporting - it takes a relatively light, but
probably adequate, touch.
I think as far as error reporting is
concerned "adequate" is not good enough in a distributed system. It's
hard enough to track down problems locally without having to factor in
remoteness.
I think that what you are
highlighting is actually a problem when you have more than one BTP
message travelling together - which one does a FAULT message in
reply refer to? I suppose one could use a qualifier on FAULT
to point to the message / parameter that caused the
problem.
With regards to BEGIN, would you like to
propose some text for us to consider?
Since we can potentially compound many
different messages and not just BEGIN/CONTEXT, I would suggest that we
need some general mechanism that allows users to determine exactly
where the error occurred. However, specifically for BEGIN I'd suggest
adding text along the lines of "If BEGIN is accompanied by a CONTEXT
then the additional FAULT of WrongState may be
returned."
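One way the two suggestions could fit together is sketched below: a FAULT that carries a qualifier pointing at the offending message, applied to the BEGIN plus CONTEXT case. Everything here is an assumption made for illustration, including the `Fault` type, the `message_index` qualifier, the `check_begin` function, and the idea that a stale CONTEXT is one whose identifier is not in a set of active ones. None of it is taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Fault:
    fault_type: str
    # Hypothetical qualifier: position of the offending message in the compound.
    message_index: Optional[int] = None


def check_begin(compound: List[Tuple[str, Optional[str]]],
                active_context_ids: Set[str]) -> Optional[Fault]:
    """If BEGIN is accompanied by a CONTEXT that is not active, return a
    WrongState fault that points at the CONTEXT; otherwise return None."""
    names = [name for name, _ in compound]
    if "BEGIN" not in names:
        return None
    for index, (name, context_id) in enumerate(compound):
        if name == "CONTEXT" and context_id not in active_context_ids:
            return Fault("WrongState", message_index=index)
    return None
```

A receiver of such a fault could then tell from `message_index` that the CONTEXT, not the BEGIN, was the cause.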
Mark.
----------------------------------------------
Dr. Mark Little,
Distinguished Engineer, Transactions Architect,
HP Arjuna Labs
Email: mark_little@hp.com
Phone: +44 191 2606216
Fax : +44 191 2606250