Subject: Re: Heuristics in BTP atoms
Sazi, just a few comments on your (essentially 3-phase) protocol
extensions:
First of all I have to say that I'm against making any more
significant modifications to the protocol at this final stage that aren't
strictly necessary. We should all realise that no single protocol can ever be
the solution to all of the world's problems. If we were to try to make BTP do
this then it would become a bloated protocol that no-one would ever use. The
best we can really do is try for the 80% case, whereby we have a protocol that
works perfectly for 80% of the applications that want to use it, and either
doesn't work, or works less efficiently for the other 20%. I'd be satisfied with
this. In my opinion BTP already does this.
I'd also like to point out that way back at the start of this I
suggested a three-phase protocol had some merits for *some* applications, but
not all. Then was the time to discuss it in more detail if people really felt
strongly about it (I didn't), not now.
So what you're saying is that a participant that confirms shouldn't
really do the work until it has received this other message? So, for example, I
shouldn't really dispatch my books to the purchaser until I find out from the
coordinator whether or not the insurance was actually able to confirm as well. I
can see applications where this might be a good idea. However, how long do I
wait for this "actual confirm" message to happen? What if I can't actually
confirm when it turns up? Add a fourth message? Why can't the application
programmer, or service provider, simply implement for this, and use compensation
if that's the case? In this situation, for example, the bookshop could have an
"insurance failed to complete" method that is invoked at the *application* level
within *another* atom/cohesion, that either stops the books, uses an insurance
company associated with the bookshop, or something else that might make
application sense.
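As a rough sketch of that application-level approach (the class, method and
state names below are made up for illustration and are not part of BTP):

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class OrderState(Enum):
    CONFIRMED = auto()   # bookshop confirmed; books queued for dispatch
    HELD = auto()        # dispatch stopped by the compensation
    REINSURED = auto()   # covered by an insurer associated with the bookshop


@dataclass
class Bookshop:
    # Hypothetical service, used only to illustrate the idea.
    has_partner_insurer: bool = False
    orders: dict = field(default_factory=dict)

    def confirm(self, order_id: str) -> None:
        # An ordinary confirm: the participant acts on it straight away.
        self.orders[order_id] = OrderState.CONFIRMED

    def insurance_failed_to_complete(self, order_id: str) -> OrderState:
        # Invoked at the application level, within another atom/cohesion.
        if self.has_partner_insurer:
            self.orders[order_id] = OrderState.REINSURED
        else:
            self.orders[order_id] = OrderState.HELD
        return self.orders[order_id]
```

The protocol itself is unchanged here; what to do about the failed insurance
is decided entirely by the bookshop's own business logic.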
First of all that's for the application to sort out. If I want to
have this kind of guarantee then I should probably be looking at ACID
transactions without heuristics, rather than BTP. One protocol for one job, not
one protocol for all jobs.
Secondly, how am I as a service provider supposed to program now? I
get a confirm, and it's not really a confirm. In fact, it's very similar to a
prepare because I can't do any real work on the basis of its reception until I
get this third message. How long do I wait? Do I keep the resources
blocked/locked until I get this third phase message? It's starting to look like
ACID again, so what are the benefits to me from using this rather than, say, an
OTS implementation layered on SOAP? It's starting to look hideously complicated
as a protocol now, so perhaps I won't bother using BTP at all. Web Service users
expect things to be simple: we shouldn't disappoint them by making BTP more
complicated than it really needs to be.
Well, in fact we have to make a distinction between a participant
that definitely cancelled, and one which had failed by the time the confirm
message came along.
Well, if BEA wants to put that statement into their product
documentation then that's a company matter. However, I certainly won't be
encouraging HP or anyone else to tell programmers that they should ignore
CONTRADICTION messages. It's the same as telling them to ignore heuristics in
CICS, OTS, ... Not a good idea if you want to even attempt to maintain
consistency. They are hard things to resolve, that's true, but simply ignoring
them is looking for trouble.
And this will either be dealt with by an application specific
compensation (e.g., a workflow style), or even at the physical level when the
books wait at the warehouse for a shipper to turn up and no one does. I'm not
saying that a three-phase protocol isn't useful in *some* situations; only that
it isn't required in the majority and we shouldn't consider it for this round of
BTP.
It's not so that the cancelling participant can re-consider,
because in all likelihood it won't be able to. It's more so that some
administration system/person can use this for a number of reasons, e.g., look at
why the participant "failed", see if some compensation can be fired off
transparently, ...
But the confirmed participant hasn't contradicted the decision. It
has done *exactly* what the coordinator asked of it. It's like saying that in an
OTS implementation a Resource that throws a heuristic exception from commit
shouldn't be told to forget (rough equivalent of CONTRADICTION), but all of the
other "committed" participants should be. They (and their BTP equivalents) have
finished. They may well have gone away and tidied up. There may be no end-point
for them anymore. I don't believe CONTRADICTIONS are going to happen that often,
so I as a service implementer don't want to have to program compensations into
*every single* resource I write just on the off chance that it may be
needed.
So the coordinator sees nothing of this? What happens if the
"committed" participants can't uncommit? It seems like you're trying to make the
entire protocol atomic by removing the possibility of heuristics. Unfortunately
this isn't possible unless we tell implementers that they aren't allowed to
produce them, i.e., if they have a resource that says it will prepare, then it
*must* prepare, no matter how long it takes for the final commit message to come
in. A participant isn't allowed to make a unilateral decision at all.
That's certainly one protocol that some applications would find
useful. It's not, however, a protocol that HP would be interested in supporting
for Web Services, since it is no different from using true ACID
transactions.
But what is your definition of "long time"? The logs you refer to
are optional, and we make no call as to how long they have to be maintained for
anyway.
Yes, but you are extending this to require participants who make
decisions *only* at the behest of the coordinator to also keep their logs. That
is a different scenario, and one which I would not want to see. The performance
penalty of this is quite significant. No longer can a committed participant
"simply" commit its work; it now also needs to update a log to say it has done
so, even though the coordinator knows this by virtue of the CONFIRMED message
that the participant is going to send to the coordinator. That's at least two
disk writes and syncs, compared to only one.
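To make the cost concrete, here is a toy model that simply counts forced
writes under the two policies. It is a sketch under my own assumptions (one
sync to make the confirmed work durable, one more for the extra outcome
record); the names are hypothetical and no real I/O is done:

```python
class Participant:
    """Counts forced (synced) writes performed while confirming."""

    def __init__(self, log_confirm_outcome: bool):
        # True models the proposed rule: record the outcome even when the
        # participant only did what the coordinator asked.
        self.log_confirm_outcome = log_confirm_outcome
        self.forced_writes = 0

    def _force(self) -> None:
        # Stand-in for a disk write followed by a sync.
        self.forced_writes += 1

    def confirm(self) -> str:
        self._force()                  # make the confirmed work durable
        if self.log_confirm_outcome:
            self._force()              # extra log record for the outcome
        return "CONFIRMED"             # reply sent to the coordinator
```

With `log_confirm_outcome=False` a confirm costs one forced write; with
`True` it costs two, for every participant in every atom.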
Firstly participants don't have to qualify their prepared message
with any timeout, so it's quite valid for a participant to decide to never
unilaterally take a decision. In fact, lots of participants could well do the
same thing, and such services may well want to publish this kind of qualifier
in, say, a UDDI service. That way, clients who never want to end up in a
non-ACID situation (ignoring failures for now) can determine who to talk to
beforehand. Now, failures do occur, and it's just possible for one of these
participants to find that despite its best efforts it still can't confirm even
if it wants to, e.g., the disk has failed catastrophically. So, heuristics are
still possible, but we've narrowed this "window of vulnerability"
somewhat.
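A minimal sketch of the rule a participant might apply, assuming the
timeout qualifier is modelled as an optional number of seconds (the
function and parameter names are mine, not BTP's):

```python
from typing import Optional


def may_decide_unilaterally(prepared_at: float,
                            now: float,
                            timeout: Optional[float]) -> bool:
    """Return True if the participant may take its own decision.

    timeout=None models a PREPARED sent without any timeout qualifier:
    the participant never decides on its own and waits for the coordinator.
    """
    if timeout is None:
        return False
    return (now - prepared_at) >= timeout
```

This covers only the deliberate decision. A catastrophic failure can still
force a heuristic outcome whatever the function returns.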
<Original email deleted.>
No one can ever guarantee to do this, no matter how many messages
we use, or rounds of protocol we have. Failures of media, business logic, or
whatever, can still happen and prevent "committed" participants from undoing, or
"cancelled" participants from committing. Let's deal with these situations at
the application level, or charter a new working group to resolve this in a
domain specific manner. The easier we make it for people to use BTP,
the quicker its take-up will be. A bloated protocol isn't the way to
go.
Mark.
----------------------------------------------
Dr. Mark Little
Transactions Architect, HP Arjuna Labs
Email: mark@arjuna.com | mark_little@hp.com
Phone: +44 191 2064538
Fax  : +44 191 2064203