
business-transaction message



Subject: Re: Heuristics in BTP atoms


 
It is difficult to argue against this. I guess the question is then whether what I suggest is really a significant modification or not. I believe what I suggest will improve the protocol, but there may be many valid reasons not to include it in the protocol at this time.
If we are talking about adding another message to BTP then this is an imposition on the coordinator and the participant. I'd call that significant. IMO this is more a management level issue: if I want my service/participant to be informed of these kinds of issues then I can program accordingly, since the contradiction will be known (eventually) to the user of the coordinator.
I think the reason that I suggest this modification is that I 'feel' that the failure case might exceed 20%...
I agree that this is all very subjective, and no one has any real figures for Web Services. However, in our experience of using long-running transactions and workflows, this kind of problem does not happen often. Whether we can transpose this experience to Web Services is certainly an open issue that can only be solved by time.
and I think with the suggested modification we may reduce the failure situations.
It won't help the failures, because unless I have written my participant to expect this message, it will already have committed, and possibly gone away or dispatched the books. Not every book shop, for example, will be able to afford to employ a guy on a motorbike who can shoot off and catch the book-dispatch-truck and call it back when they're told that the insurance failed.
 
As a slight aside, you could always order the invocations on participants to reduce the chances of this happening.
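To make the ordering idea concrete, here is a minimal sketch of one possible reading of it: send CONFIRM first to the participants whose PREPARED timeout qualifier expires soonest, so they are less likely to time out and cancel while the others are being confirmed. The class `PreparedParticipant`, the field `prepared_timeout` and the function `confirm_order` are hypothetical names for illustration only; nothing here is defined by BTP, and other ordering criteria may suit an application better.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PreparedParticipant:
    name: str
    # Assumption: seconds the participant promised to stay prepared.
    # None means the PREPARED message carried no timeout qualifier.
    prepared_timeout: Optional[float]


def confirm_order(participants: List[PreparedParticipant]) -> List[PreparedParticipant]:
    """Return participants soonest-expiring first; unqualified ones go last."""
    return sorted(
        participants,
        key=lambda p: (p.prepared_timeout is None, p.prepared_timeout or 0.0),
    )


participants = [
    PreparedParticipant("bookshop", None),
    PreparedParticipant("insurance", 30.0),
    PreparedParticipant("courier", 120.0),
]
ordered_names = [p.name for p in confirm_order(participants)]
```

This only narrows the window in which a timeout can race with CONFIRM; it does not remove it.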
 
Again, at this stage both your suggestions and mine are best guesses based on our experiences and assumptions.
True, but I am also stating that in general this would not solve the problem anyway, no matter how often it occurs.
I think my proposal boils down to sending a final message to a participant of an atomic transaction that has already confirmed but is unaware that the transaction has not reached a final outcome because some other participant cancelled (whether because of a PREPARED time-out or for other reasons).
So is this a message that has to be sent? If so, the coordinator now needs to keep persistent the information about all of the participants after the end of the second phase.
 
Why should all confirmed participants receive this message? Are we assuming they will all want to know? This is not the same as sending PREPARE and CONFIRM to them, since we know they all *need* to see these messages. This new message is more informational, since it cannot be guaranteed to have an effect on the outcome of the BTP, unless we say to programmers that they should not confirm on CONFIRM. So, if it's informational, I don't want my coordinator to *have* to send it, so it doesn't need to make this information persistent. I also don't want to have to send it to all participants, and I think programmers of participants may well object to having to deal with it. In which case, it's more optional than anything, and should be dealt with at a higher level than BTP. Management interfaces.
No, I do not suggest at all that the participant should hold off doing the work until the confirm is received. I suggest that the participant keeps the log around until the confirm is received (just as the cancelled participant waits for a contradiction message from the coordinator), so that it can compensate, or recover in any other way it sees appropriate, when it receives a 'not completed' message. Otherwise it receives 'completed' and can remove the log.
It's different for the cancelled participant to have to wait: it caused the problem in the first place. To impose this burden on confirmed participants (and *all* confirmed participants at that) means that the coordinator changes, the participants change, and the service has to change. This is much more a workflow issue IMO. We should be looking at trying to persuade IBM, MSFT, and even HP to layer their workflow languages on to BTP.
I think it is not difficult to guess that a participant that confirmed (when the transaction failed) will experience more trouble. Again, your assumption is that the participant will wait to do the work until a 'completed' message is received; I suggest it will do the work but keep the log until it receives a 'completed' message.
OK, but why can this not be dealt with at the application level? A compensation operation may not be appropriate at all, or may well involve having to talk to all of the participants in a coordinated (no pun intended) operation. One participant does not have sufficient information to do a compensation on behalf of the *entire* cohesion.
 
Compensation is definitely a collaboration between the user and the participants. I'm not saying that what you propose is invalid, only that it should not be dealt with at the BTP level.
Again, you are assuming that the participant will block and not do the work until it receives a 'completed' message. All I suggest is to let the participant do the work and keep the log (of what has been done) until it receives 'completed', so that in case of failure it may 'have a chance' to try to recover, perhaps by compensation, perhaps by other means.
Fine, but what you are saying is that BTP should take care of this, and that a single participant knows how to compensate for itself. Compensation for a single participant may well not be sufficient to drive compensation for the entire cohesion. This is why workflow systems work the way they do. Let's not make BTP do everything, but let's try to leverage other Web Service "standards".
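For readers following the exchange, the proposal being debated can be sketched roughly as below. This is only an illustration of the behaviour described in the quoted text (do the work on CONFIRM, keep a log, drop the log on 'completed', attempt local recovery on 'not completed'). The class `LoggingParticipant`, the `State` names and the method names are hypothetical; they are not BTP messages or any implementation's API.

```python
from enum import Enum
from typing import List


class State(Enum):
    PREPARED = "prepared"
    CONFIRMED_LOG_KEPT = "confirmed, log kept"
    DONE = "done, log removed"
    RECOVERING = "recovering"


class LoggingParticipant:
    """Hypothetical participant following the proposed extra phase."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.PREPARED
        self.log: List[str] = []  # stands in for a persistent log

    def on_confirm(self, work: str) -> None:
        # The work is done immediately: no blocking, no locking.
        self.log.append(work)
        self.state = State.CONFIRMED_LOG_KEPT

    def on_completed(self) -> None:
        # Every participant confirmed, so the log can go.
        self.log.clear()
        self.state = State.DONE

    def on_not_completed(self) -> List[str]:
        # Some other participant cancelled: hand back the logged work,
        # newest first, as the input to whatever local recovery is chosen.
        self.state = State.RECOVERING
        return list(reversed(self.log))


shop = LoggingParticipant("bookshop")
shop.on_confirm("dispatch book")
to_recover = shop.on_not_completed()
```

Note that `on_not_completed` can only return what this one participant logged, which is the objection above: that may not be enough to compensate for the entire cohesion.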
I agree that we should not make BTP complicated, but what is suggested does not make it so.
Obviously I disagree because of the impact on the coordinator, participants and services, for something which IMO isn't even a 20% case.
Tell me, what will a participant do when it receives a 'contradiction'? The participant received the contradiction because it perhaps timed out on prepare, or perhaps knowingly issued cancel (remember it is allowed to time out, and to send cancel for whatever reason before receiving a confirm). There is no assumption in BTP about what a contradicted participant will do. BTP does not assume the participant will revise its own decision; besides, it most probably sent cancelled because it timed out. Are we not allowing time-outs?
How you implement what to do on receipt of a CONTRADICTION message is up to the participant. However, I would advise that you do roughly the same as a transactional resource does for a heuristic: when a participant makes a decision that could be counter to the actual outcome it should record this decision. When the coordinator has determined that the outcome is different, it sends a CONTRADICTION message to the participant (similar to sending an OTS forget, for example). A "heuristic" outcome means that a participant has done something that means the resultant "transaction" isn't atomic, and possibly cannot be undone automatically: participants don't know the semantics of the work that they have just "committed" or "undone" - it's like saying that an OTS Resource that wraps X/Open knows about the SQL statements that were fired at the connection. As a result, I'd hope that a participant that generates a heuristic then reports this fact to some administrator. Then it's down to that individual to sort out, possibly looking back through logs.
 
But the participants that did the right thing by the coordinator don't need to do anything. If the entire activity needs compensation then that's a higher-level issue. (Sorry to sound like a broken record!)
No comments on the BEA and HP stuff... BTP will still be fine without my suggestion included; we are all trying to make BTP better, whether or not we agree on the details.
My point is that we should not be telling people to ignore heuristics. If that is the case then why have the message in the first place?
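As a rough illustration of the advice above (record any decision that could run counter to the real outcome, then report to an administrator when CONTRADICTION arrives), a participant might look something like this. It is a minimal sketch: `HeuristicAwareParticipant`, `decide_autonomously`, `on_contradiction` and the `notify_admin` callback are all hypothetical names, and the in-memory list merely stands in for a persistent log.

```python
from typing import Callable, List, Tuple


class HeuristicAwareParticipant:
    """Hypothetical participant that records autonomous decisions."""

    def __init__(self, notify_admin: Callable[[str], None]) -> None:
        self.notify_admin = notify_admin
        # Stand-in for durable storage of (transaction id, decision) pairs.
        self.decision_log: List[Tuple[str, str]] = []

    def decide_autonomously(self, tx_id: str, decision: str) -> None:
        # Record the decision before acting on it, since it may turn out
        # to differ from the coordinator's outcome.
        self.decision_log.append((tx_id, decision))

    def on_contradiction(self, tx_id: str) -> List[str]:
        # The participant does not know the semantics of the work, so it
        # does not try to undo anything; it only reports what it recorded.
        decisions = [d for t, d in self.decision_log if t == tx_id]
        self.notify_admin(
            f"transaction {tx_id}: autonomous decision(s) {decisions} "
            "contradicted the coordinator's outcome"
        )
        return decisions


alerts: List[str] = []
participant = HeuristicAwareParticipant(alerts.append)
participant.decide_autonomously("tx-1", "cancel")
reported = participant.on_contradiction("tx-1")
```

Sorting out the consequences is deliberately left to whoever receives the alert, possibly working back through the logs.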
 
Since I do not suggest that the participant wait to do the work until it receives 'completed' (although perhaps I used the term 3PC in my previous e-mail), but only that it keep logs to use in case of failure, there is no locking involved, so it is not a 3PC. You may call it 2.5PC!
OK, if you are not saying that CONFIRM is a pre-confirm then there need not be any resource blocking. However, there is now the need to maintain logs at the participant, and for the coordinator to run 3 phases.
Ok, this is a good point. At least there is a way (although I think it is more complicated than what I suggest) to initiate a recovery and hopefully help the other participants to recover... but then participants are not aware of each other (only the coordinator knows all the participants!)... so at best this recovery is very cumbersome.
Agreed, which is why to some extent workflow can help.
You may find a better wording for this; in fact I suggest other words instead of contradiction in this situation. So I agree, the participant did **exactly** what was asked. But still, when the transaction has failed, the participant that did exactly what was asked is in worse shape than the participant that did not do what was asked!
Which is why a workflow could then be used to fire off another compensation cohesion. BTP should be usable in a recursive manner like this.
I disagree with you that what I suggested is an ACID transaction protocol at all.
It would be if you were saying that CONFIRM does not mean "do the work in a non-undo-able manner".
Ok, another good point. I agree that there may be some performance penalty (in the end it is another message sent, and the participant keeping the log).
It's a performance penalty on all participants because obviously a CONFIRM-ed participant doesn't know if a CONTRADICTION will ever happen. So, even if a CONTRADICTION does not happen, it must receive this third phase message (and hence the coordinator must send it). Unless we say that "if you don't receive a message by time X, you can assume a CONTRADICTION didn't happen", but then that's very shaky ground!
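The arithmetic behind this can be sketched as follows. It is a deliberately simplified count that assumes one coordinator-to-participant message per phase and ignores replies, retries and the extra persistence the coordinator would need; the function `messages_to_confirmed` is a hypothetical name for illustration.

```python
def messages_to_confirmed(n_confirmed: int, third_phase: bool) -> int:
    """Count coordinator-to-participant messages sent to confirmed participants."""
    per_participant = 2  # PREPARE and CONFIRM
    if third_phase:
        # The proposed 'completed' / 'not completed' message. It is sent
        # whether or not a contradiction occurred, because a confirmed
        # participant cannot tell in advance which case it is in.
        per_participant += 1
    return per_participant * n_confirmed


two_phase = messages_to_confirmed(10, third_phase=False)
with_third = messages_to_confirmed(10, third_phase=True)
```

The function takes no "contradiction occurred" argument on purpose: under these assumptions the cost is the same in the failure-free case.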

Firstly participants don't have to qualify their prepared message with any timeout,

They do not have to, but it is in the protocol and thus it may be used.
Yes, but if there is a qualifier then it's a deployment/usage situation and it's up to the participant and user to determine if that is the right thing to do. If I as a user don't want that, then I should look for participants that advertise such.

This is not a domain-specific issue. Also, as I mentioned earlier, in the case of VT, the coordinator cannot really tell the terminator which participant 'contradicted' and which one 'committed'.
It is application specific.

Agreed with its meaning... respectfully disagreed with what might be implied. I do not think my suggestion makes BTP less useful. I agree that it will have some performance penalty, but I think it makes the protocol close to 90% perfectly functional instead of 80% ;)
I didn't say it made BTP less useful. More bloated, perhaps. There are applications that could well use what you suggest. However, it runs the risk of discouraging others from using BTP. I think the trade-off is not worth the risk at this stage. If we find that there is a significant use-case for this extension *after* BTP is in use, then we can re-examine the protocol. However, I suspect there will not be, because WSFL or XLANG, or whatever, may be layered on BTP to accomplish the same result with no impact on participants who don't want this.
 
Mark.
 
----------------------------------------------
Dr. Mark Little
Transactions Architect, HP Arjuna Labs
Email: mark@arjuna.com | mark_little@hp.com
Phone: +44 191 2064538
Fax  : +44 191 2064203
 
 

