OASIS Mailing List Archives

 



business-transaction message



Subject: RE: Heuristics in BTP atoms


I apologize if this was raised before... Why do we need an extra timeout on the PREPARE?
Why is the timeout for the atom not sufficient? I am not sure that this "I promise to be PREPAREd for the next 5 seconds" is realistic in the world of web services. I assume that most of the web services participating in a BTP transaction will use compensation to back out committed changes, so this extra timeout will be irrelevant.
 
Pal
-----Original Message-----
From: Peter Furniss [mailto:peter.furniss@choreology.com]
Sent: Saturday, September 01, 2001 4:23 AM
To: Sazi Temel; Sazi Temel; business-transaction@lists.oasis-open.org
Subject: RE: Heuristics in BTP atoms

A Participant that makes an autonomous decision is required to have a persistent record of this, and will retain that record until either receiving the matching "right" answer, or receiving a CONTRADICTION message.  At the coordinator, the contradiction is certain to be detected, despite failures. We don't absolutely guarantee that it will be known at the participant, as some failure sequences can lose the messages:  (reformat to fixed-pitch font if your mailer doesn't)
 
When a time-out occurs, the participant decides whether it will wait longer or cancel. Is this considered an autonomous decision? If so, should the participant retain the log records in this case? What about a participant that has already confirmed: should it also retain the records, or does it simply forget the transaction?
 
This is the "promise or threat" timeout question: if the timeout goes off, has the participant said it WILL apply its own decision, or just that it reserves the right to do so? We currently have it that the standard qualifier timeout is a threat, not a promise - it is up to the participant to determine exactly when to make the decision. (However, we do have the default-to-cancel flag as well, which means there is no further exchange if both sides do cancel.) It would be essentially unenforceable to require the Participant to make the decision on the dot.
 
So "autonomous decision" means that the cancel (or confirm) decision is made and applied, not just that it is thinking about it. (At the crunch, the timeout is advisory, warning the superior that it should get its decision there before the timeout goes off, but not making any absolute statements - unlike CANCELLED or CONFIRMED which announce what has happened).
 
Either way, the Participant is required to retain log records until it receives a message from the Superior. Exactly which message depends on the combination of decisions: if the Inferior's decision is in line with the Superior's, it will be the CONFIRM or CANCEL; if contrary, the CONTRADICTION (except in the case of failures like the example, where it is the SUP_STATE/unknown).
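The retention rule just described can be written down as a small lookup. This is only an illustrative sketch, not text from the BTP specification: the function name `releasing_message` and the lower-case decision constants are hypothetical, and only the message names (CONFIRM, CANCEL, CONTRADICTION, SUP_STATE/unknown) come from this thread.

```python
from typing import Optional

# Hypothetical decision labels, introduced for this example only.
CONFIRM, CANCEL = "confirm", "cancel"


def releasing_message(inferior_decision: str,
                      superior_decision: Optional[str]) -> str:
    """Return the message that lets a Participant drop its log record
    after it has made an autonomous decision.

    superior_decision is None when the Superior has already removed its
    own persistent information (the failure case in the trace).
    """
    if inferior_decision not in (CONFIRM, CANCEL):
        raise ValueError(f"unknown inferior decision: {inferior_decision!r}")
    if superior_decision is None:
        # The Superior no longer knows the transaction.
        return "SUP_STATE/unknown"
    if superior_decision not in (CONFIRM, CANCEL):
        raise ValueError(f"unknown superior decision: {superior_decision!r}")
    if superior_decision == inferior_decision:
        # Decisions agree: the ordinary outcome message releases the log.
        return "CONFIRM" if superior_decision == CONFIRM else "CANCEL"
    # Decisions differ: the Superior reports the contradiction.
    return "CONTRADICTION"
```

In the trace, the inferior cancels autonomously while the superior confirms, so the expected release is CONTRADICTION; because that message is lost and the superior has already forgotten, the inferior is released by SUP_STATE/unknown instead.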
 
 
 
trial 1
superior                                              inferior
I1 :B1  <--------------------------ENROL/no-rsp-req <-- i1 :b1
                              decide to be prepared === b1 :e1
B1 :E1  <----------------------------------PREPARED <-- e1 :e1
E1 :F1  === decide to confirm
                      decide to cancel autonomously === e1 :j1
F1 :F1  --> CONFIRM
F1 :K1  <---------------------------------CANCELLED <-- j1 :j1
K1 :R1  === record contradiction
R1 :R2  --> CONTRADICTION
                                       disruption 0 XXX j1 :j1
                CONFIRM---X
                CONTRADICTION---X
R2 :Z   === remove persistent information
Z  :Y1  <---------------------------------CANCELLED <-- j1 :j1
Y1 :Z   --> SUP_STATE/unknown-------------------------> j1 :j2
                      remove persistent information === j2 :z
  Superior confirms, inferior cancelled - contradiction reported (+!:-)
 
The only way to avoid this would be another round of messages before the superior was allowed to remove the persistent information (e.g. a CONTRADICTED message from the inferior).  But you can more easily avoid it in practice by retaining the superior's record of the contradiction for longer (how long is obviously a management decision - but that R2:Z remove persistent information is a "lazy delete", so it can be postponed as long as you like).
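The "lazy delete" idea can be sketched as a superior-side record store that keeps contradiction records for a configurable period. This is a minimal sketch under stated assumptions: the class `SuperiorLog`, its method names, and the `retention_seconds` parameter are hypothetical and not part of BTP; the retention value is a management decision, as noted above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SuperiorLog:
    """Hypothetical superior-side store of contradiction records."""
    retention_seconds: float  # how long to postpone the lazy delete
    _contradictions: Dict[str, float] = field(default_factory=dict)

    def record_contradiction(self, inferior_id: str, now: float) -> None:
        # Corresponds to "record contradiction" (K1 -> R1) in the trace.
        self._contradictions[inferior_id] = now

    def purge(self, now: float) -> None:
        # The lazy delete: drop only records older than the retention period.
        expired = [key for key, recorded_at in self._contradictions.items()
                   if now - recorded_at >= self.retention_seconds]
        for key in expired:
            del self._contradictions[key]

    def reply_to_cancelled(self, inferior_id: str) -> str:
        # A repeated CANCELLED still gets CONTRADICTION while the record
        # survives; once it is gone, only SUP_STATE/unknown is possible.
        if inferior_id in self._contradictions:
            return "CONTRADICTION"
        return "SUP_STATE/unknown"
```

With a long retention period, an inferior that retries CANCELLED after a lost CONTRADICTION is still told about the contradiction; the SUP_STATE/unknown outcome in the trace only arises once the record has been purged.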
 
Ok.
 
Without that exceptional case, the inferior (Participant) *will* know it made the wrong decision, and could reconsider. As you suggest, the coordinator cannot *require* the reconsideration, since again the resource is owned by someone else.  (In fact, if the original autonomous decision was made for good reason, it would seem unlikely that the decision will be reversed - if it could have waited until the right answer was known, why didn't it wait in the first place?)
 
I was thinking of the usual case, where the participant knows that it made the "wrong" decision (actually nothing is wrong; it just followed the protocol and canceled after the time-out or for some other reason), but meanwhile the other participant has already confirmed and perhaps removed its logs of this transaction.
 
"wrong" meaning contrary to the decision of the Superior (or above) - yes, it's ok for the protocol.
 
The superior can't remove its logs until it has the reply back from the inferior.
 
As you pointed out, the coordinator cannot assume that participants will reconsider their decisions. It knows that the transaction is now in a contradictory state (one confirmed, one canceled), and it has already informed the participant that sent the cancel by sending a CONTRADICTION message. I think it should also send the same message to the participant that confirmed, so that it can take the necessary actions (undo, compensate, etc.), assuming that the participant retained the logs for this particular transaction.
 
That rather goes against the general assumption that Service/Participants are independent, and linked only at the instigation of the client. If the hotel cancelled and the airline confirmed, I have a problem, but neither of them cares. I'm going to have to do something new at application (or management) level.  Obviously there will be scenarios where it can be useful to tell all the parties that someone made a contrary decision, and they could respond variously, but I'm not sure the circumstances will be sufficiently regular to carry in BTP messages.
 
It looks like a BTP (atomic) transaction requires a final-outcome message to be sent to the participants, whether the transaction committed or not, since we do not want the participants to wait for a while and then assume everything went ok.
 
For the failure case above, the contradiction message is already sent to the failing participant. Does the protocol include such a message to be sent to the confirmed participant too? It looks like that is the one that needs such a message. Since participants cannot assume that all is ok just because they did not get a contradiction message from the coordinator within a certain time period, the final-outcome message should be sent in both the failure and the successful cases.
 
As it is at present, the contradiction is a bilateral matter between the superior (coordinator) and inferior (participant) - and that is the only relationship the participant is aware of (plus any lower relationships it may have if it is a subcoordinator) - the Participant is unaware of its siblings. It seriously changes the implicit contract if we make the participant aware of the siblings - and in a direction away from the inter-organisational target of BTP, back towards classic transaction systems (where the whole lot is "owned" by one entity, linked in a transaction for their mutual benefit).
 
 
As it says in the spec, the persisting of the "decide to cancel autonomously" might not actually require a disk write. The participant had to write the "decide to be prepared" record, and if this contains the timeout information, then the mere presence of an expired record means the autonomous decision must have been made. (the implementation has to "know" that it would have removed the record if it had confirmed).
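The "no extra disk write" point can be illustrated with a sketch that infers the participant's state purely from the prepared record. The record layout and names below (`PreparedRecord`, `timeout_at`, `inferred_state`) are hypothetical, and the sketch assumes an implementation that applies its autonomous cancel exactly when the recorded timeout expires and that removes the record when it confirms; BTP itself leaves the timing of the decision to the participant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreparedRecord:
    """Hypothetical "decide to be prepared" log record."""
    transaction_id: str
    timeout_at: float  # absolute expiry time, written with the record


def inferred_state(record: Optional[PreparedRecord], now: float) -> str:
    """Infer the participant's state from the log alone.

    Assumes the implementation removes the record on confirm, so a
    record that is still present but expired can only mean that the
    autonomous cancel was applied.
    """
    if record is None:
        # Nothing logged: never prepared, or already confirmed and removed.
        return "no-record"
    if now >= record.timeout_at:
        # Expired record still present: autonomous decision was made.
        return "cancelled-autonomously"
    return "prepared"
```

The design choice is that the timeout written at prepare time doubles as the persistent evidence of the later decision, which is why no second write is needed under these assumptions.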
 
What about the confirmed participant? Does it also retain the logs, and for how long?
 
It is required to retain logs until just before it sends back a reply to the superior (CONFIRMED/response in the tables) - it must not have an unmodified prepared log when it sends that. (However, that is a logical log-removal, not necessarily a real one. The critical point is that, if the participant were to crash and recover, any log record would NOT cause the participant to query the superior and, if receiving a SUP_STATE/unknown, then treat that as a cancel. Depending on how the underlying resources work, it is possible this doesn't need any real changes to the log records as such.)
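The reason the prepared log must be logically removed before CONFIRMED is sent shows up in a sketch of crash recovery. The function `recover` and its parameters are hypothetical names for illustration; the only behaviour taken from this message is that an unmodified prepared record makes the participant query the superior and treat SUP_STATE/unknown as a cancel.

```python
from typing import Callable


def recover(prepared_log_present: bool,
            query_superior: Callable[[], str]) -> str:
    """Decide what a recovering participant does with a transaction.

    query_superior is a hypothetical callback that returns the
    superior's answer: "CONFIRM", "CANCEL" or "SUP_STATE/unknown".
    """
    if not prepared_log_present:
        # Record was logically removed (e.g. before CONFIRMED was sent):
        # recovery must not reopen the transaction.
        return "nothing-to-do"
    answer = query_superior()
    if answer == "CONFIRM":
        return "confirm"
    if answer in ("CANCEL", "SUP_STATE/unknown"):
        # An unknown transaction at the superior is treated as cancel.
        return "cancel"
    raise ValueError(f"unexpected answer from superior: {answer!r}")
```

If a participant that had already confirmed still carried an unmodified prepared record, and the superior had meanwhile removed its information, this recovery path would wrongly cancel confirmed work; hence the requirement on the log before the reply is sent.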
 
I know that adding a final-outcome message has a lot of implications for the protocol... I will read the new spec that Alastair is sending before going into details... and perhaps some of the issues are already covered...
 
Peter
 
Have a nice weekend. 
 
And you 



