Subject: Re: 2001-06-01_OASIS_BTP_abstractmessages_DRAFT_1
> Yes, there is such a risk. If a one-shot reply carries
> enroll (no-response) + app response + vote/ready/confirm/10
> but that gets lost (perhaps the initiator and coordinator crashed) and the
> timer goes off, the participant will confirm on its own.

It's not dependent on a one-shot. If the participant sends an out-of-band VOTE and that is lost (or not acknowledged) then these issues still arise. If I want to optimise recovery so that it is driven from both the coordinator and the participant, it would mean that the coordinator has to record information prior to prepare, such that if it fails and recovers, it knows who to contact.

> It will still
> attempt to run the recovery sequence (I'm doing the state table details
> there at the moment) and detect there is a problem, even if the
> initiator/coordinator have no surviving information.

I think this comes down to what we began to discuss at JavaOne: what kind of boxcarring is legal. Also, where does recovery get driven from? We've been thinking quite a bit about this and I'll have the promised text on redirection/recovery for Monday.

> This is only a risk where there are autonomous confirms (and in this case
> with spontaneous ready).

Autonomous and unacknowledged.

> I don't think there is any general protocol cure possible - by avoiding
> various features of the protocol, you can ensure it can't happen in a
> particular case (if location of participants is pre-known, spontaneous votes
> are not used, autonomous confirm is not allowed etc.), but I think the above
> is inevitable for the total combination.

If we have an ack message then participants can work in a safer mode IMO. No ack and a subsequent cancel means no heuristic. Ack and cancel/confirm means the participant must stay around to find out what happened (and it knows the coordinator has recorded sufficient information to eventually get back in touch somehow).
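
To make the ack rule above concrete, here is a minimal sketch of the participant side. It is illustrative only: `ParticipantState` and its method names are hypothetical and do not come from the BTP draft, and the only behaviour it encodes is the two cases stated above (no ack plus cancel means no heuristic; ack means the participant waits for the outcome).

```python
from dataclasses import dataclass


@dataclass
class ParticipantState:
    """Hypothetical participant-side view of a VOTE and its ack."""
    vote_sent: bool = False
    vote_acked: bool = False

    def send_vote(self) -> None:
        self.vote_sent = True

    def receive_vote_ack(self) -> None:
        # An ack only makes sense once a VOTE has actually gone out.
        if not self.vote_sent:
            raise RuntimeError("ack received before any VOTE was sent")
        self.vote_acked = True

    def cancel_is_heuristic_free(self) -> bool:
        # No ack and a subsequent cancel means no heuristic.
        return not self.vote_acked

    def must_await_outcome(self) -> bool:
        # Ack seen: the coordinator has recorded enough to get back in
        # touch, so the participant stays around to learn the outcome.
        return self.vote_acked
```

A participant that has sent a VOTE but seen no ack reports `cancel_is_heuristic_free()` as true; once `receive_vote_ack()` has been called, `must_await_outcome()` becomes true instead.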
> Or at least the cure will be worse
> than the disease (essentially not allowing certain setups to use BTP because
> they can't meet the demands for business reasons)

I disagree. I don't think adding an ack for VOTE is such a major modification to the current protocol. In some messaging implementations it may well be implicit anyway.

> > So,
> > the coordinator must make persistent all information about participants
> > before it even enters the prepare phase so that it can contact them in the
> > event of failures and find out what they did. I certainly wouldn't want to
> > not know that an autonomous decision was made contrary to my preferred
> > outcome.
>
> with spontaneous VOTEs, you would have to make the information persistent
> before sending the context out - but in general you don't know the
> participants then anyway. Hmm - perhaps that won't matter - if a
> coordinator (manager) receives an incoming INFERIOR_STATUS/confirmed for an
> atom that cancelled or is unknown, then it knows there was a contradiction.
> It may not know what the transaction was about (or who the participant is),
> but it knows something has gone wrong.

Agreed. It's a matter of who has the most responsibility for determining there has been a problem, and what has to happen in the case of recovery. A valid implementation of recovery may want to drive everything from the coordinator; in which case, with acks, it must record information about all participants prior to prepare, such that it can tell them to roll back. Another implementation may want to drive recovery entirely from the participant; in which case (this is irrespective of the presence of VOTE acks) a rolled-back participant which did prepare first must record information about the coordinator so that it can call back to find out what happened to the transaction. I'm not against either of these implementations (and obviously a third is possible that combines the two).
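
The two recovery styles just described differ mainly in who has to log what, and when. A rough sketch of that difference follows, assuming simple string addresses; `CoordinatorLog` and `ParticipantLog` are hypothetical names for illustration, not structures defined by BTP.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CoordinatorLog:
    """Coordinator-driven recovery: participants are logged before prepare."""
    participants: Set[str] = field(default_factory=set)

    def record_before_prepare(self, addresses: List[str]) -> None:
        self.participants.update(addresses)

    def recover(self) -> List[str]:
        # After a crash, these are the participants to tell to roll back.
        return sorted(self.participants)


@dataclass
class ParticipantLog:
    """Participant-driven recovery: a prepared participant logs its coordinator."""
    coordinator: Optional[str] = None

    def record_on_prepare(self, coordinator_address: str) -> None:
        self.coordinator = coordinator_address

    def recover(self) -> Optional[str]:
        # Who to call back to ask what happened; None if it never prepared.
        return self.coordinator
```

A combined implementation would simply keep both logs, one on each side.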
However, I'd like to see the possibility for the following implementation: if a coordinator hasn't heard from a participant before prepare, it can roll back (cancel) without having to record information about it, safe in the knowledge that it won't cause any heuristics. If the coordinator got a VOTE and acked it, then that participant becomes part of the recovery case.

> I foresee some white board diagram next week :-)

Let's hope the diagrams can also make it into the document :-)!

Mark.

----------------------------------------------
Dr. Mark Little (mark@arjuna.com)
Transactions Architect, HP Arjuna Labs
Phone +44 191 2064538
Fax +44 191 2064203