Subject: Re: Your proposal
> I think you are assuming a wider scope for what I am suggesting. What I am
> suggesting is not trying to resolve problems of entire application logic or
> workflow etc. It simply is a help for participants of the atoms to recover
> independently, it is not a recovery of the workflow, business level
> agreements and application.

My point is that they did not participate in the application independently, so why should they be able to compensate independently? Try looking at the application as a whole, and not at individual components of it, please!

If BEA wants to do this as an added-value feature, then they are free to do so. The requirement for this is not proven, and from the general availability of workflow systems I can probably wheel out enough counter-arguments. If you (BEA) do this and there is a mass take-up of this idea then we should obviously revisit it. However, as I have tried to point out time and time again, this will impose a significant requirement on users and implementers of BTP. We should not be adding it simply on a whim.

Now, if what you want is simply to *inform* participants that the transaction has terminated, and not specify what they can do based on this, then that's a different matter. In the OTS specification, for example, there are equivalents of BTP participants (called Resources), and special participants called Synchronizations (obviously the same end-point can play both roles if it wants). These Synchronizations are informed before the transaction starts to complete (before_completion), and can have an effect on that completion by forcing it to roll back. They are then informed when the transaction completes (after_completion), and can have *no* effect on the transaction outcome. In addition, information about them is not maintained persistently by the coordinator, so it does not need to write them to its intentions list, and if it fails, they don't get informed.
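
As a rough illustration of the Resource/Synchronization distinction described above, here is a minimal Python sketch. It is not the OTS or JTA API: the class names, the boolean return from before_completion (standing in for "force rollback"), and the in-memory intentions_log are all assumptions made for the example.

```python
class Resource:
    """Hypothetical durable participant: votes in phase 1, is logged by the coordinator."""
    def __init__(self, name, vote_commit=True):
        self.name = name
        self.vote_commit = vote_commit
        self.outcome = None

    def prepare(self):
        return self.vote_commit

    def commit(self):
        self.outcome = "committed"

    def rollback(self):
        self.outcome = "rolled_back"


class Synchronization:
    """Hypothetical synchronization: informed before and after completion, never logged."""
    def __init__(self, veto=False):
        self.veto = veto
        self.seen = []

    def before_completion(self):
        # Assumption: returning False models forcing the transaction to roll back.
        self.seen.append("before_completion")
        return not self.veto

    def after_completion(self, outcome):
        # Informational only: nothing done here can change the outcome.
        self.seen.append(("after_completion", outcome))


class Coordinator:
    def __init__(self):
        self.resources = []
        self.synchronizations = []
        self.intentions_log = []  # stands in for the persistent intentions list

    def complete(self):
        ok = True
        for s in self.synchronizations:
            if not s.before_completion():
                ok = False
        if ok:
            ok = all(r.prepare() for r in self.resources)
        if ok:
            # Only Resources are recorded; Synchronizations are not.
            self.intentions_log = [r.name for r in self.resources]
        for r in self.resources:
            if ok:
                r.commit()
            else:
                r.rollback()
        outcome = "committed" if ok else "rolled_back"
        for s in self.synchronizations:
            s.after_completion(outcome)
        return outcome
```

Note that after_completion receives the outcome but returns nothing the coordinator looks at, which is the point being made: it can be told, but it cannot influence.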
The OTS specification (and JTA, which took them on board as well) does not say what the Synchronization can do when it receives an after_completion invocation, and it should not. However, if I really wanted to I suppose I could compensate in that. It would break the transaction model and the entire application semantics, but hey, if I'm so sure the user wants that then why not? Saves having to write/OEM a workflow system.

> The **fact** is that there will be some situations where some participants
> of **atoms** are confirmed some canceled.

**Agreed**

> When this happens the coordinator
> already marked this atom 'not confirming'

And this is where we may start to diverge in what we think is going on. If PREPARE has been sent to all participants and some have CONFIRMED and then some CANCEL, the atom is not marked as "not confirming". It will CONFIRM and a CONTRADICTION will have occurred. It's the same as in a transaction system where some participants raise heuristics: the transaction may well still commit and knowledge about the heuristic may be propagated back to the terminator.

If a participant CANCELs during PREPARE, then the other participants will be told to CANCEL too. Obviously if some of them independently CONFIRMED then they will have caused the CONTRADICTION, and not the ones that CANCELED.

> thus the business process or
> workflow or simply the client knows the fact, it is not going to assume
> opposite, it knows that A*azon is not going to send the book!

No, it does not. See above. Please clarify at what stage during the 2PC you assume the CANCEL happens. Is it synchronous or asynchronous (i.e., independent)?

> so that it can
> choose another atom, e.g another book store - but by letting know A*azon
> that the transaction is failed we will be helping them not to send the book
> because you are going to p*ssed off when you get a bill for the book that
> you did not order - you already chosen another atom, another bookstore!
Hmmm, sounds like a good place for a workflow system then. Please take a look at some of these - there are quite a few in the market, and you will see how applicable they are.

> Are
> you suggesting that even if some participants of atom canceled coordinator
> should confirm the atom?, if so how many of cancel will be enough to let
> coordinator understand that the atom actually does exist anymore?

I think you have a misunderstanding of the way in which the atom works. Let's take the simplest case first:

(i) During phase 1 the coordinator sends PREPARE to all participants. Now suppose they all say PREPARED and none will act independently. Now the coordinator sends CONFIRM and if they all CONFIRM then everything's great!

Now suppose that for one reason or another a participant could PREPARE, but couldn't CONFIRM at the end (e.g., the hard disk crashed). If this is the first participant the coordinator sends CONFIRM to, the coordinator will be able to change its decision to CANCEL, and send CANCEL to all other participants (let's assume also that they do as they are told). In this case, the atom as a whole CANCEL-ed.

Now, suppose that this rogue participant is some way down the coordinator's intentions list. The coordinator has sent CONFIRM to some participants and they have CONFIRMED and "gone away", so it can't undo this. And in fact out of N participants it may only be 1 that cannot CONFIRM after PREPARE and this may be number N/2. So rather than CANCEL all other participants after we reach number N/2, what TP systems typically do is continue on with the coordinator's decision to CONFIRM the others and remember (durably) the failed participant(s). The transaction (atom) has still CONFIRM-ed though.

What we do with the failures will depend upon the type of failure. For example, if it was a transient failure such as a comms failure, then the coordinator may periodically try to CONFIRM the participant.
If it was a definitive answer from the participant that it could not, and can never, CONFIRM, then this is a heuristic, and it is reported to the terminator application to deal with.

Now let's take the slightly more complicated scenario of independent confirms:

(ii) The coordinator sends PREPARE to all participants. No participant can independently CONFIRM until it has received a PREPARE, but it can CANCEL prior to PREPARE, in which case the atom must CANCEL too. If some PREPARE-d participants can't CANCEL as a result then they have caused a heuristic, and see above.

When the coordinator sends CONFIRM, say, to the participants, if some of them say they have already CANCEL-ed (because, for example, the coordinator was too slow in making the final decision) then it's pretty much as (i), i.e., depending upon where the participant is in the intentions list we will either have an entirely CANCEL-ed atom, or one which is CONFIRM-ed and has possibly got a heuristic.

Can you just confirm that you agree with all of this?

> I am **assuming** that similar situation will occur enough that requires
> some thinking to find a solution to **reduce** the inconsistencies that may
> occur.

Yes, and that thinking has been done by various workflow and process flow people over the years.

> This proposal, specially will **help** to the participant that is
> confirmed while the **atom** failed.

But as I keep saying, it is not up to the participant to independently decide that it can compensate itself when it told the *application coordinator* that it had confirmed. As far as the coordinator is concerned, the participant has confirmed and will never un-confirm. If this is not the case, then we need to run another completion protocol (more phases!) between the coordinator and these participants so that the coordinator can return a *definitive* answer to the invoker about what happened. Or is this not important in your scenarios?
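
To make scenarios (i) and (ii) above concrete, here is a minimal Python sketch of the coordinator's behaviour. It is only an illustration under stated assumptions, not BTP's message set or any product's implementation: Participant, run_atom, the can_prepare/can_confirm flags, and the returned list of contradicting participants are hypothetical names, and durable logging and retry of transient failures are left out.

```python
class Participant:
    """Hypothetical participant whose behaviour is fixed by two flags."""
    def __init__(self, name, can_prepare=True, can_confirm=True):
        self.name = name
        self.can_prepare = can_prepare
        self.can_confirm = can_confirm
        self.state = "active"

    def prepare(self):
        self.state = "prepared" if self.can_prepare else "canceled"
        return self.can_prepare

    def confirm(self):
        # A participant that prepared but cannot confirm (crashed disk,
        # or it already canceled independently) reports failure.
        if self.can_confirm:
            self.state = "confirmed"
            return True
        self.state = "canceled"
        return False

    def cancel(self):
        self.state = "canceled"


def run_atom(participants):
    """Return (atom outcome, names of participants that contradicted it)."""
    # Phase 1: any CANCEL here cancels the whole atom.
    for p in participants:
        if not p.prepare():
            for q in participants:
                q.cancel()
            return "CANCELED", []

    # Phase 2: walk the intentions list in order.
    confirmed_any = False
    contradictions = []
    for p in participants:
        if p.confirm():
            confirmed_any = True
        elif not confirmed_any:
            # Nobody has confirmed yet, so the decision can still change.
            for q in participants:
                q.cancel()
            return "CANCELED", []
        else:
            # Too late to undo earlier confirms: carry on and remember
            # the failure so it can be reported to the terminator.
            contradictions.append(p.name)
    return "CONFIRMED", contradictions
```

With three participants where only the second cannot confirm, run_atom returns "CONFIRMED" together with that participant's name: the atom has still confirmed, and the failure is reported rather than silently compensated.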
I know our customers would find it really useful to know the final outcome.

> BTP does help the canceled participant
> by sending CONTRADICTION message already - but does not require any actions
> to eliminate the contradiction.

No, and that is the right thing for it to do because it will be highly application/resource dependent on how to resolve this. Take a look at TP systems.

> It is also clear that this problem may be
> attempt to be resolved by asking to the canceled participant to
> re-think/revise its decision of canceling, because there are others already
> confirmed which I think this is what Keith Weir was suggesting. I think this
> second way of resolving the contradiction is valid but more cumbersome than
> just letting the confirmed participants know the results of the atom and
> take necessary actions whatever it might be (the atom is already canceled).

That's a different situation, but one which we could consider in a revision task force.

> The best solution would be to require all the confirmed participants keep
> the log until the 'complete' message arrived. This way it will be a complete
> recovery for the atom and how the individual participant recover is not
> concern of atom coordinator. But an optional qualifier in the CONFIRMED
> message may do the job - I am assuming no participants wants to be doing
> inconsistent work thus they all will set such qualifier (note that if such
> qualifier exist coordinator should honor the request).

I disagree, but then that shouldn't come as a surprise ;-) It is too late to add such a significant modification IMO. Let's get this specification adopted now, and people can then use it. That's the only way we can resolve many of the outstanding "niggles" that people have: prove them through use cases.

> Shortly,
> 1) It is a fact that this situation will occur (I feel it will
> happen more than you think),

As I said in an earlier email, this is all conjecture at the moment.
> 2) It is clear to me that it is not an attempt to alter business
> logic, it is a generic attempt to beware the consistency at atom level. We
> have relaxed/reduced 'I' and 'D' of ACID but should keep 'C' as much as
> possible - after all the protocol is to create some degree of consistency!

No, it is *exactly* an attempt to alter business logic. You tell me how the independent compensation of a participant, without recourse to what the driver of the application wants, is anything other than such an alteration?!

> 3) There are other alternatives that address the same issue (let the
> contradicted participant revise its decision), but I think are more
> cumbersome, and at the end it may still need to let confirmed participant
> involve..

The best "attempt" is to use workflow layered on BTP.

> 4) There are some performance penalty, I am not sure how much, needs
> to be clarified,
> 5) The best solution would be as suggested
> originally - all the participants keep the log around until a final_outcome
> message received (not necessarily lock or not to do the job, they may
> already have done the work),

"best" is definitely subjective.

> 6) The suggestion from Alastair and Peter (with minor modifications
> for coordinator honoring every request for a final_outcome) will satisfy the
> need since I am assuming all participants will be interested in requesting
> the final outcome message!
>
> Looks like I am repeating (like a broken record!) the same explanations...
> hope I have been able to answer some of your questions and concerns.

Not quite!

Mark.

----------------------------------------------
Dr. Mark Little
Transactions Architect, HP Arjuna Labs
Email: mark@arjuna.com | mark_little@hp.com
Phone: +44 191 2064538
Fax  : +44 191 2064203