Subject: Re: [ws-tx] Issue 052 - WS-AT: Replay message generates protocolerrors
Alastair Green wrote:
> Hi Mark,
>
> Just one point:
>
> Mark Little wrote:
>> Since it crashed in Prepared Success state we should be able to
>> assume that the participant obeyed the rules and made its decision to
>> be able to commit durable. Hence, this Replay message should be
>> interpreted as a), though the semantic of "have recovered" shouldn't
>> exclude the fact that the failure may have been in the network and
>> not the participant service itself (for instance).
>>
> One might think so, but in fact when the Participant experiences a
> comms time out it Resends Prepared (PV state table).
>
> Which begs the question: if that works for comm failures, why do we do
> something different for process failures which are recovered?

Different type of failure. I interpret the resend of Prepared on comms
failure to apply in the case where the sender (the participant) knows that
the original Prepared wasn't delivered. My original statement above
about comms failures refers to a different case: there was a network
partition after Prepared was successfully delivered, and this partition
has since been healed. In the meantime, the coordinator committed, couldn't
contact the participant because of the network partition, and so must go
into some form of recovery mode.

From the coordinator's perspective, there is no way for it to distinguish
between a network partition and the failure of the machine on which the
participant resides. From the participant's perspective, there is a
difference, though the resolution is the same: it initiates a Replay
message. I just wanted to make sure our definition of failure didn't
preclude partitions.

>
> The implication of the two events for the Coordinator, as you point
> out, should be identical (we are ensuring that the Prepared Success
> state is communicated to the Coordinator).

But these are different scenarios.

As a slight (related) aside: the OTS works fine with replay_completion
on the RecoveryCoordinator, so there is precedent for Replay.

Mark.

>
> Alastair
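
To make the coordinator-side point concrete, here is a minimal sketch in Python. It is a toy model, not the WS-AT state tables: the class names, the `on_message` method and the returned strings are all hypothetical, and it assumes only what is argued above, namely that a resent Prepared and a Replay both tell the coordinator that the participant is in Prepared Success, so the coordinator can answer both the same way.

```python
from enum import Enum, auto


class Outcome(Enum):
    UNDECIDED = auto()
    COMMITTED = auto()
    ABORTED = auto()


class Coordinator:
    """Toy coordinator (hypothetical; not taken from the WS-AT state tables)."""

    def __init__(self):
        self.outcome = Outcome.UNDECIDED
        self.prepared = set()  # participants known to be in Prepared Success

    def on_message(self, participant, message):
        # Assumption: the coordinator cannot tell a healed partition from a
        # recovered participant, so "Prepared" (resend) and "Replay" are
        # handled by the same code path.
        if message not in ("Prepared", "Replay"):
            raise ValueError(f"unexpected message: {message}")
        self.prepared.add(participant)
        if self.outcome is Outcome.COMMITTED:
            return "Commit"    # re-deliver the decision already taken
        if self.outcome is Outcome.ABORTED:
            return "Rollback"
        return None            # no decision yet; just record the vote


coordinator = Coordinator()
coordinator.on_message("p1", "Prepared")      # vote recorded, no reply yet
coordinator.outcome = Outcome.COMMITTED       # decision taken during the partition
print(coordinator.on_message("p1", "Replay"))
```

In this sketch the difference between the two scenarios lives entirely on the participant side (which message it chooses to send); the coordinator's reply depends only on the outcome it has already recorded.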