Subject: Re: Coordinator timeouts (was Re: Managers, addressses and the like)
> I agree with your substantive point on the nature of coordinator
> timeout. It does not mix well with participant timeouts, on reflection.
>
> I have some questions and issues, however.
>
> I presume that the coordinator timeout is transmitted in the CONTEXT,
> and that it is in some way communicated to participants. I presume that
> services that forward the CONTEXT are allowed to trim the timeout to
> avoid excessive prolongation (minor point).

Yes.

> Here's where it gets a little less obvious. If a participant gets the
> timeout and it expires then it is only advisory: if the participant is
> happy to hang around and hold resources after the expiry, that is its
> business. It will never go wrong, get a wrong result. It could decide to
> send an INFERIOR_STATUS to the coordinator, and get back an Active
> status, in which case it might decide to hang on. If the status messages
> included a timeout value then it could receive an updated timeout. In
> other words, this coordinator timeout could be viewed as a hint (can't
> really be viewed as anything else).

If a coordinator timeout goes off then the participant could hang around
if it so wants. However, it cannot make an independent CONFIRM choice. If
it doesn't want to hang around then it can only CANCEL. This is different
to the participant-specific timeout, where I assume it can go either way.

> The minutes from Mt Laurel are accurate. The decision concerned
> participant timeouts, and is accurately recorded. The point that you
> raised there on participant timeouts, which was agreed to, is also
> recorded, namely that the PREPARE can be qualified with a "minimum upper
> bound" on the participant timeout, sent from the Coordinator to the
> Participant.
>
> There was no decision or concrete proposal about coordinator timeouts.
> It was mentioned in discussion, but mentioning things doesn't make them
> decisions, nor did the minutes attempt to record every issue that was
> mentioned.

I disagree.
Your minutes are inaccurate. Simply because you did not make a note of it
does not mean that the discussion did not take place. It did, and it was
about propagating coordinator timeouts to subcoordinators.

> Every thing I minuted as agreed was read over to the meeting
> before a vote was taken, and before I noted it, sometimes verbatim,
> sometimes very close to verbatim when we hadn't arrived at the
> absolutely precise wording.

I think that it certainly was not an officially voted-on decision, but it
was discussed and should have been minuted. Apologies for not noting this
oversight in the minutes earlier.

> A side comment on participant timeouts: I think that you will not avoid
> the need for failure recovery in participants by participant timeouts,
> if that is what you have in mind.

No, it isn't for all situations, but it is for others - remember we are
using a "presumed abort-like" protocol.

> Also, the motivation for participant
> timeouts is that a coordinator should not be able to wrest control of
> time-sensitive data or locks from its owner. Denial of service is only
> one special case of this. Expiry of offers is actually a much more
> likely use, in my view.

As I said in earlier emails, there are two different timeouts that we
should support, and they serve different roles. The coordinator is not
trying to wrest control over data from a participant with the
coordinator-timeout: it's trying to prevent a number of things:

(i) the situation where a resource hasn't been prepared and the
coordinator fails. There may well be resource implementations that would
happily sit around for years as long as they haven't got to prepare. After
a week of inactivity (for example) they may periodically probe the
coordinator to find out what's going on. If I can cut down on that message
with a timeout meaning an implicit failure, then I'd like to do that (and
I would expect you would too, given the amount of email there has been on
boxcarring!)
(ii) the coordinator is created by an initiator who then fails before
prepare has been sent to it. I don't want to have to write initiators who
must keep persistent information on all coordinators they are using before
they ask them to prepare. I could certainly do that but it's a performance
bottleneck, and the critical point for saving the coordinator information
is at prepare, and not before. If my initiator fails prior to prepare then
I'd quite like the system to tidy up for me, i.e., undo the coordinator
automatically. With a timeout, there's an implicit CANCEL message that
never needs to be sent.

Mark.

-----------------------------------------------------------------------
SENDER : Dr. Mark Little, Architect (Transactions), HP Arjuna Labs
PHONE : +44 191 222 8066, FAX : +44 191 222 8232
EMAIL : mark@arjuna.com
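
The timeout rules argued for in this message could be sketched roughly as
follows. This is only an illustration in Python under stated assumptions:
the names TimeoutKind, Action, trim_timeout, allowed_on_expiry and
coordinator_auto_cancels are hypothetical and come from no specification
or implementation, and the participant-timeout branch simply reflects the
"can go either way" assumption made above.

```python
from enum import Enum


class TimeoutKind(Enum):
    COORDINATOR = "coordinator"  # hint propagated in the CONTEXT
    PARTICIPANT = "participant"  # the participant's own timeout


class Action(Enum):
    WAIT = "wait"        # keep holding resources
    CANCEL = "cancel"
    CONFIRM = "confirm"


def trim_timeout(received, local_cap):
    """A service forwarding the CONTEXT may shorten, never lengthen, the timeout."""
    return min(received, local_cap)


def allowed_on_expiry(kind):
    """Choices a participant may make on its own when a timeout expires."""
    if kind is TimeoutKind.COORDINATOR:
        # Advisory only: hang around or CANCEL, never an independent CONFIRM.
        return {Action.WAIT, Action.CANCEL}
    # Assumption from the message: a participant timeout can go either way.
    return {Action.CANCEL, Action.CONFIRM}


def coordinator_auto_cancels(prepared, elapsed, timeout):
    """Before prepare, an expired timeout acts as an implicit CANCEL."""
    return (not prepared) and elapsed >= timeout
```

For example, allowed_on_expiry(TimeoutKind.COORDINATOR) never contains
Action.CONFIRM, and coordinator_auto_cancels only returns True while the
coordinator has not yet been asked to prepare.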