Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
Then, it seems to me that we are converging.
What we need now is a new proposal with exact wording and a clear description of the new semantics.
Satish Thatte wrote:
If I understand your first question correctly, that was my notion of the convert-terminate-to-fault-and-continue behavior. And then yes, the failure could be capped to a scope, since the "modeling" fault at that point will be treated like any other ordinary fault.

________________________________
From: Alex Yiu [mailto:firstname.lastname@example.org]
Sent: Fri 2/11/2005 11:49 AM
To: Satish Thatte
Cc: email@example.com; Francisco Curbera; Prasad Yendluri; Danny van der Rijn; firstname.lastname@example.org; email@example.com
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

Hi, Satish,

I guess I understand your point ...

More questions: Say: If we don't call it "fault", after the process is "frozen" ... and after user inspection, he/she considers that the fault should not affect the compensation logic, can the user select the action to activate a fault handler of a related scope which does the compensation logic and marks the scope faulted, and then continue the rest of the process? It is important to cap the system failure to one of the child scopes, not the whole process, for fault-tolerant design.

[ Oh my ..... the term "fault" comes again ... do we really want to avoid that term? ]

Thinking out loud again: maybe we should still call them faults and have a clear explanation of how a system failure will be handled differently from an application fault?

Regards,
Alex Yiu

Satish Thatte wrote:

Alex,

I agree with what you say except I would rather not call it "fault" because a normal fault does not cause a process to freeze. Our terminate semantics is as close to a freeze as possible already. But if we want to rename terminate as something else (actually didn't we rename it exit already?) that captures the intent better I have no issues with that.

As for how the intention is expressed, that will clearly have to be platform specific.
We don't have any official notion of deployment descriptor, but it would have to be some sort of extension or external configuration parameter, which I think is what you intended to say.

Satish

________________________________
From: Alex Yiu [mailto:firstname.lastname@example.org]
Sent: Thu 2/10/2005 8:45 PM
To: Satish Thatte
Cc: email@firstname.lastname@example.org; Francisco Curbera; Prasad Yendluri; Danny van der Rijn; email@example.com; firstname.lastname@example.org
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

Hi, Satish,

If I read Satish's comments correctly, then I would say it is more fair to say:

The semantics on how to handle a BPEL fault no longer is "exit"/"quit"/"terminate". The process basically "freezes" / "suspends" before any further code execution. Then, it is up to the BPEL implementation / BPEL site admin / BPEL developer to decide what to do with this "frozen" or "suspended" process.

And, I may add more questions: May their decision be just the plain old default "compensate and rethrow" semantics in BPEL 1.1? Can their decision be expressed by a deployment descriptor, or an extension attribute in BPEL?

Regards,
Alex Yiu

Satish Thatte wrote:

There are two points at issue here.

1. Are undefined-runtime-semantics "faults" really faults in the sense that one would write specific catch handlers for things like conflictingReceive, or correlationViolation in the same way as one would write catch handlers for approvalDenied?

2. Admitting that undefined-runtime-semantics "faults" will occur since we do not mandate pessimistic static analysis to prevent them, what exactly is a reasonable way to deal with these "faults"?

I would hope that we have no disagreement that specific handlers for correlationViolation and such would be extremely rare. CatchAll is the way these "faults" would be intercepted if at all.
And in that context there is very little one can do except suppress the fault, i.e., limit its impact, and possibly notify someone that it happened. I have not seen anyone argue otherwise.

The primary disagreement seems to be about the second question, and especially about the tradeoff between the approaches of

A. Explicitly define impact boundaries ("modularity" entered the discussion as an example for such boundaries) even for undefined-runtime-semantics "faults" and within those boundaries apply the usual unravel and compensate logic that gets applied by default.

B. There is no reasonable way to define the impact boundaries in most cases and in a lot of important processes the usual unravel and compensate logic would create unintended havoc and destroy years of work if blindly allowed to proceed by default and oversight.

By the way, neither approach helps as far as letting a partner know what is going on in cases like missingReply. For that we would have to go back to my suggestion of explicitly declaring MEP instances in scopes and then defining standard wire-faults in case an MEP instance went out of scope without completing. To be clear, I am *not* suggesting we go down that road at this point.

I don't think we can settle this with arguments based on examples because "allowing ordinary compensation to proceed" can be viewed as being either desirable or disastrous depending on the scenario you have in mind.

I disagree with Yaron that his setting #1, which corresponds to my approach B, is possible today without preventing the BPEL engine from actually carrying out prescribed runtime semantics. But I agree with him that the two approaches need to be made possible via some platform-specific switch, i.e., made compatible with BPEL normative semantics. One way is to extend our notion of "terminate" to include optional fault data.
I would then argue that a BPEL engine is free to provide a (private) switch that chooses between terminate-then-optionally-repair-and-continue behavior and auto-convert-terminate-to-fault-and-continue behavior.

Satish

-----Original Message-----
From: Yaron Y. Goland [mailto:firstname.lastname@example.org]
Sent: Monday, February 07, 2005 12:13 PM
To: Francisco Curbera
Cc: Prasad Yendluri; Danny van der Rijn; email@example.com
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

I think the core of the problem is another part of our ever increasing elephant. Lots of systems are going to have a magic switch that I strongly encourage us not to attempt to specify in BPEL both because it's at least 80% out of scope and because it will take a long time to agree on the semantics. That switch will specify (either on a process level or perhaps a scope level) what to do if certain kinds of faults are thrown. One of the key faults this switch will focus on are system faults. This switch will typically have at least two settings.

Setting #1 - If a system fault is thrown immediately freeze the process and call the admin for help who can then edit the process to fix things.

Setting #2 - If a system fault is thrown then send a note to the admin but let the fault go through the normal fault handlers.

Both the first and second settings are possible with the existing spec. The first behavior through an out of scope operational override and the second behavior is pretty much our default behavior. Issue 190 would make the second setting effectively impossible since it would be illegal to ever allow system faults to go through normal fault handling. But as Alex and others have convincingly argued there are many interesting cases in which it makes sense to allow system faults to go through normal fault handling.
In terms of maximizing portability I think we should stick with our current behavior and leave the 190 style behavior to out of scope extensions.

Yaron

Francisco Curbera wrote:

I guess one of the points of the immediate termination condition is that termination is essentially always invisible to partners of the process. The net effect of this change (and from my perspective the actual aim of this proposal) would be to allow engines the flexibility to decide how to deal with these situations, termination being an option. Any form of standard fault semantics limits that flexibility because the engine would be forced to follow the usual scope termination/fault propagation behavior with likely the result of discarding many recoverable process instances - and possibly days or months of process work.

Paco

Prasad Yendluri <pyendluri@webmethods.com>
02/04/2005 02:30 PM
To: Francisco Curbera/Watson/IBM@IBMUS
cc: Danny van der Rijn <firstname.lastname@example.org>, firstname.lastname@example.org
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

Hi,

1. Isn't this the same issue as the one raised by issue 187 where we ask if there are any constraints in handling of the standard faults? This is proposing a specific resolution where it is recommended that the process always terminates immediately.

2. I tend to side with Danny on this. I don't think we should require that the process terminates immediately always. IMO in at least certain cases this may not be a fatal situation for the whole process (it could be confined to the scope) and other parts of the process may be able to continue by compensating for pertinent. Perhaps the impact could be limited to the immediately confining scope and the process could continue, perhaps the area where the fault occurred could be non-fatal to the whole process (e.g.
related look-up rather than modification of any information) or caused by some transient condition that could go away on a retry etc. I think the process (fault handler) should be given a chance to handle the situation rather than terminate always.

3. If we do end up going the "terminate" always way, we must minimally *not* preclude logging the condition, which could be more intelligent if the faults could be attached some "fault data" (ref issues 187 and 185).

Regards,
Prasad

-------- Original Message --------
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
Date: Fri, 4 Feb 2005 13:23:17 -0500
From: Francisco Curbera <email@example.com>
To: Danny van der Rijn <email@example.com>
CC: email@firstname.lastname@example.org

Hi Danny,

BPEL so far does not support any technique for modularizing process authoring, so the situation you describe is a bit out of scope right now. In any case, my view is that the idea that authors of business processes are going to be adding code to deal with things like unsupportedReference is just not realistic. I would even argue that those faults don't actually belong at the BP modeling level and need to be dealt with in a different way.

Dieter's suggestion allows implementations to manage these situations in the best possible way. This is especially important in the case of long running processes, where months or years of work can be thrown out the window when one of these faults is encountered (the current semantics require the complete unwinding of the execution stack if the fault is not caught and a generic catch all is essentially good for nothing). Typically you want to allow manual intervention to figure out whether the process can be repaired, terminated if not.

Paco

>From: Danny van der Rijn
>To: firstname.lastname@example.org
>cc:
>Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
02/03/2005 01:47 PM

[Resending this with appropriate header to save Tony/Peter the trouble]

-1

As I pointed out in our last face to face, this kind of approach will make any kind of modularization extremely difficult. It will give no way for a developer of a piece of BPEL code to protect against the "modelling error" (legacy term: "programming error") of another modeller whose attempt to model the real world failed in a tangible instance.

Danny

Tony Fletcher wrote:

This issue has been added to the wsbpel issue list with a status of "received". The status will be changed to "open" if the TC accepts it as identifying a bug in the spec or decides it should be accepted specially. Otherwise it will be closed without further consideration (but will be marked as "Revisitable").

The issues list is posted as a Technical Committee document to the OASIS WSBPEL TC pages on a regular basis. The current edition, as a TC document, is the most recent version of the document entitled in the "Issues" folder of the WSBPEL TC document list - the next posting as a TC document will include this issue. The list editor's working copy, which will normally include an issue when it is announced, is available at this constant URL.

Issue 190: BPEL Internal Faults
Status: received
Date added: 3 Feb 2005
Categories: Fault handling
Date submitted: 3 February 2005
Submitter: Dieter Koenig1
Document: WS-BPEL Working Draft, December, 2004
Related Issues: Issue 163 : languageExecutionFault, Issue 169 : Transition condition error handling clarification, and Issue 187 : Legality of Explicitly throwing or rethrowing Standard faults.

Description: There are a number of cases in the current spec where the behavior of a process is described as *undefined*, in particular, after recognizing internal errors described as standard faults.
With the exception of "bpel:joinFailure", *all* of these situations represent modelling errors that cannot be dealt with by the business process itself in a meaningful way. This behavior becomes even more questionable for catchAll handlers that try to deal with multiple application faults and unexpectedly encounter a standard fault.

Submitter's proposal: Instead of allowing processes to catch these as standard faults, we propose that the process instance must *terminate* immediately when such a situation is encountered. The behavior of terminate is well-defined in BPEL -- as far as BPEL is concerned the instance execution ends when terminate is encountered without any fault handling behavior. Any additional facilities for extended support for, e.g., repair and continue, is definitely out of scope. This approach would also create a clear direction for dealing with any pathological situation within an inlined language (Issue 163) and therefore also for errors within transition conditions (Issue 169).

Changes: 3 Feb 2005 - new issue

Best Regards,
Tony

To unsubscribe from this mailing list (and be removed from the roster of the OASIS TC), go to http://www.oasis-open.org/apps/org/workgroup/wsbpel/members/leave_workgroup.php.