Re: [wsbpel] Issue - 274 - Proposal to Vote

Subject: Re: [wsbpel] Issue - 274 - Proposal to Vote

From: Alex Yiu <alex.yiu@oracle.com>

To: Mark Ford <mark.ford@active-endpoints.com>

Date: Wed, 10 May 2006 20:38:10 -0700

Hi Mark,

Thanks for the quick reply.
Yes, we should converge on a proposal soon.

Maybe I can stress a few more things:
My first point and third point actually go together. Let me rephrase some of my points. Hopefully the linkage in my logic will be clearer.

Point [1]

The missingReplies error actually results from a modelling mistake or an unanticipated runtime failure in the CH of the child scope, not in the FH of the parent scope. Faulting at the CH level is more precise.

Point [3]

Consider case [a]: a CH forgets to initialize a variable before using it. That is a modelling error in the CH. This modelling error then causes uninitializedVariable to be thrown from the CH to the "calling"/"invoking" fault handler.

Now consider case [b]: a CH forgets to reply after receiving a request. That is also a modelling error in the CH.

Should we treat these two cases of modelling error symmetrically? That is, should case [b] cause a missingReply fault to be thrown from the CH to the "calling"/"invoking" fault handler?

Or would we NOT detect this modelling error as early as possible, as we do in the case of uninitializedVariable, and instead let the calling FH continue until its end?

This asymmetry in treating modelling errors related to missingReplies versus other faults (e.g. uninitializedVariable) concerns me. The same asymmetry applies to one more bullet below.
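
Case [b] might look like the fragment below. It is only a sketch to make the discussion concrete: the scope name "A", the partner link "auditor", the operation "confirmUndo", and the variable "undoRequest" are all made-up names, and the operation is assumed to be a request-response operation.

```xml
<!-- Sketch only; every name here is hypothetical. -->
<scope name="A">
   <compensationHandler>
      <sequence>
         <!-- Inbound message activity (IMA) on a request-response operation. -->
         <receive partnerLink="auditor" operation="confirmUndo"
                  variable="undoRequest" />
         <!-- Modelling error: no matching <reply> follows, so the IMA is
              still open (orphaned) when the compensation handler completes. -->
         <empty/>
      </sequence>
   </compensationHandler>
   <!-- Primary activity of the scope, elided to a placeholder. -->
   <empty/>
</scope>
```

Under the variant argued for here, the missing reply would surface as a fault when this handler completes, and it would propagate to the compensation activity that invoked it. Under the other variant, nothing is detected until the calling FH itself ends.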

Additional points:

[4]
So far the existing checking logic for missingReply (described in the previous four bullets in this section) is to check for orphaned IMAs as early as possible. Checking at the end of the CH would be more consistent with that.

[5]
Mark Ford wrote:

Lastly, the syntax below for providing <catchAll> around the <compensateScope> only allows continuation for other targets and not other instances within the same instance group. Granted, the language for faults within an instance group causes it to short circuit but I go back to my point above about the difference between a fault and the check for the condition.

A similar asymmetry is in play here. If you encounter a modelling error other than missingReply within a CH, it will also stop other CH instances within the same group from executing anyway. If you really want a comprehensive solution for all fault situations (instead of special treatment of missingReply), it may be better to have suppressFault="yes" on a compensation activity. That is, when a fault happens within a group, the fault will be suppressed and other instances of the CH can continue. But that would be a new switch added to Dieter's proposal on compensation. It may be better to delay it to the next version of the WS-BPEL spec.

And I would argue that when a bunch of CH instances are grouped together, they tend to be related. Hence, it may be more useful to stop the other instances of the CH from being executed. (Think of the situation where the CH instance group is created by a <while> loop.)

[6]
The semantics of checking at the end of the FH do not allow the calling FH to be stopped from further execution immediately. On the other hand, outside the CH-instance-group situation, checking at the end of the CH allows us to choose: either end the calling FH immediately, or suppress the fault and continue with the other CHs. That is, faulting at the CH level is more flexible, particularly if one defines a suppressFault="yes".
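
To make the suppressFault idea from [5] and [6] concrete, here is a rough sketch. Note that suppressFault is NOT an existing WS-BPEL attribute; it is only the hypothetical switch floated above, and its placement on <compensateScope> is an assumption. The fault name "ns:someFault" (with an unbound prefix) is a placeholder, and the targets "A", "B" and "C" are borrowed from the example quoted further down in this thread.

```xml
<!-- Sketch only: suppressFault is a hypothetical attribute, not part of WS-BPEL. -->
<catch faultName="ns:someFault">
   <sequence>
      <!-- A fault raised while compensating "A" (missingReply or any other)
           would be suppressed, so the sequence moves on to "B". -->
      <compensateScope target="A" suppressFault="yes" />
      <!-- No suppression here: a fault from "B" would end the FH immediately. -->
      <compensateScope target="B" />
      <compensateScope target="C" />
   </sequence>
</catch>
```

The point of the sketch is the choice it gives the process designer per compensation activity, which is only available if the fault is raised at the CH level in the first place.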

I would also appreciate additional feedback on these two variants of the proposal.
Thanks again!

Regards,
Alex Yiu

Mark Ford wrote:

Alex,

I agree with your first point in that this is a modeling error or unanticipated runtime failure.

I am not in complete agreement with the second. The compensation handler is an optional continuation of the scope but the scope is in a different state during its compensation so it's not so strange to me to have different rules for checking for error conditions. Since the scope's fault handlers have been uninstalled, I consider it the responsibility of an activity outside of the scope to detect the fault.

I don't agree with your third point. I view the faults in bpel as being raised as soon as they are detected by the engine. This detection process occurs through the execution of an activity, evaluation of a link, receipt of data ...etc. In the case of bpel:missingReply, there is no fault until the check is made. The previous four bullets in this section have normative language about when the check for the orphaned IMA's is made, not how the fault is handled.

That said, I think your most compelling argument for having the CH check for the orphaned IMA's is that the compensation work for multiple scopes may be related so you would want the compensation handler to check for orphaned IMA's and fault immediately. On the other hand, they may not be related so it could be better to let it continue.

Lastly, the syntax below for providing <catchAll> around the <compensateScope> only allows continuation for other targets and not other instances within the same instance group. Granted, the language for faults within an instance group causes it to short circuit but I go back to my point above about the difference between a fault and the check for the condition.

In any case, we need to close this issue on the next call so I'd appreciate any additional feedback or other input from the group.

Thanks.

From: Alex Yiu [mailto:alex.yiu@oracle.com]
Sent: Wednesday, May 10, 2006 4:26 PM
To: Mark Ford
Cc: wsbpel@lists.oasis-open.org; Alex Yiu; Danny van der Rijn; 'Dieter Koenig1'
Subject: Re: [wsbpel] Issue - 274 - Proposal to Vote

Hi Mark,

I guess you would not be surprised that I still prefer detecting orphaned IMAs at the end of the compensationHandler. My reasons are:

[1] The missingReplies error actually results from a modelling mistake or an unanticipated runtime failure in the child scope (to which the CH is attached), not in the FH of the parent scope. Faulting at the CH level is more precise.

[2] From section 12.4.2: "This is because their compensation handlers are still available, and therefore the execution of such scopes may continue during the execution of their compensation handlers, which can be thought of as an optional continuation of the behavior of the associated scope." I tend to interpret this as saying that the primary (normal) activity of the scope is its part #1, while the activity of the CH of that scope is its part #2. Their nature should be symmetrical, in a counter-working way. If we perform certain checks in part #1, similar checks should happen in part #2.

[3] The asymmetry between missingReplies and other faults (e.g. selectionFailure) concerns me. (This is the last but the most important point.) If there is a problem in the CH logic that triggers a fault (e.g. selectionFailure), it will be propagated from the CH to the corresponding compensation activity in the FCTHandler (fault, compensation and termination handler). If there is no fault handling around the compensation activity, the whole FCTHandler will not continue. But now we are saying that the FCTHandler will continue until its end if the error condition is the missingReplies fault? This asymmetry may be surprising to users. I would say it may make the process more difficult to model.

Let me use one more example to illustrate my preference. If one wants to continue the logic in an FH, including continuing with the other compensation activities, one would add a scope.

That is, changing from:
---------------------------------------
<catch ... >
   <sequence>
        <compensateScope target="A" />
        <compensateScope target="B" />
        <compensateScope target="C" />
   </sequence>
</catch>
---------------------------------------
to:
---------------------------------------
<catch ... >
   <sequence>
        <scope>
            ...
            <catchAll> <empty/> </catchAll>
            ...
            <compensateScope target="A" />
        </scope>
        <compensateScope target="B" />
        <compensateScope target="C" />
   </sequence>
</catch>
---------------------------------------

This <catchAll> will handle all kinds of faults, not just missingReplies.

Again, it allows the BPEL process definition to have finer-grained control over how to handle the missingReplies fault.

Also, **if** the work of "A", "B" and "C" is highly related, I would say that it is actually more common for the process designer to prefer that the compensation work stop immediately if the compensation of "A" fails. That avoids propagating any strange states from "A" to "B" and "C". That is also consistent with the design of the Compensation Handler Instance Group. That is, if one instance fails within the group, other instances (not started yet) will not be attempted.

I hope I did a better job of convincing you this time. :-)

More thoughts?
Thanks!

Regards,
Alex Yiu

Mark Ford wrote:

In dealing with an orphaned IMA within a compensation handler, it seems to me that there are two possible resolutions. Either the compensation handler detects the orphaned IMA's and faults or the detection of orphaned IMA's is deferred to the fault handler or termination handler that invoked the compensation handler. I am in favor of the latter since it allows compensation to continue even with orphaned IMA's. There is very little that can be done with these orphaned IMA's so we may as well allow the compensation logic to proceed as best it can and defer its fault to

I have reworded my original proposal to avoid introducing any new terms as per Danny's suggestion. The approach is still the same in that the detection of an orphaned IMA is NOT made by the compensationHandler.

Add a fifth bullet to Section 12.2 which reads as follows:

No checks for orphaned IMA's are made when a compensation handler completes. The compensation handler's execution must necessarily start from within a fault or termination handler so any orphaned IMA's created by a compensation handler will be detected and handled as described in the above bullets.