
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

From: Alex Yiu <alex.yiu@oracle.com>
To: Satish Thatte <satisht@microsoft.com>
Date: Mon, 14 Feb 2005 00:38:45 -0800

Great.
Then, it seems to me that we are converging.
What we need now is a new proposal with exact wording and a clear description of the new semantics.

Thanks.

Regards,
Alex Yiu

Satish Thatte wrote:

If I understand your first question correctly, that was my notion of the convert-terminate-to-fault-and-continue behavior.  And then yes, the failure could be capped to a scope, since the "modeling" fault at that point will be treated like any other ordinary fault.
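
To illustrate what "capping the failure to a scope" could look like, here is a minimal sketch in Python. It is not BPEL and not taken from the spec: the Scope class, its status strings, and the catch_all parameter are hypothetical names chosen for illustration. It assumes the system failure has already been converted into an ordinary fault, so a catch-all handler on a child scope can absorb it while the parent scope carries on.

```python
class Fault(Exception):
    """A fault raised while a scope's activities run (illustrative only)."""


class Scope:
    """Hypothetical stand-in for a BPEL scope; not an engine API."""

    def __init__(self, name, body, catch_all=None):
        self.name = name
        self.body = body            # callable that runs the scope's activities
        self.catch_all = catch_all  # optional handler that receives the Fault
        self.status = "ready"

    def run(self):
        try:
            self.body()
            self.status = "completed"
        except Fault as fault:
            self.status = "faulted"
            if self.catch_all is None:
                raise               # no handler: propagate to the enclosing scope
            self.catch_all(fault)   # handled here: the enclosing scope carries on


def failing_activity():
    # Assumption: the engine surfaced the failure as an ordinary fault.
    raise Fault("correlationViolation")


log = []
child = Scope("child", failing_activity, catch_all=lambda f: log.append(str(f)))


def parent_body():
    child.run()
    log.append("parent continues")


parent = Scope("parent", parent_body)
parent.run()
```

In this sketch the child scope ends up marked as faulted, while the parent completes normally. Removing the catch_all from the child would let the fault propagate and mark the parent as faulted too.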

________________________________

From: Alex Yiu [mailto:alex.yiu@oracle.com]
Sent: Fri 2/11/2005 11:49 AM
To: Satish Thatte
Cc: ygoland@bea.com; Francisco Curbera; Prasad Yendluri; Danny van der Rijn; wsbpel@lists.oasis-open.org; alex.yiu@oracle.com
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)

Hi, Satish,

I guess I understand your point ...

More questions. Say we don't call it a "fault". After the process is
"frozen" and the user has inspected it, if he/she considers that the
fault should not affect the compensation logic, can the user select an
action that activates a fault handler of a related scope, which runs
the compensation logic, marks the scope as faulted, and continues with
the rest of the process?

It is important to cap the system failure to one of the child scopes,
not the whole process, for fault-tolerant design [ Oh my ..... the term
"fault" comes again ... do we really want to avoid that term? ]

Thinking out loud again: maybe we should still call them faults and
have a clear explanation of how a system failure will be handled
differently from an application fault?

Regards,
Alex Yiu

Satish Thatte wrote:

Alex,

I agree with what you say except I would rather not call it "fault" because a normal fault does not cause a process to freeze.  Our terminate semantics is as close to a freeze as possible already.  But if we want to rename terminate as something else (actually didn't we rename it exit already?) that captures the intent better I have no issues with that.

As for how the intention is expressed, that will clearly have to be platform specific.  We don't have any official notion of deployment descriptor, but it would have to be some sort of extension or external configuration parameter, which I think is what you intended to say.

Satish

________________________________

From: Alex Yiu [mailto:alex.yiu@oracle.com]
Sent: Thu 2/10/2005 8:45 PM
To: Satish Thatte
Cc: ygoland@bea.com; Francisco Curbera; Prasad Yendluri; Danny van der Rijn; wsbpel@lists.oasis-open.org; alex.yiu@oracle.com
Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)



Hi, Satish,

If I read Satish's comments correctly, then I would say it is fairer to say:
The semantics for handling a BPEL fault are no longer "exit"/"quit"/"terminate".
The process basically "freezes" / "suspends" before any further code execution. Then, it is up to the BPEL implementation / BPEL site admin / BPEL developer to decide what to do with this "frozen" or "suspended" process.

And I may add one more question: may their decision be just the plain old default "compensate and rethrow" semantics in BPEL 1.1? Can their decision be expressed by a deployment descriptor, or by an extension attribute in BPEL?


Regards,
Alex Yiu


Satish Thatte wrote:


      There are two points at issue here.
     
      1.  Are undefined-runtime-semantics "faults" really faults in the sense
      that one would write specific catch handlers for things like
      conflictingReceive, or correlationViolation in the same way as one would
      write catch handlers for approvalDenied?
     
      2.  Admitting that undefined-runtime-semantics "faults" will occur since
      we do not mandate pessimistic static analysis to prevent them, what
      exactly is a reasonable way to deal with these "faults"?
     
     
      I would hope that we have no disagreement that specific handlers for
      correlationViolation and such would be extremely rare.  CatchAll is the
      way these "faults" would be intercepted if at all.  And in that context
      there is very little one can do except suppress the fault, i.e., limit
      its impact, and possibly notify someone that it happened.  I have not
      seen anyone argue otherwise.
     
      The primary disagreement seems to be about the second question, and
      especially about the tradeoff between the approaches of
     
      A.  Explicitly define impact boundaries ("modularity" entered the
      discussion as an example for such boundaries) even for
      undefined-runtime-semantics "faults" and within those boundaries apply
      the usual unravel and compensate logic that gets applied by default.
     
      B.  There is no reasonable way to define the impact boundaries in most
      cases and in a lot of important processes the usual unravel and
      compensate logic would create unintended havoc and destroy years of work
      if blindly allowed to proceed by default and oversight.
     
      By the way, neither approach helps as far as letting a partner know what
      is going on in cases like missingReply.  For that we would have to go
      back to my suggestion of explicitly declaring MEP instances in scopes
      and then defining standard wire-faults in case an MEP instance went out
      of scope without completing.  To be clear, I am *not* suggesting we go
      down that road at this point.
     
      I don't think we can settle this with arguments based on examples
      because "allowing ordinary compensation to proceed" can be viewed as
      being either desirable or disastrous depending on the scenario you have
      in mind.
     
      I disagree with Yaron that his setting #1, which corresponds to my
      approach B, is possible today without preventing the BPEL engine from
      actually carrying out prescribed runtime semantics.  But I agree with
      him that the two approaches need to be made possible via some
      platform-specific switch, i.e., made compatible with BPEL normative
      semantics.  One way is to extend our notion of "terminate" to include
      optional fault data.  I would then argue that a BPEL engine is free to
      provide a (private) switch that chooses between
      terminate-then-optionally-repair-and-continue behavior and
      auto-convert-terminate-to-fault-and-continue behavior.
     
      Satish
     
     
      -----Original Message-----
      From: Yaron Y. Goland [mailto:ygoland@bea.com]
      Sent: Monday, February 07, 2005 12:13 PM
      To: Francisco Curbera
      Cc: Prasad Yendluri; Danny van der Rijn; wsbpel@lists.oasis-open.org
      Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed
      Issue Announcement)
     
      I think the core of the problem is another part of our ever increasing
      elephant.
     
      Lots of systems are going to have a magic switch that I strongly
      encourage us not to attempt to specify in BPEL both because it's at
      least 80% out of scope and because it will take a long time to agree on
      the semantics.
     
      That switch will specify (either on a process level or perhaps a scope
      level) what to do if certain kinds of faults are thrown. One of the key
      kinds of fault this switch will focus on is system faults.
     
      This switch will typically have at least two settings.
     
      Setting #1 - If a system fault is thrown, immediately freeze the process
      and call the admin for help, who can then edit the process to fix things.
     
      Setting #2 - If a system fault is thrown, then send a note to the admin
      but let the fault go through the normal fault handlers.
     
      Both the first and second settings are possible with the existing spec.
      The first behavior is possible through an out-of-scope operational
      override, and the second is pretty much our default behavior.
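
The two settings can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: SystemFaultPolicy, ProcessInstance, on_system_fault, and the run_fault_handlers callback are hypothetical names, not part of BPEL or of any engine's API, and a real engine would expose such a switch in its own platform-specific way.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class SystemFaultPolicy(Enum):
    """Hypothetical engine-level switch; deliberately outside the BPEL spec."""
    FREEZE_AND_CALL_ADMIN = auto()       # corresponds to "Setting #1"
    NOTIFY_AND_HANDLE_NORMALLY = auto()  # corresponds to "Setting #2"


@dataclass
class ProcessInstance:
    name: str
    state: str = "running"
    admin_notes: list = field(default_factory=list)


def on_system_fault(instance, fault_name, policy, run_fault_handlers):
    """Apply the configured policy when a system fault is raised."""
    # Both settings tell the admin that something happened.
    instance.admin_notes.append(f"{fault_name} in {instance.name}")
    if policy is SystemFaultPolicy.FREEZE_AND_CALL_ADMIN:
        # Suspend before any handler runs; an operator decides what happens next.
        instance.state = "frozen"
    else:
        # Let the fault flow through the ordinary fault handlers.
        run_fault_handlers(instance, fault_name)
    return instance
```

With the first policy the fault handlers are never invoked and the instance is left frozen for manual repair. With the second, the callback runs and the instance keeps whatever state normal fault handling leaves it in.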
     
      Issue 190 would make the second setting effectively impossible since it
      would be illegal to ever allow system faults to go through normal fault
      handling. But as Alex and others have convincingly argued, there are many
      interesting cases in which it makes sense to allow system faults to go
      through normal fault handling.
     
      In terms of maximizing portability I think we should stick with our
      current behavior and leave the 190 style behavior to out of scope
      extensions.
     
              Yaron
     
     
      Francisco Curbera wrote:

              I guess one of the points of the immediate termination condition is
              that termination is essentially always invisible to partners of the
              process. The net effect of this change (and from my perspective the
              actual aim of this proposal) would be to allow engines the
              flexibility of deciding how to deal with these situations,
              termination being an option. Any form of standard fault semantics
              limits that flexibility because the engine would be forced to follow
              the usual scope termination/fault propagation behavior, with the
              likely result of discarding many recoverable process instances -
              and possibly days or months of process work.
             
              Paco


              From:    Prasad Yendluri <pyendluri@webmethods.com>
              To:      Francisco Curbera/Watson/IBM@IBMUS
              cc:      Danny van der Rijn <dannyv@tibco.com>, wsbpel@lists.oasis-open.org
              Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
              Date:    02/04/2005 02:30 PM


              Hi,
             
              1. Isn't this the same issue as the one raised by issue 187, where we
              ask if there are any constraints on the handling of the standard
              faults? This is proposing a specific resolution where it is
              recommended that the process always terminates immediately.

              2. I tend to side with Danny on this. I don't think we should require
              that the process always terminates immediately. IMO in at least
              certain cases this may not be a fatal situation for the whole process
              (it could be confined to the scope) and other parts of the process
              may be able to continue by compensating for pertinent. Perhaps the
              impact could be limited to the immediately confining scope and the
              process could continue; perhaps the area where the fault occurred
              could be non-fatal to the whole process (e.g. related to look-up
              rather than modification of any information) or caused by some
              transient condition that could go away on a retry etc. I think the
              process (fault handler) should be given a chance to handle the
              situation rather than always terminate.

              3. If we do end up going the "always terminate" way, we must at a
              minimum *not* preclude logging the condition, which could be more
              intelligent if some "fault data" could be attached to the faults
              (ref. issues 187 and 185).

              Regards, Prasad
             
              -------- Original Message --------

              Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
                 Date: Fri, 4 Feb 2005 13:23:17 -0500
                 From: Francisco Curbera <curbera@us.ibm.com>
                   To: Danny van der Rijn <dannyv@tibco.com>
                   CC: wsbpel@lists.oasis-open.org

              Hi Danny,
             
              BPEL so far does not support any technique for modularizing process
              authoring, so the situation you describe is a bit out of scope right
              now. In any case, my view is that the idea that authors of business
              processes are going to be adding code to deal with things like
              unsupportedReference is just not realistic. I would even argue that
              those faults don't actually belong at the BP modeling level and need
              to be dealt with in a different way.

              Dieter's suggestion allows implementations to manage these situations
              in the best possible way.  This is especially important in the case of
              long running processes, where months or years of work can be thrown
              out the window when one of these faults is encountered (the current
              semantics require the complete unwinding of the execution stack if
              the fault is not caught and a generic catch all is essentially good
              for nothing). Typically you want to allow manual intervention to
              figure out whether the process can be repaired, and to terminate it
              if not.
             
              Paco
             
             
             
             
              From:    Danny van der Rijn
              To:      wsbpel@lists.oasis-open.org
              cc:
              Subject: Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
              Date:    02/03/2005 01:47 PM


              [Resending this with appropriate header to save Tony/Peter the
              trouble]

              -1

              As I pointed out in our last face to face, this kind of approach will
              make any kind of modularization extremely difficult.  It will give no
              way for a developer of a piece of BPEL code to protect against the
              "modelling error" (legacy term: "programming error") of another
              modeller whose attempt to model the real world failed in a tangible
              instance.
             
              Danny
             
              Tony Fletcher wrote:
                    This issue has been added to the wsbpel issue list with a status of
                    "received". The status will be changed to "open" if the TC accepts
                    it as identifying a bug in the spec or decides it should be accepted
                    specially. Otherwise it will be closed without further consideration
                    (but will be marked as "Revisitable")


                    The issues list is posted as a Technical Committee document to the
                    OASIS WSBPEL TC pages on a regular basis. The current edition, as a
                    TC document, is the most recent version of the document entitled in
                    the "Issues" folder of the WSBPEL TC document list - the next posting
                    as a TC document will include this issue. The list editor's working
                    copy, which will normally include an issue when it is announced, is
                    available at this constant URL.
             
             
                    Issue 190: BPEL Internal Faults
                    Status: received
                    Date added: 3 Feb 2005
                    Categories: Fault handling
                    Date submitted: 3 February 2005
                    Submitter: Dieter Koenig1
                    Document: WS-BPEL Working Draft, December, 2004
                    Related Issues: Issue 163 : languageExecutionFault, Issue 169 :
                    Transition condition error handling clarification, and Issue 187 :
                    Legality of Explicitly throwing or rethrowing Standard faults.
                    Description:
                    There are a number of cases in the current spec where the behavior
                    of a process is described as *undefined*, in particular, after
                    recognizing internal errors described as standard faults.


                    With the exception of "bpel:joinFailure", *all* of these situations
                    represent modelling errors that cannot be dealt with by the business
                    process itself in a meaningful way. This behavior becomes even more
                    questionable for catchAll handlers that try to deal with multiple
                    application faults and unexpectedly encounter a standard fault.


                    Submitter's proposal: Instead of allowing processes to catch these
                    as standard faults, we propose that the process instance must
                    *terminate* immediately when such a situation is encountered.


                    The behavior of terminate is well-defined in BPEL -- as far as BPEL
                    is concerned the instance execution ends when terminate is
                    encountered without any fault handling behavior. Any additional
                    facilities for extended support for, e.g., repair and continue, is
                    definitely out of scope.


                    This approach would also create a clear direction for dealing with
                    any pathological situation within an inlined language (Issue 163)
                    and therefore also for errors within transition conditions (Issue
                    169).

                    Changes: 3 Feb 2005 - new issue
             
             
                    Best Regards,
                    Tony
             
             
             
              To unsubscribe from this mailing list (and be removed from the roster
              of the OASIS TC), go to
              http://www.oasis-open.org/apps/org/workgroup/wsbpel/members/leave_workgroup.php.
       

Follow-Ups:
- Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
  - From: Alex Yiu <alex.yiu@oracle.com>
- Re: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
  - From: "Yaron Y. Goland" <ygoland@bea.com>

References:
- RE: [wsbpel] Issue 190 - BPEL Internal Faults (New Proposed Issue Announcement)
  - From: "Satish Thatte" <satisht@microsoft.com>