
wsbpel-implement message



Subject: Re: [wsbpel-implement] Fault tolerance considerations




Mike,

    Interesting scenario. I have run into similar problems using HTTP in
a product I worked on a while back. Our solution was to periodically
send a "100 Continue" response to the client, thus keeping the
connection alive, and the client happily waiting. It was an okay
solution for the particular product, but in general it encourages a lot
of idle network resources to be tied up in open connections. It also
puts a crimp in the scaling story.
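
A minimal sketch of the keep-alive trick described above, assuming the server can poll whether the long-running work has finished. The names `response_chunks`, `is_done` and `max_interim` are hypothetical, and the pause between interim responses is left out. It only illustrates the sequence of messages a client would see on the connection, not the code of the product mentioned.

```python
from typing import Callable, Iterator

# Interim (1xx) response; HTTP/1.1 allows several before the final response.
INTERIM = b"HTTP/1.1 100 Continue\r\n\r\n"


def response_chunks(
    is_done: Callable[[], bool],   # hypothetical poll: has the work finished?
    body: bytes,                   # payload of the final response
    max_interim: int = 3,          # hypothetical cap on interim responses
) -> Iterator[bytes]:
    """Yield interim responses until the work is done, then one final response."""
    done = is_done()
    sent = 0
    while not done and sent < max_interim:
        yield INTERIM              # a real server would wait between these
        sent += 1
        done = is_done()
    if done:
        head, payload = b"HTTP/1.1 200 OK\r\n", body
    else:
        # Gave up waiting: the connection is released instead of held forever.
        head, payload = b"HTTP/1.1 503 Service Unavailable\r\n", b""
    length = str(len(payload)).encode("ascii")
    yield head + b"Content-Length: " + length + b"\r\n\r\n" + payload
```

The cap on interim responses is one way to bound the idle-connection cost; every request that is still waiting occupies a connection for the whole time.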

    Doesn't SOAP 1.1 talk about "natural" bindings for the
request/response MEP, but not mandate that request/response be truly
synchronous? (Just a vague recollection; I can't seem to reach the W3C
site right now...).

    I believe WS-Routing allows specification of a return path, and some
extra context information, so that one could easily correlate an
asynchronous response to the originator of the request. WS-Reliability
and ebXML MS have mechanisms for such message correlation as well. As
long as BPEL is built on the abstract WSDL message model, it can largely
ignore binding-specific issues. Of course, interoperability demands that
we at least consider them! If WS-I BP 1.0 is considered the best bet for
interoperability for BPEL implementations, then we should give such
HTTP-related issues extra attention.
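
A minimal sketch of the correlation idea, assuming each request carries a return address and the response names the request it relates to. This is not the WS-Routing, WS-Reliability or ebXML MS wire format; `Correlator`, `PendingRequest`, `reply_to` and `relates_to` are hypothetical names used for illustration only.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PendingRequest:
    reply_to: str                                 # hypothetical return address
    context: dict = field(default_factory=dict)   # extra context for the originator


class Correlator:
    """Maps message IDs to pending requests so a late response finds its originator."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingRequest] = {}

    def register(self, reply_to: str, context: Optional[dict] = None) -> str:
        """Record an outgoing request and return the ID to send with it."""
        message_id = uuid.uuid4().hex
        self._pending[message_id] = PendingRequest(reply_to, dict(context or {}))
        return message_id

    def resolve(self, relates_to: str) -> Optional[PendingRequest]:
        """Look up (and forget) the request a response relates to, if any."""
        return self._pending.pop(relates_to, None)
```

Because the response is matched by ID rather than by connection, nothing here requires the original connection to stay open.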

Cheers,
-Ron

Marin, Mike wrote:



Well, I have the same problem, and you do not need a crash to do that.
The problem is that BPEL prescribes receive-reply as implementing a
synchronous WSDL operation, when in practice you cannot enforce it. You
just need to add a wait for a week between the receive and the reply,
and I'm sure you do not want to keep the connection open for that long.


I opened issue 17 (Asynchronous operations) a while back, but have not
had time to pursue it. IMHO the receive / reply pair requires an
asynchronous WSDL binding (one that does not require the connection to
remain open). In theory, you could define such a binding, but nobody
will be able to use it because, first, it is not WS-I compliant and,
second, it does not fit most WSDL implementation frameworks.


It may be that WS-Routing provides a solution to this issue by allowing
a reverse message path for the reply. But I have not had time to study
this alternative.


In any case, I'm also interested in seeing (reading) how others are
tackling this implementation issue....


--

Regards,

Mike Marin


-----Original Message-----
From: Ron Ten-Hove [ mailto:Ronald.Ten-Hove@Sun.COM]
Sent: Tuesday, October 14, 2003 4:25 PM
To: bpel implementation
Subject: [wsbpel-implement] Fault tolerance considerations


Folks,

    I was recently given an interesting question from one of my
development teams, and I thought it would be of interest to this group,
since it touches on universal implementation issues.

    The question is based on the following scenario: given a process
something like this:

<sequence>
  <receive name="rcv" ... />
  <assign  name="as1" ... />
  <invoke  name="inv" ... />
  <assign  name="as2" ... />
  <reply   name="rep" ... />
</sequence>

The <receive> and <reply> activities are part of a request-response MEP,
bound to SOAP, so that the request-response is synchronous (uses the
same connection for request and response).

    Simple enough. But suppose that during execution of an instance of
the above process, somewhere after the <receive> activity is completed
but before the <reply> activity is done, the BPEL engine suffers a
crash. Since we have full state persistence, recovery is simple
enough. We can therefore finish creating the reply, but this is rather
useless, since the client connection is lost.

    So what is the right thing to do under these circumstances? Should
the engine, upon recovery in this situation, fault the running activity?
Should it continue to the reply activity, and presumably fault because
the connection is closed?
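
A minimal sketch of the second option (continue to the reply and fault there), assuming the engine persists each instance's position but cannot persist a live connection. `ReplyChannelLost` is a hypothetical, implementation-specific fault, not one defined by the BPEL specification, and `Instance`, `recover` and `reply` are illustrative names. A plain list stands in for the transport connection.

```python
from dataclasses import dataclass
from typing import List, Optional


class ReplyChannelLost(Exception):
    """Hypothetical implementation-specific fault: no connection left to reply on."""


@dataclass
class Instance:
    instance_id: str
    next_activity: str                 # persisted position in the sequence
    connection: Optional[List[str]]    # stand-in for a live connection; never persisted


def recover(persisted: dict) -> Instance:
    """Rebuild an instance from persisted state; the connection did not survive."""
    return Instance(persisted["instance_id"], persisted["next_activity"], None)


def reply(instance: Instance, message: str) -> None:
    """Send the reply on the open connection, or fault if the connection is gone."""
    if instance.connection is None:
        raise ReplyChannelLost(
            f"instance {instance.instance_id}: connection closed before reply"
        )
    instance.connection.append(message)
```

Faulting the running activity immediately on recovery would amount to making the same check inside `recover` instead of waiting until `reply` is reached.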

    What of the client program? It sees that the HTTP connection closed
while awaiting a response to the request. It might reasonably resend the
request (HTTP being what it is). If this is the expected behaviour,
might it not be appropriate for the BPEL engine offering the service our
client is using to, upon recovery, "roll back" or otherwise compensate
the completed activities in the sequence (not shown in the process
above), to the point of the <receive> activity, and restart the receive?

    I know that some of these complexities are the result of using
unreliable messaging, and you get what you pay for, right? On the other
hand, this illustrates some interesting states that a BPEL
implementation might have to deal with, which aren't discussed in the
specification. At the very least, we have some unspecified faults to
deal with -- presumably implementation specific.

    So what are other implementers doing in this case? Generating a
fault of one sort or another, or performing more heroic efforts to
recover from the crash? I'm just interested in general approaches, since
we don't want to require NDAs here! My development team is busy trying
to create some recovery mechanisms for the scenario above, based on some
sort of client/server interaction (client retries being the most likely
sort). These guys are pretty clever, so I wouldn't doubt that they could
invent something that, in many cases, actually recovers from the crash
scenario above.

    Thoughts? Is anyone else concerned about crash recovery, perhaps
with different scenarios?

-Ron


Ron,

From the point of view of both SOAP 1.1 and WSDL 1.1, the question of
whether we are dealing with a synchronous request/response or an
asynchronous one is up to the binding.

From SOAP 1.1, sec. 2:

"SOAP implementations can be optimized to exploit the unique
characteristics of particular network systems. For example, the HTTP
binding described in section 6 provides for SOAP response messages to be
delivered as HTTP responses, using the same connection as the inbound
request".

As you mention, a synchronous binding to HTTP is just a way to
"optimize" the request/response MEP, and in principle other
non-synchronous bindings are possible.

Things are similar from the point of view of WSDL 1.1. Sec. 2.4.2,
Request-response Operation, says:

"Note that a request-response operation is an abstract notion; a
particular binding must be consulted to determine how the messages are
actually sent: within a single communication (such as a HTTP
request/response), or as two independent communications (such as two
HTTP requests)".

So, if it were just a matter of looking at SOAP and WSDL, I would say
that whether BPEL should generate a fault upon resuming depends on the
particular binding used.

But BPEL seems to take a different direction when it comes to the
receive/reply pattern, in the sense that it seems to strongly imply that
a receive/reply is only acceptable with a synchronous binding, and that
receive/invoke, combined with callback interfaces, should instead be
used with asynchronous bindings. The discussion in BPEL 1.1, page 23,
second paragraph, seems to be rather clear in this respect.

I think BPEL takes this direction because of the desire to distinguish
synchronous connections from asynchronous ones at the language level,
while at the same time refraining from specifying any bindings. I guess
if WSDL allowed one to distinguish synchronous from asynchronous
operations, BPEL could have used that, but WSDL 1.1 does not, and so
BPEL uses receive/reply and receive/invoke, plus the callback
interfaces, to distinguish the two cases.

Ugo

-----Original Message-----
From: Ron Ten-Hove [mailto:Ronald.Ten-Hove@Sun.COM]
Sent: Tuesday, October 14, 2003 5:48 PM
To: Marin, Mike
Cc: bpel implementation
Subject: Re: [wsbpel-implement] Fault tolerance considerations

