Subject: Re: [wsbpel] Issue - 157 - Proposal For Vote
From: Yuzo Fujishima <fujishima@bc.jp.nec.com>
To: Alex Yiu <alex.yiu@oracle.com>
Date: Tue, 07 Jun 2005 20:54:10 +0900

Alex,

Thank you for the reply. Please see my response in-line.

Alex Yiu wrote:
> 
> Yuzo,
> 
> Summary:
> I would like to add a standardized XSLT transformation feature on top of 
> the existing <copy> construct.
> 
> See more inline comments in *GREEN*:
> 
> Yuzo Fujishima wrote:
> 
>> I think we should identify the points in dispute
>> and discuss each. (I don't dare to define sub-issues
>> at this moment, though.)
>> Below is my attempt:
>>
>> [P1] How much should we respect the existing assign/copy?
>>
>> My opinion is that we should NOT stick to assign/copy
>> if there is a much better alternative. Assign/copy has
>> serious problems that we have not solved
>> in more than two years. We might just as well throw
>> assign/copy away. (Possibly except for EPR/literal assignment.)
>>
>> A promising new way is the XSLT-based transformation activity.
>> It is drastically simpler and clearer. If we succeed in defining the 
>> transformation activity, there seems to be no reason to keep assign/copy.
>>
> *[AYIU]*: IMHO, it is very clear that the XSLT feature acts as a 
> complement, not a replacement to <assign>/<copy>. Quite a number of 
> <assign>/<copy> operations would be modeled in a clumsy and awkward way in XSLT:
> 
> Example #1:
> <assign> <copy> <from> $counter + 1 </from> <to> $counter </to>  </copy> 
> </assign>
> 
> Example #2:
> <assign> <copy>
> <from> 333 </from>
> <to> $var/p:abc/@attr </to> 
> </copy> </assign>

OK, I don't argue that the XSLT approach is simpler here.

#1
<transform variable="counter">
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="*">
            <xsl:copy>
                <xsl:value-of select=". + 1"/>
            </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>
</transform>

#2
<transform variable="var1">
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="/var1/q:abc/@attr" xmlns:q="urn:x">
            <xsl:attribute name="attr">333</xsl:attribute>
        </xsl:template>
        <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>
</transform>

A possible improvement would be to allow omitting the stylesheet element
and the "copying" template. The examples then shrink to:

#1
<transform variable="counter">
        <xsl:template match="*">
            <xsl:copy>
                <xsl:value-of select=". + 1"/>
            </xsl:copy>
        </xsl:template>
</transform>

#2
<transform variable="var1">
        <xsl:template match="/var1/q:abc/@attr" xmlns:q="urn:x">
            <xsl:attribute name="attr">333</xsl:attribute>
        </xsl:template>
</transform>
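
To make the role of the omitted "copying" template concrete, here is a rough
Python model of #2. It is not XSLT and not proposed syntax. It assumes var1
holds a root element named var1 with a q:abc child; the function name
copy_with_override and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = "urn:x"  # same namespace URI as xmlns:q in the example above


def copy_with_override(node, path=()):
    """Return a new tree equal to `node`, except /var1/q:abc/@attr becomes 333."""
    here = path + (node.tag,)
    # Default behavior: copy the element and its attributes (the "copying" template).
    new = ET.Element(node.tag, dict(node.attrib))
    new.text, new.tail = node.text, node.tail
    # The single explicit rule: override @attr on /var1/q:abc.
    if here == ("var1", f"{{{NS}}}abc") and "attr" in new.attrib:
        new.set("attr", "333")
    for child in node:
        new.append(copy_with_override(child, here))
    return new


# Hypothetical content of var1, for illustration only.
old = ET.fromstring(
    '<var1 xmlns:q="urn:x"><q:abc attr="1">x</q:abc><other attr="1"/></var1>'
)
new = copy_with_override(old)
print(ET.tostring(new, encoding="unicode"))
```

Only the one attribute differs in the new tree; everything else is copied
as-is, which is the part the shortened form leaves implicit.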

> 
> Also, typical XSLT usage patterns are too flexible to enable any static 
> type analysis, when compared with <assign>/<copy> with XPath.
> 
> IMHO, until very recently *NO effort* has been spent during the last 
> 2 yrs to refine the semantics of <assign>/<copy>. If we did not 
> get diverted in the F2F, we could have agreed on the bulk text for issue 
> 157 already.
> 
> That is exactly what I am going to do - continue to finish the 
> clarification of <copy> semantics.
> 
> 
>> [P2] Will the XSLT-based transformation activity be
>>     unacceptably slow?
>>
>> My opinion is that we should accept the risk of slow
>> execution in return for simpler and clearer semantics.
>>
>> Additional comments:
>>  * Although slow, the activity will not be much slower than
>>    invoking transformation web services.
>>  * In some cases, one transformation activity can be faster
>>    than a sequence of assign activities that achieves the
>>    same effect.
>>
> 
> *[AYIU]:* "In some cases, one transformation activity can be faster than 
> a sequence of assign activities that achieves the same effect." Any 
> particular examples? or proof?

Suppose the value of each of var1, var2, ..., var10
must be assigned to two positions in var0.

If assign/copy is used, this needs 10 * 2 = 20
copy elements, 40 XPath evaluations (20 for from, 20 for to),
and 20 node replacement/copy operations.

If XSLT is used, this needs 10 XPath evaluations.
Depending on the cost of node replacement/copying vs.
that of node generation, XSLT can be faster than assign/copy.

(I admit the above example is rather artificial.)
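
The counts above can be reproduced with a small Python sketch. The function
names are hypothetical, and the XSLT figure simply takes one XPath evaluation
per source variable, as stated above. The cost of generating the nodes of the
new var0 tree is not modeled.

```python
def assign_copy_cost(n_sources, positions_per_source):
    """Operation counts when every target position gets its own <copy>."""
    copies = n_sources * positions_per_source
    return {
        "copy_elements": copies,
        "xpath_evaluations": 2 * copies,  # one for <from>, one for <to>
        "node_replacements": copies,
    }


def xslt_cost(n_sources):
    """Operation counts, assuming one XPath evaluation per source variable."""
    return {"xpath_evaluations": n_sources}


assign_counts = assign_copy_cost(n_sources=10, positions_per_source=2)
xslt_counts = xslt_cost(n_sources=10)
print(assign_counts)
print(xslt_counts)
```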

> 
> Apart from the data copy-vs-modification concern (see more below), an XSLT 
> processor traverses every single node in the input document with most 
> typical XSLT design patterns, while the existing <assign>/<copy> would 
> touch only the nodes pointed to by the from-spec and to-spec. That is yet 
> another computation (CPU, I/O, memory) overhead inherent in XSLT. 
> (That's one of the reasons why XQuery was invented to do certain tasks 
> better than XSLT.)

I do agree that XSLT processing is generally heavier than
XPath processing. (The former includes the latter, anyway.)

My point was that the performance of XSLT is still acceptable,
although suboptimal.

> 
> 
>> [P3] XSLT-based transformation activity will replace the
>>     whole node tree of a variable (or its part) with a new
>>     node tree. Is it semantically acceptable? (As for the
>>     performance implications, we have P2 above.)
>>
>> My opinion is that it is totally acceptable. Each variable
>> (and its part) is independent from others. Therefore,
>> whether a variable contains
>>  a partially replaced old tree or
>>  a new tree created from scratch
>> should not matter.
>>
> 
> *[AYIU]:*
> Yuzo, have you had a chance to read the XSLT example that I sent out 
> earlier? The one which will fail to execute when the input data is being 
> modified on the fly.
> 
> Yuzo, I would like to clarify ...
> Are you advocating that we should encourage people to implement an XSLT 
> processor that modifies the input data on-the-fly during XSLT processing?

No.

I think the transformation proceeds as follows:

1. At first, variable var1 holds node tree nt1.
2. A stylesheet is applied to nt1 and new node tree nt2
   is created as a result.
3. nt2 is assigned to var1.
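
The three steps can be sketched in Python as follows. This is only a model of
the intended semantics, not an XSLT processor: the "stylesheet" is an ordinary
function standing in for the foo/bar "flipping" rules from the earlier example
in this thread, and the names transform, flip and variables are invented for
the sketch.

```python
import xml.etree.ElementTree as ET

FLIP = {"foo": "bar", "bar": "foo"}


def flip(node):
    """Build a new tree from `node`; `node` itself is only read, never changed."""
    new = ET.Element(FLIP.get(node.tag, node.tag), dict(node.attrib))
    new.text, new.tail = node.text, node.tail
    for child in node:
        new.append(flip(child))
    return new


def transform(variables, name, stylesheet):
    nt1 = variables[name]   # 1. the variable holds node tree nt1
    nt2 = stylesheet(nt1)   # 2. applying the stylesheet creates a new tree nt2
    variables[name] = nt2   # 3. nt2 is assigned to the variable
    return nt1, nt2


variables = {"var1": ET.fromstring("<root><foo>a</foo><bar>b</bar></root>")}
nt1, nt2 = transform(variables, "var1", flip)
print([child.tag for child in nt1])  # child tags of the untouched source tree
print([child.tag for child in nt2])  # child tags of the result tree
```

Because each source node is read exactly once and the result is built
separately, the rules never see their own output, so no re-firing question
arises in this model.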

> 
> If so, we are asking BPEL implementations to ignore the computation 
> model specified by the XSLT spec and to create one big short-circuit 
> implementation between BPEL and XSLT to enable this "optimization". I 
> don't think this short-circuit implementation should be sanctioned by 
> the BPEL spec. Replacing <copy> completely with an XSLT feature is 
> equivalent to sanctioning such an implementation direction and requiring 
> every BPEL implementation to break the modularity line between BPEL and 
> XSLT and twist its implementation against the XSLT spec to enable this 
> optimization.
> 
> Personally, I have a number of doubts about how feasible this optimization 
> is. If one vendor takes the risk of creating such an implementation, that 
> vendor would be responsible for its own implementation. We should not 
> force every vendor to implement against the natural model specified by 
> the XSLT spec.
> 
> (I would send out another email for the opinion from other W3C experts 
> on this subject.)
> 
> 
>> [P4] Should the XSLT-based transformation activity support
>>     multiple variables as input?
>>
>> My opinion is YES. Supporting multiple input variables
>> will be essential for simpler transformation description and better 
>> performance (cf. P2). Better yet, we already
>> have that as $variable notation.
>>
> 
> *[AYIU]:* This part I would agree with Yuzo. :-)

What a relief!

> It is more feature-rich to have multiple variable bindings in XSLT.
> Let's do it, if everyone loves XSLT and if XSLT is an add-on to <copy>. ;-)
> 
> 
>> [P5] Should the XSLT-based transformation activity support
>>     specifying a subtree of a variable (and its part) as
>>     input and output?
>>     For example,
>>        input =  $src.part1/some/XPath
>>        output = $dst.part2/another
>>
>> My opinion is NO. Supporting this will result in a
>> similar set of issues to those that assign/copy has.
>>
> 
> *[AYIU]:*
> I would disagree with this part.
> Because, even if we do not support specifying a subtree as the output, 
> we already need to define and support the following data copy patterns:
> element-to-element, text-to-element, element-to-text, text-to-text.
> 
> For the source-part, XSLT can produce either XML element or plain text.
> For the destination-part, we have simple type variable (text) and 
> element-based / complex-type-based variable (element).
> 
> Allowing a subtree to be specified as the output DOES NOT add any complication. 
> (Attribute and Text are treated identically in the context of copy).

I don't quite understand.
* What will happen if the XSLT produces an element and the output
  selects an attribute?
* What will happen if the XSLT produces text and the output selects
  an element?

> 
> 
>> [P6] Which should be specified for an XSLT-based transformation
>>     activity,
>>          input and output variables (parts), or
>>          only the variable to be transformed?
>>
>> My opinion is that only the variable to be transformed should
>> be specified. Any variable in scope can be an input to
>> the transformation just by referring to it with the $ notation within
>> the stylesheet. Therefore, I see no point in specifying
>> the input variables as an activity attribute (or subelement).
>>
> 
> *[AYIU]*: Clarification needed.
> Do you mean the XSLT does not have a root input document? That is not 
> the case in your previous XSLT example, where the root input document is 
> "var1". Even though it is possible to write an XSLT without relying on 
> any root input document, such stylesheets are usually counter-intuitive.

No.
I am not sure about the terminology, but it seems that
"variable to be transformed" = "root input document".

> 
> However, this is a relatively minor detail of the XSLT feature I 
> would add.
> 
> 
>> [P7] How should a variable be initialized? If a variable
>>     is totally uninitialized, we could not apply a transformation
>>     stylesheet to it.
>>
>> I am not sure about this yet.
>>
>> My tentative opinion is that an uninitialized target variable should be
>> automatically initialized to <bpel:uninitializedVariable/> as the 
>> first execution step of the XSLT-based transformation activity.
>>
> 
> *[AYIU]:* This is another reason why I worry about using XSLT to 
> displace <copy> completely. What about simple-typed variables? Do you 
> mean that we now need to write an XSLT just to do the following?
> 
> <assign> <copy> <from>1 </from> <to> $counter </to>  </copy> </assign>

The rewrite would be:
<transform variable="counter">
        <xsl:template match="/">
            <counter>1</counter>
        </xsl:template>
</transform>

I admit that I am not yet sure that we can/should totally remove
assign/copy for EPR and literal assignment cases.

Yuzo Fujishima

NEC Corporation


> 
> 
> 
> Regards,
> Alex Yiu
> 
> 
> 
>> Yuzo
>>
>> NEC Corporation
>>
>>
>> Ugo Corda wrote:
>>
>>>
>>> Hi Alex,
>>>  
>>> I was not suggesting to modify the original source tree while in the 
>>> process of executing the XSLT transform. (As you explain below, that 
>>> could cause infinite loops).
>>>  
>>> What I am suggesting is that, *instead* of executing the whole XSLT 
>>> transform, we take a short cut: we just modify the original source 
>>> tree and we say that it is the new tree created by the XSLT 
>>> transform. If XSLT tries to complain and say "show me the original 
>>> source tree and demonstrate to me that it was not modified", I would 
>>> simply say "sorry, the original source tree got destroyed and all 
>>> that is left is the new tree". In other words, how would XSLT be able 
>>> to distinguish between these two cases:
>>>  
>>> 1- the result tree is a real new tree and the original source tree 
>>> existed for a short time as a tree distinct from the result tree, but 
>>> now the source tree is gone
>>>  
>>> 2- the result tree is actually a modification of the original source 
>>> tree, and that is all that is left
>>>  
>>> If XSLT was allowed to look at the tree only after the assignment 
>>> (i.e. assignment is atomic from the point of view of XSLT in a BPEL 
>>> context), XSLT could not distinguish case 1 from case 2 (sort of the 
>>> Turing test for AI ;-).
>>>  
>>> Ugo
>>>  
>>> -----Original Message-----
>>> *From:* Alex Yiu [mailto:alex.yiu@oracle.com]
>>> *Sent:* Monday, June 06, 2005 4:20 PM
>>> *To:* Ugo Corda
>>> *Cc:* wsbpeltc; Alex Yiu
>>> *Subject:* Re: [wsbpel] Issue - 157 - Proposal For Vote
>>>
>>>
>>>
>>>     Hi, all,
>>>
>>>     The quotation Ugo made comes from Section 5 "Template Rules",
>>>     Section 5.1 "Processing Model":
>>>     -----------------------------------
>>>     A list of source nodes is processed to create a result tree
>>>     fragment. The result tree is constructed by processing a list
>>>     containing just the root node. A list of source nodes is processed
>>>     by appending the result tree structure created by processing each of
>>>     the members of the list in order ...
>>>     Implementations are free to process the source document in any way
>>>     that produces the same result as if it were processed using this
>>>     processing model.
>>>     -----------------------------------
>>>
>>>     The *context* is how an XSLT processor processes the source document
>>>     and matches and fires the template rules. The optimizations allowed may
>>>     include:
>>>
>>>         * indexing of the source document
>>>         * using different data models: DOM, SAX, XPath, ...
>>>         * parallelism of template rule application
>>>
>>>
>>>     Modifying the source document does NOT produce the same result. That
>>>     is a big semantic change, NOT just an optimization.
>>>
>>>     If we allow modifying a source document, it will introduce "/von
>>>     Neumann/" style computation back to XSLT, which is known to have
>>>     problems with non-procedural languages (e.g. XSLT and XQuery). The
>>>     modification creates a bunch of problems that the current XSLT / XQuery
>>>     design does not cater for. E.g. whether to re-fire some template rules
>>>     after the source document is modified.
>>>
>>>     A detailed example:
>>>     -------------------------
>>>         <xsl:template match="foo">
>>>             <xsl:element name="bar">
>>>                 ...
>>>             </xsl:element>
>>>         </xsl:template>
>>>         <xsl:template match="bar">
>>>             <xsl:element name="foo">
>>>                 ...
>>>             </xsl:element>
>>>         </xsl:template>
>>>     -------------------------
>>>
>>>     We have a template rule that transforms the "foo" element into the
>>>     "bar" element. And we have another rule which transforms the "bar"
>>>     element into the "foo" element.
>>>
>>>     If the source document is NOT modified, its semantics is very clear.
>>>     It is a "flipping"  XSLT : all "foo" elements are flipped  to "bar",
>>>     while all "bar" elements are flipped to "foo".
>>>
>>>     However, if the source document is modified, will we run into an
>>>     infinite loop?
>>>
>>>     Also, allowing modification of the source document essentially destroys
>>>     the parallelism of template rule application.
>>>
>>>
>>>
>>>     Regards,
>>>     Alex Yiu
>>>
>>>
>>>
>>>
>>>
>>>
>>>     Ugo Corda wrote:
>>>
>>>>     Hi Alex,
>>>>     
>>>>
>>>>> not having the capabilities of a smaller granularity of
>>>>> replacement has a BIG impact on efficiency of <assign>.
>>>>>
>>>>> For example: in order to replace a small zip code field of a
>>>>> large PO document (e.g. 100 line items), we would effectively
>>>>> copy all those 100 line items.
>>>>>
>>>>> That is NOT an implementation-dependent issue. The XSLT spec
>>>>> clearly shows its intention (see the quotations above).
>>>>
>>>>     I am not convinced that XSLT actually imposes those limitations
>>>>     on an implementation optimization. For instance, sec. 5.1 of XSLT
>>>>     1.0, Processing Model, states: "Implementations are free to
>>>>     process the source document in any way that produces the same
>>>>     result as if it were processed using this processing model".
>>>>
>>>>     So, suppose that the large PO document is in my target variable,
>>>>     and I want to replace just a zip code field. Evidently I don't
>>>>     care about preserving the original PO document as a
>>>>     separate independent entity, since all I care about is that, after the
>>>>     assign, the variable contains the modified PO. If the original PO
>>>>     is in DOM form, what would prevent my implementation from just
>>>>     replacing the zip code field and then pretending that the modified
>>>>     DOM is actually the DOM representing the whole new PO document
>>>>     resulting from the XSLT transform, *as if* the modified
>>>>     PO actually got generated applying the copy/creation semantics as
>>>>     described in the XSLT process model?
>>>>
>>>>     Ugo
>>>
>>>
>>>
>>
>>
> 
>