wsbpel message

Subject: Re: [wsbpel] Issue - 157 - Proposal For Vote

From: Alex Yiu <alex.yiu@oracle.com>
To: Yuzo Fujishima <fujishima@bc.jp.nec.com>
Date: Tue, 07 Jun 2005 02:05:22 -0700

Yuzo,

Summary:
I would like to see an XSLT transformation feature standardized on top of the existing <copy> construct.

See more inline comments in GREEN (marked [AYIU]):

Yuzo Fujishima wrote:

I think we should identify the points in dispute
and discuss each. (I don't dare to define sub issues
at this moment, though.)
Below is my trial:

[P1] How much should we respect the existing assign/copy?

My opinion is that we should NOT stick to assign/copy
if there is a much better alternative. Assign/copy has
serious problems that we haven't yet solved
for more than two years. We might just as well throw
assign/copy away. (Possibly except for EPR/literal assignment.)

A promising new way is the XSLT-based transformation activity.
It is drastically simpler and clearer. If we succeed in defining the transformation activity, there seems to be no reason to keep assign/copy.

[AYIU]: IMHO, it is very clear that the XSLT feature acts as a complement to, not a replacement for, <assign>/<copy>. Quite a number of <assign>/<copy> usages would be modeled in a clumsy and awkward way in XSLT:

Example #1:
<assign> <copy> <from> $counter + 1 </from> <to> $counter </to> </copy> </assign>

Example #2:
<assign> <copy>
<from> 333 </from>
<to> $var/p:abc/@attr </to>
</copy> </assign>

Also, the typical XSLT usage pattern is too flexible to allow any static type analysis, compared with <assign>/<copy> with XPath.

IMHO, until very recently NO effort was spent on refining the semantics of <assign>/<copy> during the last 2 years. If we had not been diverted at the F2F, we could already have agreed on the bulk of the text for issue 157.

That is exactly what I am going to do: finish clarifying the <copy> semantics.

[P2] Will the XSLT-based transformation activity be
    unacceptably slow?

My opinion is that we should accept the risk of slow
execution for the return of simpler and clearer semantics.

Additional comments:
* Although slow, the activity will not be much slower than
   invoking transformation web services.
* In some cases, one transformation activity can be faster
   than a sequence of assign activities that achieves the same effect.

[AYIU]: "In some cases, one transformation activity can be faster than a sequence of assign activities that achieves the same effect." Any particular examples? or proof?

Apart from the data copy-vs-modification concern (see more below), an XSLT processor traverses every single node in the input document under most typical XSLT design patterns, while the existing <assign>/<copy> touches only the nodes pointed to by the from-spec and to-spec. That is yet another computation overhead (CPU, I/O, memory) naturally inherent in XSLT. (That is one of the reasons why XQuery was invented to do certain tasks better than XSLT.)
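
A minimal sketch of the cost argument above, written in Python rather than BPEL or XSLT. The purchase-order shape, the element names (po, shipTo, zip, item) and both helper functions are hypothetical and exist only for illustration. It counts the nodes a copy-everything transformation constructs against the nodes a targeted update writes when only a zip code changes; the cost of locating the target node is ignored.

```python
import xml.etree.ElementTree as ET


def build_po(n_items=100):
    """Build a hypothetical purchase order with n_items line items."""
    po = ET.Element("po")
    ship_to = ET.SubElement(po, "shipTo")
    ET.SubElement(ship_to, "zip").text = "94065"
    for i in range(n_items):
        ET.SubElement(po, "item", {"id": str(i)})
    return po


def transform_copy_all(src, new_zip):
    """Identity-style transform: rebuild every node, overriding only <zip>.

    Returns the new tree and the number of nodes constructed.
    """
    count = 0

    def copy(node):
        nonlocal count
        count += 1
        out = ET.Element(node.tag, dict(node.attrib))
        out.text = new_zip if node.tag == "zip" else node.text
        out.tail = node.tail
        for child in node:
            out.append(copy(child))
        return out

    return copy(src), count


def update_in_place(doc, new_zip):
    """Copy-style update: write only the node selected by the to-spec."""
    doc.find("shipTo/zip").text = new_zip
    return 1  # number of nodes written


po = build_po()
new_tree, constructed = transform_copy_all(po, "10001")
touched = update_in_place(po, "10001")
print(constructed, touched)
```

With 100 line items the first approach constructs one node per element in the document, while the second writes a single node. A real XSLT processor or BPEL engine may behave differently; the sketch only shows the shape of the difference.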

[P3] XSLT-based transformation activity will replace the
whole node tree of a variable (or its part) with a new node tree. Is it semantically acceptable? (As for the performance implications, we have P2 above.)

My opinion is that it is totally acceptable. Each variable
(and its part) is independent from others. Therefore,
whether a variable contains
a partially replaced old tree or
a new tree created from scratch
should not matter.

[AYIU]:
Yuzo, have you had a chance to read the XSLT example that I sent out earlier? The one which will fail to execute when the input data is modified on the fly.

Yuzo, I would like to clarify ...
Are you advocating that we should encourage people to implement an XSLT processor that modifies the input data on the fly during XSLT processing?

If so, we are asking BPEL implementations to ignore the computation model specified by the XSLT spec and to create one big short-circuit implementation between BPEL and XSLT to enable this "optimization". I don't think this short-circuit implementation should be sanctioned by the BPEL spec. Replacing <copy> completely with an XSLT feature is equivalent to sanctioning such an implementation direction, and to requiring every BPEL implementation to break the modularity line between BPEL and XSLT and twist its implementation against the XSLT spec to enable this optimization.

Personally, I have a number of doubts about how feasible this optimization is. If one vendor takes the risk of creating such an implementation, that vendor would be responsible for its own implementation. We should not force every vendor to implement against the natural model specified by the XSLT spec.

(I will send out another email asking for opinions from other W3C experts on this subject.)

[P4] Should the XSLT-based transformation activity support
multiple variables as input?

My opinion is YES. Supporting multiple input variables
will be essential for simpler transformation description and better performance (cf. P2). Better yet, we already
have that as $variable notation.

[AYIU]: On this part I agree with Yuzo. :-)
It is more feature-rich to have multiple variable bindings in XSLT.
Let's do it, if everyone loves XSLT and if XSLT is an add-on to <copy>. ;-)

[P5] Should the XSLT-based transformation activity support
    specifying a subtree of a variable (and its part) as
    input and output?
    For example,
       input = $src.part1/some/XPath
       output = $dst.part2/another

My opinion is NO. Supporting this will result in a
set of issues similar to those that assign/copy has.

[AYIU]:
I would disagree with this part.
Even if we do not support specifying a subtree as the output, we already need to define and support the following data copy patterns:
element-to-element, text-to-element, element-to-text, text-to-text.

For the source-part, XSLT can produce either XML element or plain text.
For the destination-part, we have simple type variable (text) and element-based / complex-type-based variable (element).

Allowing a subtree to be specified as the output DOES NOT add any complication. (Attribute and Text are treated identically in the context of copy.)

[P6] Which should be specified for an XSLT-based transformation
    activity,
         input and output variables (parts), or
         only the variable to be transformed?

My opinion is that only the variable to be transformed should
be specified. Any variables in-scope can be the input to
the transformation just by referring to them by $ notation within
the stylesheet. Therefore, I see no point in specifying
the input variables as the activity attribute (or subelement).

[AYIU]: Clarification needed.
Do you mean the XSLT does not have a root input document? That is not the case in your previous XSLT example, where the root input document is "var1". Even though it is possible to write an XSLT without relying on any root input document, such stylesheets are usually counter-intuitive.

However, this is a relatively minor detail of the XSLT feature I would add.

[P7] How should a variable be initialized? If a variable
is totally uninitialized, we could not apply a transformation
stylesheet to it.

I am not sure about this yet.

My tentative opinion is that an uninitialized target variable should be
automatically initialized to <bpel:uninitializedVariable/> as the first execution step of the XSLT-based transformation activity.

[AYIU]: This is another reason why I worry about using XSLT to displace <copy> completely. What about simple-typed variables? Do you mean that we now need to write an XSLT just to do the following?

<assign> <copy> <from>1 </from> <to> $counter </to> </copy> </assign>

Regards,
Alex Yiu

Yuzo

NEC Corporation

Ugo Corda wrote:

Hi Alex,

I was not suggesting to modify the original source tree while in the process of executing the XSLT transform. (As you explain below, that could cause infinite loops).

What I am suggesting is that, *instead* of executing the whole XSLT transform, we take a short cut: we just modify the original source tree and we say that it is the new tree created by the XSLT transform. If XSLT tries to complain and say "show me the original source tree and demonstrate to me that it was not modified", I would simply say "sorry, the original source tree got destroyed and all that is left is the new tree". In other words, how would XSLT be able to distinguish between these two cases:

1- the result tree is a real new tree and the original source tree existed for a short time as a tree distinct from the result tree, but now the source tree is gone

2- the result tree is actually a modification of the original source tree, and that is all that is left

If XSLT was allowed to look at the tree only after the assignment (i.e. assignment is atomic from the point of view of XSLT in a BPEL context), XSLT could not distinguish case 1 from case 2 (sort of the Turing test for AI ;-).
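
The two cases can be sketched as follows. This is an illustrative Python model, not BPEL or XSLT: the `variables` dictionary, the PO document and all function names are hypothetical, and the observer is assumed to see the variable only after the assignment has completed.

```python
import copy
import xml.etree.ElementTree as ET

PO_XML = "<po><shipTo><zip>94065</zip></shipTo><item id='1'/><item id='2'/></po>"


def assign_new_tree(variables, name, new_zip):
    """Case 1: build a distinct result tree, then drop the source tree."""
    result = copy.deepcopy(variables[name])
    result.find("shipTo/zip").text = new_zip
    variables[name] = result  # the old tree is no longer referenced here


def assign_in_place(variables, name, new_zip):
    """Case 2: modify the source tree and treat it as the result tree."""
    variables[name].find("shipTo/zip").text = new_zip


def observe(variables, name):
    """What an observer can see once the assignment has completed."""
    return ET.tostring(variables[name])


vars_case1 = {"po": ET.fromstring(PO_XML)}
vars_case2 = {"po": ET.fromstring(PO_XML)}
assign_new_tree(vars_case1, "po", "10001")
assign_in_place(vars_case2, "po", "10001")
print(observe(vars_case1, "po") == observe(vars_case2, "po"))
```

The comparison holds only under the stated assumption that nothing inspects the tree while the assignment is in progress and nothing else keeps a reference to the old tree.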

Ugo

-----Original Message-----
*From:* Alex Yiu [mailto:alex.yiu@oracle.com]
*Sent:* Monday, June 06, 2005 4:20 PM
*To:* Ugo Corda
*Cc:* wsbpeltc; Alex Yiu
*Subject:* Re: [wsbpel] Issue - 157 - Proposal For Vote

    Hi, all,

    The quotation Ugo made comes from Section 5, "Template Rules", and
    Section 5.1, "Processing Model":
    -----------------------------------
    A list of source nodes is processed to create a result tree
    fragment. The result tree is constructed by processing a list
    containing just the root node. A list of source nodes is processed
    by appending the result tree structure created by processing each of
    the members of the list in order ...
    Implementations are free to process the source document in any way
    that produces the same result as if it were processed using this
    processing model.
    -----------------------------------

    The *context* is how an XSLT processor processes the source document
    and matches and fires the template rules. The optimizations allowed
    may include:

        * indexing of the source document
        * using different data models: DOM, SAX, XPath, ...
        * parallelism of template rule application

    Modifying the source document does NOT produce the same result. That
    is a big semantic change, NOT just an optimization.

    If we allow modifying a source document, it will bring "von Neumann"
    style computation, which is known to cause problems in non-procedural
    languages (e.g. XSLT and XQuery), back into XSLT. The modification
    creates a bunch of problems that the current XSLT / XQuery design
    does not cater for, e.g. whether to re-fire some template rules
    after the source document is modified.

    A detailed example:
    -------------------------
        <xsl:template match="foo">
            <xsl:element name="bar">
                ...
            </xsl:element>
        </xsl:template>
        <xsl:template match="bar">
            <xsl:element name="foo">
                ...
            </xsl:element>
        </xsl:template>
    -------------------------

    We have a template rule that transforms the "foo" element into the
    "bar" element. And we have another rule which transforms the "bar"
    element into the "foo" element.

    If the source document is NOT modified, its semantics is very clear.
    It is a "flipping" XSLT : all "foo" elements are flipped to "bar",
    while all "bar" elements are flipped to "foo".

    However, if the source document is modified, will we run into an
    infinite loop?

    Also, allowing modification of the source document essentially destroys
    the parallelism of template rule application.
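
A minimal Python sketch of the "flipping" stylesheet under both readings. It is not an XSLT processor: the `RULES` table stands in for the two template rules, and `flip_in_place_refiring` models one hypothetical engine that renames nodes in the source itself and re-fires rules on whatever matches afterwards. Whether a real engine would behave this way is exactly the open question.

```python
import xml.etree.ElementTree as ET

RULES = {"foo": "bar", "bar": "foo"}  # stand-in for the two template rules


def flip_to_new_tree(src):
    """Build a result tree from an unmodified source; each node matches once."""
    out = ET.Element(RULES.get(src.tag, src.tag), dict(src.attrib))
    out.text, out.tail = src.text, src.tail
    for child in src:
        out.append(flip_to_new_tree(child))
    return out


def flip_in_place_refiring(doc, max_steps=10):
    """Rename nodes in the source itself, then re-fire rules on the result.

    Returns the number of passes needed to reach a state where no rule
    matches, or None if that state is not reached within max_steps.
    """
    for step in range(max_steps):
        matched = [n for n in doc.iter() if n.tag in RULES]
        if not matched:
            return step
        for node in matched:
            node.tag = RULES[node.tag]
    return None


src = ET.fromstring("<root><foo/><bar/></root>")
result = flip_to_new_tree(src)
print([child.tag for child in result])
print(flip_in_place_refiring(src))
```

With an unmodified source, every element is matched exactly once and the flip terminates. In the re-firing model, a renamed "foo" immediately matches the "bar" rule again, so the loop stops only because of the `max_steps` cap.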

    Regards,
    Alex Yiu

    Ugo Corda wrote:

    Hi Alex,

    > not having the capabilities of a smaller granularity of
    > replacement has a BIG impact on efficiency of <assign>.
    > For example: in order to replace a small zip code field of a
    > large PO document (e.g. 100 line items), we would effectively
    > copy all those 100 line items.

    > That is NOT an implementation-dependent issue. The XSLT spec
    > clearly shows its intention (see the quotations above).

    I am not convinced that XSLT actually imposes those limitations
    on an implementation optimization. For instance, sec. 5.1 of XSLT
    1.0, Processing Model, states: "Implementations are free to
    process the source document in any way that produces the same
    result as if it were processed using this processing model".

    So, suppose that the large PO document is in my target variable,
    and I want to replace just a zip code field. Evidently I don't
    care about preserving the original PO document as a
    separate independent entity, since all I care about is that, after
    the assign, the variable contains the modified PO. If the original PO
    is in DOM form, what would prevent my implementation from just
    replacing the zip code field and then pretending that the modified
    DOM is actually the DOM representing the whole new PO document
    resulting from the XSLT transform, *as if* the modified
    PO actually got generated applying the copy/creation semantics as
    described in the XSLT process model?

    Ugo


Follow-Ups:
- Re: [wsbpel] Issue - 157 - Proposal For Vote
  - From: Yuzo Fujishima <fujishima@bc.jp.nec.com>

References:
- RE: [wsbpel] Issue - 157 - Proposal For Vote
  - From: "Ugo Corda" <UCorda@SeeBeyond.com>
- Re: [wsbpel] Issue - 157 - Proposal For Vote
  - From: Yuzo Fujishima <fujishima@bc.jp.nec.com>