Subject: [ebxml-iic] More Comments on TestCase material...


Title: IIC Conf Call Monday, August 12th at 10:00AM PT (1:00PM ET)
 
Follow-up on Monica's and Mike's comments.
 
The treatment of errors occurring during test cases seems to be the most important issue
to decide... (see the end)
 
Jacques
 
In order to complete Test Case material design:
----------------------------------------------

1. [Matt]: specify the CPA subset used. Which format should we pick?
So far we have two candidates: tpaSample.xml and minicpa.xml from Hatem.
(quick comments on tpaSample.xml:
- SyncReplyMode options are missing
- what is the distinction between Ack "expected" and "requested"?)

2. [Jeff]: we need to finalize the message template data, in particular
- the way we parameterize these templates (XPath? see the sketch after this item)
- the way we build complete MIME envelopes and their content (either
using a template approach again - restrictive but simple - or some other
document-building mechanism.)
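
(For illustration only: a template could be parameterized with XPath-style assignment
expressions of the kind discussed under item 5 below. The template file name here is
made up; the parameter names are the ones already floating in the draft:)

  Template: MsgEnvelope_basic.xml
  /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:MessageId = '$MessageId';
  /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:ConversationId = '$ConversationId';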

3. [Mike, Monica?] mapping of Test Cases to Test Assertions.
Can we really assume that there is always 1 test case for each test assertion?
I am sure that is the case for 98% of them, but it would be prudent not to preclude
the possibility of more than 1 test case per assertion. A test case is always
more concrete than an assertion; could there be situations where it makes sense to
have two or more tests for the same assertion that we would not split?
My question is in fact: do we really have to decide this now, or can we adopt a
Test Case ID scheme that allows for this if we need it later?
It could be the same as the current assertion ID (e.g. urn:semreq:id:3), and in case we have 1-to-n,
we could use additional letters, e.g. urn:semreq:id:3a, urn:semreq:id:3b, ...,
or dot numbering: urn:semreq:id:3.1, urn:semreq:id:3.2...
Would that be an issue?

[MIKE] - I think that we can incorporate multiple test cases for one
requirement/assertion.  We could allow multiple test cases with the same ID.
The test harness could be designed to traverse
the test requirement tree and execute ALL test cases having the
same 'base' ID as the requirement.
I favor 'urn:semreq:id:3:1, urn:semreq:id:3:2 ... 3:3' etc.

[mm1: In all honesty, I think over time we could have M-M (test-case to assertion), so I suggest we
provide for extensibility.  This is a discussion item similar to my previous one regarding
aggregation of test cases for a type of scenario or lifecycle of functionality testing.
And, too, we need to allow for granularity of the test assertions and cases.]

<Jacques2> 
agree with all above, assuming that sharing the same ID (Mike) means sharing the same "base" ID...
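A rough illustration of the 1-to-n numbering (the <TestRequirement>/<TestCase> wrapper
elements are invented here just for illustration; only the ID scheme follows the discussion above):

  <TestRequirement id="urn:semreq:id:3">
    <TestCase id="urn:semreq:id:3.1"> ... </TestCase>
    <TestCase id="urn:semreq:id:3.2"> ... </TestCase>
  </TestRequirement>

The harness would then traverse the requirement tree and execute every test case sharing
the base ID urn:semreq:id:3, aggregating their results for that requirement.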
</Jacques2> 

4. Test Case table formatting [Mike,...]:
- Test Case ID field: see above remarks on numbering. (by the way, why "semreq"?)

[MIKE] - This URN was used because it matched the name of the conformance testing
'semantic requirements'. We could shorten it to "req" as that may be a little clearer to
someone looking at the abstract test list.

<Jacques2> as these are test cases, not reqs anymore, can we use a more appropriate naming?
(the ID # can still be based on the same "base" IDs as the test reqs, of course)
</Jacques2> 
- "Action Element" field: we could use more intuitive "step names",
e.g. for the sending of a message: "SendMessage" instead of "SetMessage".

[MIKE] - The current reasoning for using "Set" and "Get" is that we are constructing/modifying
templates and payloads.  "Send" and "Receive" would not seem to be appropriate verbs to use for
"setting" MIME attributes, message content and payload attributes.

<Jacques2> My concern is that "SetMessage" is not intuitive at all: it is not
clear what is being done with the message after it is built... (it is not obvious
that it is sent; in future operations it could be just stored, etc.)
while GetMessage is clearer.
I note that the assembling of message material is specified quite separately anyway
from the action (the "XPath" field and the "template" field). So there is no confusion
with the nature of the action itself, which is different from assembling a message.
(How about "PutMessage"? a reference to put/get)
</Jacques2> 


- Also I strongly suggest that we make the "verification" of the test a separate
and final step (it could be called "Verification").

[mm1: YES! Separate the test from the verification - this is a key concept for a test
framework; just because there is a test, this does not imply verification.]

[MIKE] - The thing to remember with these "abstract" tests is that they are just that...
and do not represent how we will ultimately code and run the tests.
I agree that the actual verification test should be
"broken out" from other portions of the test case (such as precondition verification
and message ID correlation).
Whether or not it is a separate "test step" is an implementation detail.
(For example, if the test driver were implemented using Schematron, then all
filtering of a received message could be performed using a single test step...
since Schematron provides for individual reporting of each filter result.)
In the abstract test description, the verification test can/should be separated from
the other portions. I have done that in the attached modified abstract test suite.

<Jacques2> right that the implementation may aggregate some steps - provided that
the end behavior is what is expected (and also, in case it is important to distinguish
whether step A or step B failed, an aggregated implementation should still be
able to distinguish these failures!)
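For the abstract description, a sketch of a test case with verification broken out as a
final step could look like this (the <TestCase> and <Verification> wrappers are invented
here for illustration; only SetMessage and GetMessage are from the current draft):

  <TestCase id="urn:semreq:id:3.1">
    <SetMessage>   ... build and send the message material ...              </SetMessage>
    <GetMessage>   ... filters to select/correlate the received message ... </GetMessage>
    <Verification> ... boolean expression that defines success ...          </Verification>
  </TestCase>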
</Jacques2> 

- "Party" field: probably not needed, as it is always the TestDriver, as per our
definition of what a "step" is: an event that is always observable in the TestDriver.


[mm1: What if in the future the Party is actually observable in the Test Service?]

[MIKE] - Based upon our discussion at the F2F, I removed this field.  If, however,
the implementation of the test harness will allow observation/initiation
of other nodes besides the testing party, then I agree with Monica that we should include a Party field.

<Jacques2> then let us keep it. </Jacques2> 

- "ErrorStatus" field needs revision. See "Test failures" below.
- ErrorMessage: one per step is fine.
- "XPath" field: let us use a better name... it should be more general,
like "message expression" or something like that.


[MIKE] - I've addressed "ErrorStatus" below, and changed "XPath Expression"
to "Test Message Expression" for now.

5. XPath / message expressions [Mike, Matt]:
- some XPath expressions are for building message material ("SetMessage" action),
some are for expressing "filters" to select the right message (GetMessage).
It would be good to distinguish them in syntax, e.g. the assignment operator "="
should be distinguished from the equality operator, as in programming languages (e.g. "==").


[MIKE] - I agree that this makes things clearer (assignment vs. comparison).
Then again, the use of the SetMessage vs. GetMessage operations can also
delineate between an assignment expression and a comparison expression.

<Jacques2> yes, but I still feel uncomfortable tying the message material expression
too closely to the "transfer" operation we perform with the
message, like sending or receiving. It could very well be that in the future
a new operation besides sending/receiving requires both types of expression
(assignment and comparison). Or, that we also enhance "sending" with a pre-condition
in the same step, that does a comparison... so it is safer and
less error prone to distinguish them.
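For illustration, the same kind of XPath statement would then read as an assignment in a
SetMessage expression and as a comparison filter in a GetMessage expression (the element
paths below are only examples; the parameters are those already used in the draft):

  SetMessage (assignment):
  /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:MessageId = '$MessageId';

  GetMessage (comparison / filter):
  /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:RefToMessageId == '$MessageId';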
</Jacques2> 

- GetMessage steps should not be aggregated with the final Verification
condition: GetMessage only contains filters to select the right message.

[MIKE] - Agreed.  I believe that this is an implementation detail.  If our test harness can
differentiate between the actual test, test correlations and test preconditions, then it may not
be necessary to break the test into separate test "steps".  It remains to be seen how this will be
implemented.

- For the final step (or Verification): it will contain the boolean expression
that defines success (currently merged with the "filter" expression of the GetMessage step
in the current draft.)

[MIKE] - I have separated the final verification from the rest of the test case
in the latest version of the abstract test suite.
How it is actually implemented remains to be determined.

[mm1: See the question above about where we could have 1-M (test assertion to test case).
Is it not the case that the verification is a complement to, but not part of, the test?
We do have this condition: 1 test assertion
that results in (1) a test of the case, and (2) verification of the case.
Either way the verification should be separate.]

[MIKE] - I am not sure that I completely understand Monica's question here.
However, I believe that we can have a "higher level" evaluation of a test requirement
(assertion), which can consist of an aggregation of test cases that must all be verified
in order to "pass/fail" the requirement.  Simply allow a 1-M relationship between a
requirement and test cases, and aggregate their results.

<Jacques2> sounds good: it seems that we may have to distinguish two levels of verification:
(1) verification of each individual test case, where we don't care about 1-M:
each test either passes, fails for conformance, or fails to apply.
(2) a final validation phase, which takes as input (a) the test req doc, (b) the outcome of
all test cases, and makes a synthesis of both. The result is expressed in terms
of whether each test req is verified and how well.
I think ultimately we cannot
avoid this final verification (though I would not worry about it now).
Remember that even if there is 1-1, the test case may not claim to cover
the test req fully (see "run-time coverage", ebXML Framework spec draft, section 7.4,
where it is OK for a test case to express a coverage of full/partial/contingent.)
</Jacques2>
 
- Use of parameters ($MessageId, etc.): it seems that these parameters sometimes need
to be set to current (e.g. received) material. It is not clear how that is done (see Case id:3).

[MIKE] - It is true that it is not evident how this will be done.
The abstract tests only define in a general sense how the tests will be done,
not the implementation specifics.  However, I believe that all of the
parameters defined in the XPath statements can be "set" based on the examination
of any incoming message that "passes" the XPath correlation filters.
In such a case, MIMEContentType, MIMEStart... and other parameters would be "set"
based on the 'current' message that satisfies a message correlation filter.

<Jacques2>
that's what I guessed... in order to be able to do that cleanly, we could do this:
(1) the list of parameters you give initially (top of test suite) would also
be associated with XPath expressions that specify exactly how they will be set
by incoming messages. (So it is a kind of default list of assignment statements,
which would allow us not to have to specify the full expression each time.)
(Note: even for MIME data parameters we may use some XPath expression,
assuming some "mapping" of MIME into a full-XML schema?)
(2) initial test case values that are specified in the CPA (or subset) associated
with this test case could also be named as a different set of parameters,
e.g. $CPA_TimeToLive, $CPA_id, etc. These are not supposed to change across the steps of a test case.
(3) some other initial values such as MessageId are more constant throughout the test case
(different from (1)), while also very specific to the test case (not in the CPA as in (2));
these could simply be specified directly in the message expression.
E.g. urn:semreq:id:2: the expression associated with the SetMessage operation
/SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:MessageId='$MessageId'
would say directly:
/SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:MessageId='123456'
(unless we also plan for a set of initial constant identifiers here?)
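To illustrate (1) and (2), the top of the test suite could then carry something like the
following - this layout is just a sketch, not taken from the current draft:

  Default parameter assignments (re-evaluated on each message that passes the correlation filters):
  $MessageId      = /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:MessageId;
  $ConversationId = /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:ConversationId;

  CPA-derived parameters (constant across the steps of a test case):
  $CPA_id, $CPA_TimeToLive, ...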
</Jacques2>


We face two issues:
(a) how to "remember" message material from past test steps?
We could use XPath-based assignment, e.g. a GetMessage could contain filter
expressions as well as assignment expressions: e.g. $MessageId = <xpath expr>

[MIKE] - We need to decide what we need to "remember".  Do we really need anything
more than RefToMessageId, ConversationId and CPAId for correlating messages?  I think that
we need some examples here.

(b) Across several steps, as several messages are involved and we may want to
refer to material from more than 1 step, we can use the step # to identify the parameter:
$1MessageId, $2MessageId...

[MIKE] - I see what you are saying.  That is certainly an option, if it is necessary.

<Jacques2>
it may not be necessary for now... we'll revisit this if needed.
We can also "remember" past step material by using explicit assignments to
user-defined variables: $v1 = $MessageId
(we should not need that often, but we need to agree on how we would do it if needed)
</Jacques2>

- advanced verification conditions: sometimes verification conditions need more
than just constraints on message material: e.g. check that step N completed
within 10 sec of step M. It seems we need to set a timeout for step completion anyway.
What else? How could we improve the script language for this? When it comes to checking
that we got, say, 3 messages of a kind, e.g. for retries in reliability testing,
could that be an enhancement of the GetMessage step (where we would specify how
many messages of this kind need to be received for the step to complete)?

[MIKE] - I believe that we can script these tests by expanding our <SetMessage> and
<GetMessage> elements to include additional attributes that drive our Test Driver. Including
additional attributes to define our test cases should allow us to provide both additional filters
outside of the XPath expressions and parameters to them.

[mm1: Regardless of which path is chosen, keep in mind extensibility and discrete
definition.  If we continue to embed important data elements in the expressions,
our ability to identify them (should they need to be searched
for and found) may suffer.  I keep thinking about the database
days when we concatenated data and then
expected to search for a text string, augh.]

<Jacques2>
I think it does not have to be deeply embedded...
A few predefined expression attributes may do.
For example, a very complete set of message expressions associated with GetMessage could be
the following four statements:
<XPath 1> = $XYZ1;
<XPath 2> = $XYZ2;
Test:SelectedNumber = 3; (meaning we expect 3 messages satisfying the above XPath filters,
				1 being the default)
Test:MaxTime = $CPA_xyz; (meaning we expect this step to complete within time $CPA_xyz after
				the end of the previous step, or else it fails)
I think the idea is that, since we are somehow defining a test case "script" language,
we must be able to describe all test case content (and conditions) with it...
without pretending that it is the best "executable" format,
it is at least better than an English comment.
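Put together with the <GetMessage> element from the draft, such a step could then look like
the sketch below (the XPath filters are only examples, and the exact placement of the
Test: statements is still open):

  <GetMessage>
    /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:ConversationId == '$ConversationId';
    /SOAP:Envelope/SOAP:Header/eb:MessageHeader/eb:MessageData/eb:RefToMessageId == '$MessageId';
    Test:SelectedNumber = 3;
    Test:MaxTime = $CPA_xyz;
  </GetMessage>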
 </Jacques2>


[mm1: On advanced verification conditions, I believe we are seeing that we not only
have conditions on the assertion but pre-
and post-conditions on the test case itself, as well as on the verification phase
(not really metadata though).
Can we attach those conditions to the test case itself, so that the conditions
are "included" with the test case?]

<Jacques2>
not sure what these would be.
Something we may need, which I hinted at in the next section, is to define a test case
condition for "pass",
but also have "fail" conditions - if needed - that would help distinguish
whether it is a conformance failure or a test "pre-condition" failure.
(or maybe "pre-condition" failures always show up in intermediate steps?)
</Jacques2>


6. Test Verification and Test Failures [Mike, Matt]:
- A separate step for this would be good, as mentioned above.

[MIKE] - I agree; if the test harness implementation requires a separate test step
for verification, then we should do this.

- sometimes a successful test needs to verify that no error message was received,
in addition to completing all its steps. How do we do that? Should we define
"exception step(s)" in a test case, that will capture messages that should NOT occur...
and then, when triggered, generate a test failure?


[mm1: We have discussed exceptions at length in BCP, and exceptions may not
always be errors - they may be less
traveled paths.  So, to err on the side of future function, I would suggest
you allow for both exception steps and outcomes (test failure or test incomplete? -
do we see these as different? - links with Jacques' question below).]

[MIKE] - If the specification is specific about what errors SHOULD NOT OCCUR,
then we should feel confident in writing tests to verify this.
If the spec specifically says that NO ERROR MESSAGES SHOULD BE GENERATED,
then I would follow its literal interpretation and design a test accordingly.
Exceptions can be handled as "informational", and flagged accordingly...
but they should not fail a test if the specification does not
indicate that they are illegal, and all other requirements for that test have been met.

<Jacques2>
so do I interpret both your comments (Mike, Monica) as:
- in case NO ERROR MESSAGES SHOULD BE GENERATED, we could define an
"exception" step. (It would be labelled "exception" instead of a step #, and could be inserted
where we would expect the exception to occur, e.g. between step N and step N+1.)
- associated with this exception, we specify what to do:
it could be a "continue" (there will just be a notification),
or an "error" (of either kind).
- BTW, shouldn't we always "try to catch" error messages even when the spec is
not explicit that these should NOT occur? E.g. for most test cases, it is implicit that
if they go well, no error should be generated. But even if the execution shows the
expected result, if the MSH generates an error message in addition, that is NOT OK...
At the very least, we should report that as a "defect" that formally may not fail conformance,
but clearly would prevent normal business usage. So exceptions could be used just for
notifications as well.
(In fact, I still believe this is the kind of thing a "conformance clause" is useful for...
specifying that a condition for conformance is that errors should *only* be generated
in situations where they are explicitly mentioned... or else it is a breach of conformance.
We could make such a statement explicit in our test suite.)
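A sketch of such an "exception" step, just to make this concrete (the <Exception> element
and its onMatch attribute are invented here; only the idea follows the discussion above):

  <Exception onMatch="continue">           (or onMatch="error")
    filter: any received message containing /SOAP:Envelope/SOAP:Header/eb:ErrorList
  </Exception>

i.e. if a message carrying an ErrorList arrives between step N and step N+1, the step either
just records a notification ("continue") or fails the test case ("error").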
</Jacques2>


- important to distinguish two types of failure for a Test Case:
(a) "operation" failure, resulting from the impossibility of carrying out the test properly,
e.g. some test step could not complete, for some reason unrelated to the spec requirements
that we are trying to test.
Typically, this happens when the Test Requirement "pre-condition" cannot be realized.
In such a case, the conformance report should NOT conclude that the MSH implementation
is not conforming, just that the test could not be performed.

[MIKE] - I added an error status of 'FatalStep' to flag these particular test steps.
It applies to any test step (other than a conformance test or a special 'pre-condition evaluation step').
I believe, based on the way the first 70 tests are designed, that we can set up a sequential evaluation
of test steps, and exit gracefully from a test case whenever an error occurs at any test step.  I have not
seen any test cases yet where we would wish to continue with the test case
upon encountering either an operational failure, a pre-condition failure or a conformance test step failure.


(b) "conformance" failure, clearly showing that the spec requirement is not satisfied
by the MSH implementation.

[MIKE] - I named the error status 'FatalTest' in the attached abstract test suite list 
for this kind of test step.

Generally, the failures in (a) correspond to some step that could not be completed.
So we could associate either type of error with each step: (1) a failure causing an
"operation" failure, (2) a failure causing a "conformance" failure.
- should we also make room for a "failure" expression in the verification step?
In other words, in case the "success" expression is not satisfied, we may
still need to distinguish the kind of test failure. A specific error message
could be associated with each kind.

[mm1: There may be 3 types of 'failures' - not certain if failure is the best word -
(1) system: something outside of the test condition that affects the test,
(2) operation: completion of the conditions needed to test the assertion, and
(3) conformance failure: affecting the test assertion itself as defined in a test case(s).
Can we differentiate a system from an operation failure? How will we be able to differentiate
which failure type occurred?  Some of it may be outside of the visibility or scope
of the test framework - does this require data from the testing node we interact with?]

[MIKE] - I do not think that we can differentiate between a system failure and
an operational failure, particularly
in "correlation test steps", where we are trying to match up received messages
with sent ones.  It would be hard to tell
if a message simply was not sent, or didn't correlate
(i.e. mismatched RefToMessageIds/ConversationIds).  I'm submitting an attachment
with 3 proposed "error status" classifications:
- FatalStep: a step operation that failed due to either message examination
(particularly in message "correlation" test steps) or to system failure
- FatalPrecondition: a step operation that failed due to either system failure
(unlikely) or because an optional feature was not present in the candidate MSH -
THIS ERROR CONDITION DOES NOT RESULT IN A CONFORMANCE TEST CASE FAILURE
- FatalTest: a step operation that failed due to either system problems
(unlikely) or because of conformance problems with the candidate MSH
*** Since the message correlation step should always come FIRST in the
test step sequence for evaluating received messages, all other errors that follow
for a received message should not be of a "FatalSystem" nature, but either a precondition
or a conformance type of error.

<Jacques2>
Each one of you seems to have a point... Let us take an example:
A test req says that "if a message has no payload, it is OK if SOAP *without* attachment is used".
Here, we don't control the MSH's ability to generate such plain SOAP messages (it could still be
SOAP with attachment in a multi-part MIME envelope, with no payload); we just check
that if it does, the message is a well-formed simple SOAP message.
So we have a step 1 "SetMessage" that will trigger a response with no payload,
and a step 2 "GetMessage" that will expect such a "plain SOAP" message,
then possibly a step 3, a final "verification" that checks it is pure, well-formed SOAP.
- If we agree that "SetMessage" steps may fail for some reason and that we'll be able
to catch that (the test driver should, as HTTP sending may fail), then I'd say
this is a system failure - technical problems, in the test harness or outside it, prevented the test case from proceeding.
It is still a "FatalPrecondition" as Mike said (defined broadly as a failure to realize the
preconditions of the test for whatever reason), but if we can, it is important to figure out
whether it is the system or the MSH implementation features. If it's the system, it is worth trying again
later... so it could be "FatalPrecondition.system".
- If the "GetMessage" step fails: it could be that (a) we did not get any message before timeout,
or (b) we got a SOAP message, but it was not of the kind we expected for this test (though ebXML compliant).
Here we cannot really tell why (a) failed (system? MSH?).
For (b), it is clearly a "failure to apply": the test can't apply, and this test req
will not be covered.
But for the sake of simplicity, at least in this first version of the test framework, I would not try
to distinguish various failure outcomes for a single step: I would make it here a "failure to apply"
("FatalPrecondition.notapplicable").
NOTES:
- I am not sure we can distinguish FatalStep from FatalPrecondition: an optional feature not showing up
may manifest as a failure to correlate...
- FatalTest may be generated by some intermediate step(s) as well as (always) by
the final verification step.
So I would have 3 possible failure outcomes for any step:
- FatalPrecondition.system (for cases where it is clearly the system: a failed sending, or
a test operation that fails internally, e.g. test case material not found where it should have been)
- FatalPrecondition.notapplicable (with Mike's definitions for both FatalPrecondition and FatalStep)
- FatalTest (with Mike's definition)
Opinion?
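To make this concrete, each step in the test case description could declare which of these
outcomes a failure of that step produces - the errorStatus attribute and the <Verification>
element below are only illustrative, not from the current draft:

  <SetMessage    errorStatus="FatalPrecondition.system">         ... </SetMessage>
  <GetMessage    errorStatus="FatalPrecondition.notapplicable">  ... </GetMessage>
  <Verification  errorStatus="FatalTest">                        ... </Verification>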
</Jacques2>

