
Subject: FW: [tag-comment] On non-passage and non-failure


Forwarding this to the list, from David M., who is so far still an
observer (with his approval).

-Jacques

-----Original Message-----
From: david_marston@us.ibm.com [mailto:david_marston@us.ibm.com] 
Sent: Wednesday, October 03, 2007 8:58 PM
To: tag-comment@lists.oasis-open.org
Subject: [tag-comment] On non-passage and non-failure

First of all, I want to remind you that test assertions (TAs) will be
very useful for testing situations other than conformance, even if they
are written with conformance in mind. So the notion of "pass" equating
to "conforms" should not be deeply baked into the structure or other
definitive aspects of TAs.

The idea of "unable to evaluate" as a third outcome state alongside
pass/fail cannot be truly eliminated; it can only be pushed into one
corner or
another. As I see it, there are three scenarios in contention:

1. Reducing the test result to an "outcome" can produce at least the
three outcomes pass, fail, or unable to evaluate, if not more.

2. Attempt to have only pass and fail as outcomes, and avoid running any
test where you can tell ahead of time that you would not get a clean
pass/fail result. One situation in which such avoidance would take place
is when the preconditions (antecedents) cannot be set up.

3. Run whatever tests you wish, but then refuse to assess the outcome of
any test where you determine that the antecedents did not actually
hold. Presumably, every antecedent would itself be the result of
some other test that is closer to atomic. If the test of the antecedent
fails, then all cases that need that antecedent are eliminated from
pass/fail evaluation.
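The contrast between the scenarios can be sketched in a few lines
(Python purely for illustration; the `Outcome` names and the `evaluate`
helper are hypothetical, not taken from any TC draft):

```python
from enum import Enum

class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNABLE_TO_EVALUATE = "unable to evaluate"  # scenario 1's third state

def evaluate(preconditions_hold, actual, expected):
    # Scenario 1: reduce every test run to one of three outcomes.
    # Scenarios 2 and 3 instead try to keep a run out of this function
    # entirely (before or after execution) when its preconditions
    # cannot be established.
    if not preconditions_hold:
        return Outcome.UNABLE_TO_EVALUATE
    return Outcome.PASS if actual == expected else Outcome.FAIL
```
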

Note that scenarios 2 and 3, when applied to conformance, require that
you have a provably correct list of "all the tests that should have
passed" because you want to ensure that you don't say the IUT conforms
when many tests were eliminated.

I think scenario 2 has many operational difficulties, especially around
sequencing test cases. The precondition may be satisfied in more than
one way, so you don't want to name a particular test as the
precondition; rather, any means of establishing the precondition will
do.

EXAMPLE:
Suppose that the spec has a normative statement that says, "The plus
operator, when applied to an integer and a float, yields a float as a
result." You devise the test case
2 + 2.0 = 4.0
where the result is the type-test on the sum (4.0), which must obtain
"float" as the type.
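As it happens, Python's numeric promotion behaves the same way, so the
test case can be written directly (a minimal illustration, not the TC's
test format; the function name is invented):

```python
def test_integer_plus_float():
    # Normative statement under test: integer + float yields a float.
    # The verdict is the type-test on the sum, which must obtain "float".
    result = 2 + 2.0
    assert result == 4.0
    return type(result).__name__  # "float" on a correct implementation
```
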

There are also some other normative statements that get put to use:
Every integer can be converted to an equal float value (no integers are
unconvertible). A string of digits without a decimal point is
type-annotated as integer. A string of digits with one decimal point is
type-annotated as float. Float+float yields float, etc.

Now suppose that a particular implementation has the flaw that the
sub-expression "2" gets type-annotated as float rather than integer.
None of the other normative statements is violated. When run, the test
case yields 4.0, type-annotated as float. But the pass/fail outcome
should not be evaluated, because the implementation did not really
perform integer+float, which was the point of the test.

If we follow scenario 3, then there should be another test case that
just does a type-test on the expression "2" and this implementation will
fail that case outright. If the original test marked passage of this
simpler test as one of its preconditions, we would know not to evaluate
the result.
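Scenario 3's post-processing step might look like the following sketch
(the `assess` helper and the test names are invented for illustration):

```python
def assess(results, preconditions):
    """Refuse to grade any test whose precondition tests did not pass.

    results: test name -> bool (did the run produce the expected value?)
    preconditions: test name -> names of antecedent tests that must pass
    """
    outcomes = {}
    for name, passed in results.items():
        if all(results.get(p, False) for p in preconditions.get(name, [])):
            outcomes[name] = "pass" if passed else "fail"
        else:
            outcomes[name] = "unable to evaluate"
    return outcomes

# The flawed implementation above: "2" is annotated float, so the atomic
# annotation test fails; 2 + 2.0 still yields 4.0 (a raw pass), but that
# verdict must not be assessed, because integer+float was never performed.
raw = {"int-annotation-of-2": False, "int-plus-float-yields-float": True}
deps = {"int-plus-float-yields-float": ["int-annotation-of-2"]}
```

Here `assess(raw, deps)` marks the annotation test "fail" and the
dependent case "unable to evaluate", rather than crediting its raw pass.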

Scenario 2 could also be adjusted to fit, but consider that another way
to obtain the integer 2 might be available:
string-length("ab")
and another case, if you remembered to write it, could test
type-test(string-length("ab")+2.0)
expecting "float" as the result. Pretty soon, you get a complex mesh of
test sequencing, when it would be much easier to just run the whole set
of tests and post-process to not evaluate certain tests, or go back to
scenario 1 and designate their outcome as unable to evaluate.
.................David Marston
