Subject: Re: [office] Formula: test cases
From: "David A. Wheeler" <dwheeler@dwheeler.com>
To: patrick@durusau.net
Date: Thu, 29 Mar 2007 10:31:26 -0500 (EST)

Patrick Durusau:
> I am deeply uncertain about the test cases for reasons similar to my 
> concerns about the "hidden" text. If the required behavior is 
> sufficiently specified, then the test cases should not be adding 
> anything to the standard. That is to say if I implement the normative 
> text, it should not be possible to obtain a result that is inconsistent 
> with the test cases.

The test cases are a NORMATIVE part of the specification, and have been so from the beginning.  There are a large number of special cases.  Instead of trying to state them all in English, which is both confusing and lengthy, we have stated them as required answers as part of the test cases.

Removing the test cases would be like cutting out random sentences.

> No doubt the test cases are very valuable as in drafting or 
> post-adoption, implementors can test their implementations to ensure 
> that implementing the normative text does result in the correct 
> outcome(s), but that really isn't part of the normative text. That is to 
> say that I should not simply implement the test cases and think that I 
> have implemented the normative text. If someone "cheats" in that 
> fashion, they may well encounter an edge case that is not covered by the 
> test cases but is covered by the normative language.

Cheating is countered by the other requirements (there aren't JUST test cases).

But the reverse is actually more of a problem; malicious implementation can occur, but more often the problem is that there's text that LOOKS unambiguous but in fact IS ambiguous.  It's VERY easy to create text that LOOKS like it covers all the cases, but it FAILS to do so.  The only method we've found for ENSURING that people actually agree on the meaning of text is to include test cases that FORCE a particular meaning.
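As a minimal illustration of that point (a sketch only; the ROUND sentence and the test values below are hypothetical and are not quoted from the formula proposal), consider the sentence "ROUND(x) returns x rounded to the nearest integer."  Two implementors can both read it carefully and still disagree on ties.  A single required answer settles the question:

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

# Hypothetical spec sentence: "ROUND(x) returns x rounded to the nearest integer."
# Both readings below satisfy that sentence; they differ only on ties.

def round_half_up(x: str) -> int:
    """Reading A: ties go away from zero (2.5 becomes 3)."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

def round_half_even(x: str) -> int:
    """Reading B: ties go to the even neighbour (2.5 becomes 2)."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))

# Hypothetical normative test cases: (input, required answer).
# The last entry is the one that forces a particular meaning.
TEST_CASES = [("2.4", 2), ("2.6", 3), ("2.5", 3)]

def conforms(impl) -> bool:
    """True if the implementation gives the required answer for every case."""
    return all(impl(x) == expected for x, expected in TEST_CASES)

if __name__ == "__main__":
    print("half-up conforms:  ", conforms(round_half_up))
    print("half-even conforms:", conforms(round_half_even))
```

Without the ("2.5", 3) entry, both readings would pass, and the disagreement would only surface when two implementations exchanged a document.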

For proof that "obviously clear" text will be later understood with different incompatible meanings, look at any standards group :-).  If an approach keeps not working, perhaps another approach would be sensible.  Especially in this limited domain where it's POSSIBLE to create test cases like this.

> So, test cases are invaluable, but I am leaning towards suggesting that 
> they should not appear in the normative text of the standard.  Actually I 
> am not entirely sure they should even appear in a non-normative annex. 
> In part because if there is any conflict between the normative text and 
> the test cases, which controls? The normative language or the test cases?

I _STRONGLY_ disagree with this idea. It's just like removing random sentences; the test cases are NORMATIVE.

As far as conflicts go, that's no different than any other internal conflict in a specification.  If there are multiple sentences in a specification that conflict, which one controls?  The answer is neither; they need to be adjudicated.

If you must have a rule, then the rule is simple: the TEST CASES control, because they are the ones that are automatically checked.  We have no way to automatically check arbitrary English text, nor does anyone else.

> That is one reason why standards strive to only say any rule once and 
> only once. Not entirely possible if you want a readable result but it is 
> something that is a good rule to follow in general. That reduces the 
> grounds for reaching different interpretations.

Yes, I agree.

> Actually I would argue that if I need the test cases to understand the 
> normative text, that is a good sign there is a problem with the 
> normative text.

I understand and respect that viewpoint (and you!).  But I strongly disagree.

The problem is that formulas are rather different beasts from many other specifications; either you get the "right answer" or you don't.  Many protocols and APIs can be very flexible, allowing a variety of inputs and outputs.  Page layouts can vary.  But formulas get one bite at the apple: Given a specific input, they MUST produce a specific output.

English is disturbingly ambiguous, and text that LOOKS really clear turns out NOT to be.  Ah, you say, why then there's a problem with the normative text and you just fix it.  Well, you can CHANGE it, but it turns out that you just create DIFFERENT ambiguities.  Again, and again, and again, we've found that including test cases is the difference between "inadequately specified" and "adequately specified."

> I am starting a slow read of the formula proposal this weekend and will 
> post comments on specific sections as I reach them. This and the prior 
> post on notes, rationales, etc., are general comments that I won't 
> repeat as I encounter those aspects of the various sections.
> 
> Hope everyone is having a great day!
> 
> Patrick
> 
> PS: Actually the test cases offer an interesting way to proof the 
> normative text. Remember the "check yourself exercises" in textbooks? 
> Simply cover up the results of the test cases and after reading the 
> normative text, see if you get the same results as are set forth in the 
> test cases. If you don't, it might indicate a problem with the normative 
> text. Will take longer but should result in a very clean normative text.

I encourage you to _DO_ that test, and by all means, improve the text wherever you find problems!

But I believe that is a grossly inadequate way to proof the text; doing that doesn't mean it's actually sufficient without test cases.  Let's say that you get 100% agreement on all cases. So what? The problem is that the next reader/implementor will NOT get the same answer on all the test cases, even if you do.  Time and again, we've crafted "really good English text" only to find that once again, it's possible to misinterpret it.  You'd be shocked at how many "obviously clear" statements for functions turn out to be ambiguous.

The requirement is not, "do YOU get the right answer".  The requirement is that "EVERYONE, AT ALL TIMES, gets the SAME answer."  English text, even when supplemented with mathematical formulas, often LOOKS unambiguous to many reviewers, yet can still have serious ambiguities. We certainly want to improve the text as much as we can... but we also want to have a strategy that ensures that if there's an omission in the text, implementors will ALL get the SAME answer anyway.  The FIRST time. And the way to make that MUCH more likely is to include test cases.

At one time standards bodies routinely created conformance tests.  That's become pretty rare.  One problem is that it's often too costly to create conformance tests after the fact from inadequately defined specifications.  As a result, we have lots of standards with lots of ambiguity, NO useful conformance tests, and lots of interoperability problems.  So we've come up with a different solution: make a conformance test a NORMATIVE PART of the specification.  It eliminates many ambiguities, and greatly encourages INTEROPERABLE implementations.
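To make the idea concrete, here is a rough sketch of what "test cases as part of the specification" can look like when run mechanically.  Everything in it is an assumption made for illustration: the function names POWER and MOD, the required answers, and the run_suite helper are hypothetical and are not taken from the formula proposal.

```python
import math
from typing import Callable, Dict, List, Tuple

# (function name, arguments, required result); all entries are hypothetical.
Case = Tuple[str, tuple, object]

CASES: List[Case] = [
    ("POWER", (2, 3), 8),
    ("POWER", (0, 0), 1),   # a corner case pinned down by a required answer
    ("MOD", (-7, 3), 2),    # in this sketch the result takes the divisor's sign
]

def run_suite(impl: Dict[str, Callable], cases: List[Case] = CASES) -> List[str]:
    """Run every case against an implementation and list the disagreements."""
    failures = []
    for name, args, expected in cases:
        got = impl[name](*args)
        if got != expected:
            failures.append(f"{name}{args}: expected {expected!r}, got {got!r}")
    return failures

# Two plausible implementations of "MOD returns the remainder of a / b".
floor_style = {"POWER": lambda a, b: a ** b, "MOD": lambda a, b: a % b}
trunc_style = {"POWER": lambda a, b: a ** b, "MOD": lambda a, b: math.fmod(a, b)}

if __name__ == "__main__":
    for label, impl in (("floor_style", floor_style), ("trunc_style", trunc_style)):
        problems = run_suite(impl)
        print(label, "passes" if not problems else problems)
```

Both implementations are defensible readings of the one-line English description; only the suite reveals that they disagree on negative operands, and it does so the first time it is run.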

You still need - and want - good normative text.  On that we agree.  But test cases turn out to be necessary to ensure that the text is interpreted correctly.

Respectfully,

--- David A. Wheeler