
Subject: Re: [office] Formula: test cases


David,

David A. Wheeler wrote:

>Patrick Durusau:
>  
>
>>I am deeply uncertain about the test cases for reasons similar to my 
>>concerns about the "hidden" text. If the required behavior is 
>>sufficiently specified, then the test cases should not be adding 
>>anything to the standard. That is to say if I implement the normative 
>>text, it should not be possible to obtain a result that is inconsistent 
>>with the test cases.
>>    
>>
>
>The test cases are a NORMATIVE part of the specification, and have been so from the beginning.  There are a large number of special cases.  Instead of trying to state them all in English, which is both confusing and lengthy, we have stated them as required answers as part of the test cases.
>
>Removing the test cases would be like cutting out random sentences.
>
>  
>
Yes, and you will recall that I raised my misgivings about the inclusion 
of test cases with you quite some time ago.

We disagreed then and, if I understand your "English text" argument, it 
appears we continue to do so. ;-)

BTW, I am reviewing another standards proposal that takes the "it's 
too complicated to explain" approach. ;-) You can imagine my reaction 
to such statements.

I assume you kept a list of the cases where the normative text is not 
controlling and one has to rely on the test cases?

>>No doubt the test cases are very valuable, as in drafting or 
>>post-adoption implementors can test their implementations to ensure 
>>that implementing the normative text does result in the correct 
>>outcome(s), but that really isn't part of the normative text. That is to 
>>say that I should not simply implement the test cases and think that I 
>>have implemented the normative text. If someone "cheats" in that 
>>fashion, they may well encounter an edge case that is not covered by the 
>>test cases but is covered by the normative language.
>>    
>>
>
>Cheating is countered by the other requirements (there aren't JUST test cases).
>
>But the reverse is actually more of a problem; malicious implementation can occur, but more often the problem is that there's text that LOOKS unambiguous but in fact IS ambiguous.  It's VERY easy to create text that LOOKS like it covers all the cases, but it FAILS to do so.  The only method we've found for ENSURING that people actually agree on the meaning of text is to include test cases that FORCE a particular meaning.
>
>For proof that "obviously clear" text will be later understood with different incompatible meanings, look at any standards group :-).  If an approach keeps not working, perhaps another approach would be sensible.  Especially in this limited domain where it's POSSIBLE to create test cases like this.
>
>  
>
Err, but test cases remove only some ambiguity, as I pointed out in my 
initial post. If you really want to remove *all* ambiguity, then you 
would have to formally prove the results of each formula. What test 
cases do is remove the ambiguity you thought to provide a test case 
for.

That is not the same thing as removing Ambiguity (writ large).

>>So, test cases are invaluable, but I am leaning towards suggesting that 
>>they should not appear in the normative text of the standard.  Actually I 
>>am not entirely sure they should even appear in a non-normative annex. 
>>In part because if there is any conflict between the normative text and 
>>the test cases, which controls? The normative language or the test cases?
>>    
>>
>
>I _STRONGLY_ disagree with this idea. It's just like removing random sentences; the test cases are NORMATIVE.
>
>  
>
Err, it really should not be "like removing random sentences." Isn't 
that just a bit of an exaggeration?

>As far as conflicts go, that's no different than any other internal conflict in a specification.  If there are multiple sentences in a specification that conflict, which one controls?  The answer is neither; they need to be adjudicated.
>
>If you must have a rule, then the rule is simple: the TEST CASES control.  Because they are the ones that are automatically checked.  We have no way to automatically check arbitrary English text, nor does anyone else.
>
>  
>
Err, are you sure you want that rule? That TEST CASES control?

In that case I can write an application that gives the specified results 
for the enumerated test cases and utterly random results for any other 
input.
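
To make the point concrete, here is a quick sketch of such an 
application (Python; purely illustrative, and the formulas and expected 
values below are hypothetical, not taken from the draft's test cases):

import random

# Hypothetical hard-coded answers for a handful of enumerated test
# cases (illustrative formulas and values only).
KNOWN_CASES = {
    "=ABS(-4)": 4,
    "=SUM(1;2;3)": 6,
    "=MOD(7;3)": 1,
}

def evaluate(formula):
    """Return the enumerated answer if the formula is a known test
    case, and an arbitrary value for every other input."""
    if formula in KNOWN_CASES:
        return KNOWN_CASES[formula]
    return random.randint(-10**6, 10**6)  # utterly random results

print(evaluate("=ABS(-4)"))   # 4, as the test case requires
print(evaluate("=ABS(-5)"))   # arbitrary -- not covered by any test case

Such an application "passes" every enumerated test case verbatim while 
implementing none of the normative semantics.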

The problem is that a test case cannot be generalized, well, unless you 
want to enumerate all possible inputs and results, which would be rather 
lengthy for the set of integers.

>>That is one reason why standards strive to only say any rule once and 
>>only once. Not entirely possible if you want a readable result but it is 
>>something that is a good rule to follow in general. That reduces the 
>>grounds for reaching different interpretations.
>>    
>>
>
>Yes, I agree.
>
>  
>
>>Actually I would argue that if I need the test cases to understand the 
>>normative text, that is a good sign there is a problem with the 
>>normative text.
>>    
>>
>
>I understand and respect that viewpoint (and you!).  But I strongly disagree.
>
>The problem is that formulas are rather different beasts from many other specifications; either you get the "right answer" or you don't.  Many protocols and APIs can be very flexible, allowing a variety of inputs and outputs.  Page layouts can vary.  But formulas get one bite at the apple: Given a specific input, they MUST produce a specific output.
>
>English is disturbingly ambiguous, and text that LOOKS like it's really clear turns out to NOT be the case.  Ah, you say, why then there's a problem with the normative text and you just fix it.  Well, you can CHANGE it, but it turns out that you just create DIFFERENT ambiguities.  Again, and again, and again, we've found that including test cases is the difference between "inadequately specified" and "adequately specified."
>
>  
>
Note my comments above on test cases removing ambiguity only for the 
enumerated tests.

>>I am starting a slow read of the formula proposal this weekend and will 
>>post comments on specific sections as I reach them. This and the prior 
>>post on notes, rationales, etc., are general comments that I won't 
>>repeat as I encounter those aspects of the various sections.
>>
>>Hope everyone is having a great day!
>>
>>Patrick
>>
>>PS: Actually the test cases offer an interesting way to proof the 
>>normative text. Remember the "check yourself exercises" in textbooks? 
>>Simply cover up the results of the test cases and after reading the 
>>normative text, see if you get the same results as are set forth in the 
>>test cases. If you don't, it might indicate a problem with the normative 
>>text. Will take longer but should result in a very clean normative text.
>>    
>>
>
>I encourage you to _DO_ that test, and by all means, improve the text with problems you find!
>
>But I believe that is a grossly inadequate way to proof the text; doing that doesn't mean it's actually sufficient without test cases.  Let's say that you get 100% agreement on all cases. So what? The problem is that the next reader/implementor will NOT get the same answer on all the test cases, even if you do.  Time and again, we've crafted "really good English text" only to find that once again, it's possible to misinterpret it.  You'd be shocked at how many "obviously clear" statements for functions turn out to be ambiguous.
>
>  
>
Oh, I don't know. I think DSSSL, for example, did a fairly good job of 
being clear about what was meant.

Besides, we aren't talking about the range of human experience with 
standards, but about this one.

>The requirement is not, "do YOU get the right answer".  The requirement is that "EVERYONE, AT ALL TIMES, gets the SAME answer."  English text, even when supplemented with mathematical formulas, often LOOKS unambiguous to many reviewers, yet can still have serious ambiguities. We certainly want to improve the text as much as we can... but we also want to have a strategy that ensures that if there's an omission in the text, implementors will ALL get the SAME answer anyway.  The FIRST time. And the way to make that MUCH more likely is to include test cases.
>
>At one time standards bodies routinely created conformance tests.  That's become pretty rare.  One problem is that it's often too costly to create conformance tests after-the-fact from the inadequately defined specifications.  As a result, we have lots of standards with lots of ambiguity, NO useful conformance tests, and lots of interoperability problems.  So we've come up with a different solution: make a conformance test a NORMATIVE PART of the specification.  It eliminates many ambiguities, and greatly encourages INTEROPERABLE implementations.
>
>  
>
Sorry, what you have is *not* a conformance test. What you have is 
conformance to a specified set of test cases and the hope that if the 
application conforms to those, it will also conform in the cases not 
enumerated. Those are not the same thing by any means.

>You still need - and want - good normative text.  On that we agree.  But test cases turn out to be necessary to ensuring that the text is interpreted correctly.
>
>  
>
That is an assumption on your part, and I readily agree there are any 
number of poorly written standards.

Where I disagree is in extending that assumption to this standard. 
DSSSL, for example, I don't think was considered ambiguous by anyone. 
Obscure perhaps, but not ambiguous. ;-)

Note that I never said that test cases aren't useful. The test cases in 
the formula draft can be incredibly useful. The question is where do 
they go?

So, let's at least agree on what is at issue:

1. How should normative behavior be specified? (I would not use test 
cases.)

2. Where should the test cases go? (I would not include them in the 
standard.)

Please do not take any of my comments as doubting the usefulness of 
the test cases. I am sure they are very useful, but I do disagree on 
the role you want to assign to them in this standard.

Hope you are having a great day!

Patrick

>Respectfully,
>
>--- David A. Wheeler
>
>
>
>  
>

-- 
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work! 



