
oiic-formation-discuss message


Subject: Re: [oiic-formation-discuss] More about test case metadata

2008/7/31  <david_marston@us.ibm.com>:
> As in my last message, I am using the term "test lab" to refer generically
> to any organization that wants to perform conformance testing of ODF,

Understood. Slightly unusual usage.

> One of the probable justifications for creating this TC is to enable results
> reported by different labs to be compared.

Or the results from different vendor products to be compared?
The deliverable here would be a standard, regular format for results,
relating back to test identities. E.g. XML schema.

Does anyone else support this deliverable?
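A minimal sketch of what such a regular results format could look like, assuming invented element and attribute names (the `test-results` element, the `outcome` values and the identifier scheme are all illustrative, not part of any actual deliverable):

```python
import xml.etree.ElementTree as ET

def results_to_xml(product, results):
    """Serialise per-test outcomes into a hypothetical common format.

    `results` maps a test identifier to "pass", "fail" or "skip",
    so results from different products can be compared test-by-test.
    """
    root = ET.Element("test-results", {"product": product})
    for ident, outcome in sorted(results.items()):
        ET.SubElement(root, "result", {"test": ident, "outcome": outcome})
    return ET.tostring(root, encoding="unicode")

print(results_to_xml("ExampleWriter 1.0",
                     {"odf-3.1-001": "pass", "odf-9.2-014": "fail"}))
```

An XML schema for such a format would pin down the vocabulary; the point is only that results relate back to stable test identities.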

> We would like the labs to be
> proud when stating, "We ran the official OASIS ODF test ____ ." The blank
> would be filled in with "suite" if the TC decides to make actual test cases
> a deliverable.

I've heard no objection to the statement that the TC will not deliver
software, if by 'suite' you mean a software suite.

> If the deliverable is only a set of guidelines, maybe
> "procedure" is a better word for that blank. It could also (by a stretch) be
> called a "suite" if the TC delivers a catalog of test cases that should be
> run, but only provides information about the cases rather than the actual
> materials.

No software, just a test specification, against which software can be
designed and coded. Equate it with a software requirements
specification, perhaps.

> This brings us into the realm of test case metadata.

> This describes one test case. The @scenario gives guidance about how to run
> the query,

The word 'scenario' doesn't do that by itself?

> and the output-file/@compare gives guidance about how to compare
> the actual and reference results.

'XML' doesn't do that by itself David?

This needs a lot of semantics (structure/design) behind it. I'd see
that as a job
for the software design aspect, not specification.

> For more about what might be in metadata,
> see [2].

From which:

(? implies optional)

title? (of a test)
status? e.g. one of: unconfirmed, new, assigned, pending,
	 accepted, rejected, holding

I'd got most of those. I'm twitchy about grouping; I think I'd rather
leave it to implementers. Some groupings are obvious, others won't be
until implementation time.
I like contributor and status, thanks.
Purpose in our case is clear and explained by the source, the ODF spec.
(I guess I'd argue over whether it is meta, but let's leave that :-)
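The fields above could be modelled as a simple per-test record; a sketch with invented field names and an identifier scheme of my own (only the status values come from the list above):

```python
from dataclasses import dataclass

# Status values taken from the list above; field names and the
# ident scheme are invented for this sketch.
STATUSES = {"unconfirmed", "new", "assigned", "pending",
            "accepted", "rejected", "holding"}

@dataclass
class TestCaseMeta:
    ident: str                   # unique test identifier
    title: str = ""              # optional short title
    status: str = "unconfirmed"  # one of STATUSES
    contributor: str = ""        # optional attribution

    def __post_init__(self):
        # Reject statuses outside the agreed vocabulary.
        if self.status not in STATUSES:
            raise ValueError("unknown status: " + self.status)

tc = TestCaseMeta("odf-3.1-001", title="Pre-defined metadata elements",
                  status="accepted")
```

Whether this lives in XML attributes or elements is a design detail; the record shape is the same either way.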

Test filtering is useful. I can see cases where a tester will ask what
the heck is the point of running drawing tests against an app that only
implements the writer functions.
I'm sure there will be others. Until the tests are visible the
grouping won't be obvious.
I'm assuming a clear relationship between filtering and grouping.

> These XQuery "features" are bundles of functionality, and there are only
> six. What ODF 1.1 seems to have is highly-granular discretionary choice for
> low-level elements.

 Not understood at all.

>  Nevertheless, the test case metadata could be designed
> to account for each individual element whose implementation or
> non-implementation is a distinct choice available to the implementer.

Given the looseness of the schema, I don't think that's impractical.
As above, I can see filtering being useful; I also think we're too early
to do it now.

> such flags would allow a test lab to use the test suite to answer, "How well
> does this ODF tool implement those bits that it claims to implement?"

Not IMO. That's interpretation of the test results.

> can have separate flags for the profiles, so that the test suite can be used
> to determine how well the implementation fulfills a given profile. The
> metadata should be designed so that a list of several profiles can be
> provided for each test case.

That's more usable. A delivery of 'sufficient' metadata to enable
1. Grouping of tests for convenience of test running
2. Test filtering for unimplemented/inapplicable parts
3. If applicable, aiding testing against profiles.

OR-ing flags bitwise would do it, although separate attributes would
be more readable.
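The trade-off can be sketched like this, with invented profile names (the enum form is the bitwise-flags option; a `profiles="writer drawing"` attribute would be the readable alternative):

```python
from enum import Flag, auto

# Bitwise flags marking which profiles a test belongs to; compact,
# though separate named attributes would read better in the metadata.
# Profile names are invented for illustration.
class Profile(Flag):
    WRITER = auto()
    DRAWING = auto()
    SPREADSHEET = auto()

test_profiles = {
    "odf-3.1-001": Profile.WRITER,
    "odf-9.2-014": Profile.WRITER | Profile.DRAWING,
}

def applies(ident, profile):
    """True if the test is flagged for the given profile."""
    return bool(test_profiles[ident] & profile)
```

Either encoding supports the same three uses: grouping, filtering, and profile testing.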

Good one David, I'll add that 'for consideration', i.e. leave it to the TC?

> How do we know what the implementer claims to implement? Through an
> Implementation Conformance Statement.

Wrong list. From what I heard, that should now be a requirement and
part of the standard at 1.2.

> The answers needed there relate to
> each individual bit of variability **defined in the spec** as a permissible
> unit of variation.

Yuk. Horribly large table.

> Continuing the XQuery example, if the implementer says
> that they implement the "static typing feature" then they will be subject to
> all the test cases that are flagged as being limited to implementations
> having that feature.

This relates to feature sets etc. You need to put this question back
to the main TC.
We can't answer it.

> Dave Pawson writes:
>>I'm really not sure how to address tests a vendor knows will fail
>>(e.g. because they haven't implemented para x.y.z. It needs addressing
>>but I'm unsure how best to do it. A control file mapping spec clauses to
>>implemented features, used to control test run (and impact results) is
>>perhaps an approach.)
> That's the same idea, I think. "Implemented features" will come from the
> Implementation Conformance Statement (a.k.a. answers to the questionnaire);

Tail wagging donkey? What motive can a vendor have for answering
hundreds of questions?

The main TC should pick this up when they address compliance.
If they see vendors picking subsets, then they should devise appropriate
subsets (profiles) and group requirements accordingly.

> and the test case metadata is the control file; before running tests, filter
> the metadata file to exclude tests that need not be run against this
> particular implementation, then process the remainder of the metadata into
> an executable script.

Dream on, David. XML to executable script? Someday, not today.

Profile-based test selection, yes; an XML file would suffice for that.
Further automation is unrealistic with today's issues.
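Selection from an XML metadata file is straightforward; a sketch over a hypothetical metadata format (element and attribute names invented) that filters down to a list of test identifiers, nothing more:

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata file; the output is a selection of test
# identifiers to run, not a generated executable script.
METADATA = """\
<tests>
  <test-case ident="odf-3.1-001" profiles="writer"/>
  <test-case ident="odf-9.2-014" profiles="writer drawing"/>
  <test-case ident="odf-8.1-003" profiles="spreadsheet"/>
</tests>"""

def select_tests(metadata_xml, profile):
    """Identifiers of all test cases flagged for the given profile."""
    root = ET.fromstring(metadata_xml)
    return [tc.get("ident") for tc in root.iter("test-case")
            if profile in tc.get("profiles", "").split()]

print(select_tests(METADATA, "writer"))
# → ['odf-3.1-001', 'odf-9.2-014']
```

The same filtering could be done in XSLT; the point is that selection stops at a list of identifiers.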

> If a vendor says they don't implement XYZ, they're
> risking a negative view in the marketplace,

It's reality. We must address it. The other option is that a test
is run, a tester is then asked (for instance) to set bold using
an application setting.... which doesn't exist. Subjective test. Not viable.
We must be able to exclude the bold test if a vendor hasn't implemented it.

> If
> they say they implement XYZ, they are subject to all the XYZ tests, which
> may reveal that their implementation is incomplete or wrong. It is up to the
> ODF TC to write the spec so that XYZ is clearly delineated.

If only they would. For example: 3.1 para 1 s1
There is a set of pre-defined metadata elements
   which should be processed and updated by the applications.

The elements exist, and 'should' be processed? What the heck kind of
test do we spec for that? Quite untestable.

> Again, the
> vendor cannot decline to implement a feature unless the ODF spec (1) clearly
> isolates it and (2) states that the implementation SHOULD or MAY implement
> it, rather than MUST.

Yes they can: quite simply, the app remains non-conformant in X areas.

> Returning to the main theme of test case metadata: the <test-case> elements
> can be gathered up into a single XML file to facilitate processing by XSLT.


> In addition to filtering out test cases that do not apply to the
> Implementation Under Test,

Not so fast. These are test requirements, not tests.
There must be a level of indirection here; using the test identifiers
is the easiest way to link them.

> Dave Pawson questions the <output-file> element and the notion that the TC
> would deliver the reference (correct) result for each test case:
> DM>> Reference outputs that show the correct behavior of a test case
> DP> -1
> DP> Rationale. If we define fixed expected data test cases can be built to
> pass it.
> DP> If we leave the detail up to the implementer then the vendors are left
> guessing as to what the data might be.

> Once again, the test case metadata can help. If a test case can be set up so
> that one or a few properties vary, the metadata can identify the variable(s)
> and their dataypes. From that, run-time instructions can be generated with a
> randomly-chosen value to be assigned (say, the font for a certain bit of
> text) and the reference output, having been delivered with a parameter flag,
> would be processed to have that same randomly-chosen value plugged in for
> the file actually used in the compare portion of the test. (More generally,
> the metadata can specify any pre-processing needed to set up the test case,
> including actions like choosing a font at random.)

I can see it's possible. Is it worth the hassle? Not IMO, especially
where, rather than a set of discrete values, we have a continuous
range of analogue values.
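For what it's worth, the mechanism itself is small; a sketch of the parameterised-reference idea, with an invented `${FONT}` placeholder syntax:

```python
import random

def instantiate(template, value):
    """Plug one chosen value into a template (placeholder invented)."""
    return template.replace("${FONT}", value)

# The same randomly chosen value goes into both the test input and
# the reference output, so a fixed expected file cannot be gamed.
font = random.choice(["Liberation Serif", "DejaVu Sans"])
test_input = instantiate('<span font="${FONT}">text</span>', font)
reference = instantiate('<span font="${FONT}">text</span>', font)
# The compare step then checks actual output against a reference
# carrying the same randomly chosen value.
```

This only works where the variable ranges over discrete values; a continuous analogue range is exactly where it breaks down.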

> There are several other aspects of the metadata, such as the file paths
> shown in the example above, that I have not explained. I'm just trying to
> give a general idea of the range of possibilities, to help those following
> this discussion to decide whether test case metadata ought to be itemized as
> a deliverable of the TC. I hope it also helps you think about the other
> potential deliverables as well.

I'd tend to see it as a 'how', rather than a 'what': how the tests are
specified.

I can (and have) fully specified tests in
ATLAS + Ada (an F22 DoD project).

XML (whether you call it metadata or data is immaterial) is an option,
with benefits and ... an overhead.


Dave Pawson
