xslt-conformance message

Subject: RE: Submission & Review Policies
From: Cris Cooley <ccooley@overdomain.com>
To: "Xslt-Conformance@Lists. Oasis-Open. Org"<xslt-conformance@lists.oasis-open.org>
Date: Wed, 11 Jul 2001 13:52:09 -0700
Resending full text of both policy documents in the body of this email...

* * * *

Submission Policy

Introduction
Since the World Wide Web relies on technological interoperability, the
need arises, both for vendors and for product users, for testing
product conformance to the W3C specifications.  The objective of the
OASIS XSLT/XPath Conformance Committee (Committee) is to develop a test
suite for use in assessing the conformance of vendor-supplied XSLT
processors to the technical specifications contained in the Recommendations
of the W3C (called herein the "Specification").  The full text of this
Submission Policy and its companion, the Review Policy, is available online
at www.oasis-open.org/committees/xslt/ wherever/thiswillbe.html.  The
Committee welcomes submissions of test cases from all vendors and other
interested parties.  Tests will be considered for inclusion in its test
suite (according to the Review Policy) on a case-by-case basis.  The
Committee will not supply any tests itself, but will work toward thorough
coverage by accumulating tests submitted by others.  The quality and
comprehensiveness of these submitted tests will determine how robust the
test suite will be.

NOTE: Tests submitted to Version 1.0 of the test suite may be rejected if
they do not comply with the following guidelines.


Guidelines for submission

1. Submitters' tests should test only a single requirement in the
Specification.

The Conformance Test Suite version 1.0 is designed so
that failure of a single test identifies non-conformance to
a single requirement in the Specification.  Many of the
guidelines below are designed to enforce this principle.

2. The tests submitted for version 1.0 of the test suite
should be "atomic" instead of "molecular".

An atomic test covers a single specific issue.  Ideally, it
should check just one assertion of the Specification. In a
comprehensive test suite, each testable assertion in the
Specification should be tested independently of each other
assertion, to the extent possible.  If more than one
assertion is tested at a time (a "molecular" test), the
cause of a failure of the test may not easily indicate the
nature of what is specifically wrong.  Failure in an atomic
test will be much easier to identify and to resolve.

Essentially, in an atomic test one failure points to one
bug. The converse is not true: one bug may cause dozens of
tests to fail, each invoking the bug in a different way.
(Example: XT doesn't support xsl:key, so XT will be judged
non-conformant in that respect. Other processors may
implement keys quite well, but have a certain problem that
is exposed in only one case. That case may have to be
"molecular" if several "atoms" must interact in a certain
way to expose the bug.)

3. The tests should target specific language issues, not
"composite issues".

The tests should be aimed at the language features and
versions that are in scope for the Specification.
"Composite issues" are those that would cause parser errors
or that involve other W3C specifications that are out of
scope for the current test suite.

4. The tests should target "must" provisions of the
Specification, not "should" provisions.

The Specification contains some requirements that are
mandatory ("must") and some that are optional ("should").
For version 1.0 of the test suite, the submitter should
submit only tests of the former.


5. Later versions of the test suite may allow a broader
range of tests.

Version 1.0 of the test suite will include only atomic tests
of specific in-scope language issues. The Committee may
decline to run or include submitted tests involving molecular
issues, composite issues, parser issues, errata or "should"
provisions of the Specification.

6. The Committee reserves the right to exclude any test
submitted.

Please see the Review Policy for a full description of how
the Committee will judge the eligibility of a test.  [url]

7. A test should target only explicit processor choices,
not unspecified areas of the Specification.

There are areas of the Specification that do not specify
what a processor needs to do, so it is impossible to test
a processor's behaviour in those areas.  In other areas the
processor is given a choice regarding how it behaves.  The
remaining areas are unconditionally required behaviours.

The suite will differentiate test cases based on choices
made by the submitter.  The Reviewers need to know if a
test corresponds to a particular choice made available to
the processor.  (These will be enumerated in the
information included with the catalogue document model).
The completed test suite will test that portion of The
Catalog of Discretion that is deemed "testable" and where a
question or two can clearly elicit the choice made by the
developer.

8. The Submitter must provide a unique test identification
(ID) for use by the Committee.

The ID need only be unique within the Submitter's own
collection; the Committee will qualify it so that it is
unambiguous across the set of collections.

9. The Committee will create a directory tree of test
cases, based on the Submitter's ID [?right?] where the
first level down the tree separates out the tests by
Submitter.

From this top level directory will descend each Submitter's
test file hierarchy. The test file hierarchy is the
hierarchy of test files submitted by the submitter.  The
presence of a hierarchy assumes that the submitter does not
want to collect all test files in a single subdirectory.

[this #9 combines two bullets...]

10. The Submitter is welcome to arrange its subdirectories
as it wishes.

The Committee will collect all the files and make them
available in the final collection, mirroring the
submitters' subdirectories.

11. The Committee suggests that the Submitter should give
each test a unique identifier as well.

This requirement reinforces that all test cases submitted
must be uniquely identified.

12. The Committee will also assign a unique identifier to
each Submitter.

13. The final test ID will be the concatenation of the
Submitter and test IDs.

A submitter's submitted test case will be published as the
committee's final test case if not rejected based on the
review policy criteria.

14. The test scope will be identified by the Specification
version and date.

As the W3C Recommendations evolve, a particular test may
not apply to all versions.  The test suite will contain
pointers to the parts of the Specification containing the
pertinent sentences.  When a processor fails a case, a
Submitter will be able to use the citations to find a
sentence in the Specification that the processor violated.

15. The test scope will also be identified by the modified
date of the errata document.

W3C Recommendations have associated errata documents that
are published to correct misrepresentations in the text of
the documents.  An errata document is a summary of issues
identified and resolved by the committee.  Multiple errata
documents may be published, each with a date.

16. The tests will become public. No royalties will be
associated with their use.

The Committee intends to retain the personal names of
Submitters so they may get public credit for their work.
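
The identification and directory guidelines above (8 through 13) can be
sketched roughly as follows.  This is only an illustration under stated
assumptions: the policy does not say how the two IDs are joined, so the
separator, the function names and the sample IDs are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical separator; the policy does not specify how the IDs are joined.
ID_SEPARATOR = "_"


def final_test_id(submitter_id: str, test_id: str) -> str:
    """Concatenate the Committee-assigned Submitter ID with the Submitter's test ID."""
    if not submitter_id or not test_id:
        raise ValueError("both a Submitter ID and a test ID are required")
    return f"{submitter_id}{ID_SEPARATOR}{test_id}"


def published_path(submitter_id: str, submitted_path: str) -> PurePosixPath:
    """Place a submitted file under the Submitter's top-level directory,
    mirroring whatever subdirectories the Submitter chose."""
    relative = PurePosixPath(submitted_path)
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError("submitted paths must stay inside the Submitter's collection")
    return PurePosixPath(submitter_id) / relative
```

With these assumptions, final_test_id("acme", "axes01") gives "acme_axes01",
and published_path("acme", "axes/axes01.xsl") gives acme/axes/axes01.xsl, so
a test ID only has to be unique within one Submitter's collection.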

* * * *

Review Policy

Reviewers should refer to the submission guidelines in the Submission
Policy.  [url]  The tests in version 1.0 of the Conformance Test Suite
should fail when the processor is non-conformant with a single in-scope
"must" provision (see Submission Policy [Guideline #--]) of the
Specification.  All accepted tests are intended to test conformance; when a
processor can fail the test and still be conformant, that test should be
excluded.  To the extent possible, Committee Reviewers should remove
interpretive behaviours.  This will result in equal application of review
policy criteria by all involved, thus producing a consistent, high-quality
work product.


Review Procedures

1. At least two Reviewers must check off each test.  However, the assessment
of a single member suffices for the test to be included in the draft release.

2. Ineligible tests (by definition) should be rejected.

Eligibility is the quality by which a candidate test submitted by a
submitter is judged to determine whether it ends up in the test suite as
published by the committee.


3. Eligibility should be judged by the following:

	3.1 The accuracy of the test.
	Accuracy is the Reviewer's judgement of whether the test case actually
tests what the Submitter states it tests, that is, the extent to which the
test matches the cited parts of the Specification.  If it does not match, or
only partially matches, the test should be considered inaccurate.

This determination rests on the Reviewer's interpretation of the
Recommendation and, if necessary, the opinion of the Committee as a whole
and, ultimately, the definitive assessment of the W3C Working Group.

	3.2 The scope of the test.
	See the Submission Policy for a definition of Scope.

	3.3 The clarity of the test.
	Clarity of a test is a determination of whether the aspect being tested is
clearly described with the anticipated results acceptably explained.

	3.4 The clarity of the aspect of the Specification being tested.
	The Test Suite aims to test parts of the Specification and errata that
aren't vague.

	3.5 Should/shall use in the Specification.
	This is the same distinction as "should" and "must", discussed in the
Submission Policy.  The test must clearly address a requirement in the
Specification that is a "shall" requirement and not a "should" requirement.

	3.6 Determination of whether a test is testing a discretionary item.
	The Committee has developed a Catalogue of Discretion, which lists all
options given to developers of the technology in the Specification.  See the
website for a list of discretionary items. [url]  Not all discretionary
items are testable.

	3.7 The atomic/molecular nature of the test.
	"Atomic" and "molecular" tests are described in the Submission Policy.

4. Judge each eligible test through a process

5. Run each test through multiple processors.
Although there is no reference implementation, the Committee will form
consensus on which of the prominent processors to use.  The baseline is
unanimity of their results, as reduced to infoset-equivalence.

5. Differences between Submitter and Reviewer output, once both are reduced
to infoset-equivalence, will trigger examination by the Committee.

6. The Committee will then reach a consensus opinion to accept the test,
reject the test, or defer deciding on the test while the issue is forwarded
to the W3C for clarification.

A test can be rejected by Reviewers even if all prominent processors give
the same result, when the test is not a true conformance test.  If Reviewers
think it's a good test but different reliable XSLT processors give different
results, the cause may be Specification wording, processor bugs, or unclear.

There are several possible (non-exclusive) actions:
	6.1 Reject the test and update errata and errata exclusion
	[Question whether this is errata or gray area update??]
	The test would then be excluded from the published collection.  The Test
Suite control file dictating which submitted tests are excluded from the
published collection is updated.
[??Furthermore, issuance of an erratum actually gives us a way to include
the test case, subject to filtering out at the final stage
of "rendering" a suite for a particular processor. Cris:
call me if you want a longer-winded description. ??]

	6.2 Reject the test with advice to go to W3C.
In this case, the submitter thinks the test is accurate and the Committee
agrees the test is not accurate and the Recommendation is clear enough that
we needn't bother the W3C with an interpretation issue.  This requires
consensus [was that absolute or near consensus?].

This scenario begins when the Submitter looks at the Committee report and
sees that a particular case submitted was excluded and writes to ask why.
The Reviewer will respond to explain. The response includes reference to the
W3C's mail alias for questions about the spec.

	6.3 The test case is forwarded to W3C for clarification.
	If the above options do not suffice, the Committee can forward the test to
the W3C for clarification.

	6.4 Additionally, the Committee may wish to accommodate external comment
from the community at large.

	6.5 The Committee will publish a consensus opinion in response to
comments, with justification from the Recommendation (not just precedent of
how a processor has acted).


7. During the testing process, Reviewers will do the following:

	7.1 A Reviewer will report to the list the hierarchy of tests undertaken
for comparison with multiple processors.

	7.2 A tally of tests will be tracked on a visible web page for the
Committee.

	7.3 Reviewers report that all tests in a given hierarchy have been
examined, including a summary of findings of tests not to be included in the
resulting suite.

	7.4 A given hierarchy is not considered complete until reports from at
least two members have been submitted.


8. During the testing process, the Committee will invite public review:

	8.1 An initial suite of a very small set of files will be used to test
procedures, scripts and stylesheets.

	8.2 The Committee will publish draft work periodically, starting with a
very small set.

	8.3 The Committee will solicit comments on usability of the product.

	8.4 The Committee will publish a disposition of comments.

	8.5 The Reviewers will continue testing the files until all the hierarchies
are covered.
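
The comparison described in the Review Procedures above (a baseline of
unanimous processor results, with differences triggering examination) might
be approximated as in the sketch below.  It assumes that the outputs are
well-formed XML and that Canonical XML, as produced by Python's
xml.etree.ElementTree.canonicalize (Python 3.8 or later), is an acceptable
stand-in for infoset-equivalence.  The policy defines neither point, and the
function names are hypothetical.

```python
from typing import Dict, Optional
from xml.etree.ElementTree import canonicalize


def normalized(xml_text: str) -> str:
    """Reduce output to a canonical form (assumed proxy for infoset-equivalence)."""
    return canonicalize(xml_text)


def unanimous_baseline(outputs: Dict[str, str]) -> Optional[str]:
    """Return the shared canonical result if every processor agrees, else None."""
    forms = {normalized(text) for text in outputs.values()}
    return forms.pop() if len(forms) == 1 else None


def needs_examination(submitter_output: str, baseline: Optional[str]) -> bool:
    """Flag a test when the processors disagree among themselves or when the
    Submitter's expected output differs from the agreed baseline."""
    return baseline is None or normalized(submitter_output) != baseline
```

Here the keys of outputs would be processor names chosen by the Committee.
Differences in attribute order or quoting style do not count as differences
under this normalization.  Text or HTML output from a stylesheet would need
a different comparison.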