xslt-conformance message

Subject: RE: Submission & Review Policies
From: Cris Cooley <ccooley@overdomain.com>
To: xslt-conformance@lists.oasis-open.org
Date: Fri, 20 Jul 2001 09:57:43 -0700
> > = Cris
> = Ken
  = Cris

> -----Original Message-----
> From: G. Ken Holman [mailto:gkholman@CraneSoftwrights.com]
> Sent: Wednesday, July 18, 2001 9:02 AM
> To: xslt-conformance@lists.oasis-open.org
> Subject: RE: Submission & Review Policies
>
>
> Thank you, Cris, for your work on this.

You are welcome.

>
> At 01/07/11 13:52 -0700, Cris Cooley wrote:
> >Submission Policy
> >
> >Introduction
> >Since the World Wide Web relies on technological interoperability, the
> >need arises, both for vendors and for product users, for testing
> >product conformance to the W3C specifications.  The objective of the
> >OASIS XSLT/XPath Conformance Committee (Committee) is to develop a test
> >suite for use in assessing the conformance of vendor-supplied
>
> I don't think we need "vendor-supplied".

I'll remove it.

>
> >XSLT
> >processors to the technical specifications contained in the
> Recommendations
> >of the W3C (called herein the "Specification").
>
> Should the parenthetical "Committee" up above also be in quotes?

Yes, I'll change it.

>
> >The full text of this
> >Submission Policy and its companion, the Review Policy, are
> available online
> >at www.oasis-open.org/committees/xslt/ wherever/thiswillbe.html.
>
> :{)}
>
> >The
> >Committee welcomes submissions of test cases from all vendors and other
> >interested parties.  Tests will be considered for inclusion in its test
> >suite (according to the Review
> >Policy) on a case-by-case basis.  The Committee will not supply any tests
> >itself,
>
> Actually, we may indeed come up with a few ourselves ... I
> wouldn't rule it
> out here.

Suggest a change to: "The Committee will work toward thorough coverage by
accumulating tests submitted by others."  That does not preclude Committee
submissions and conveys that Submitter input is important.

>
> >but will work toward thorough
> >coverage by accumulating tests submitted by others.  The quality and
> >comprehensiveness of these submitted tests will
> >determine how robust the test suite will be.
> >
> >NOTE: Tests submitted to Version 1.0 of the test suite may be rejected if
> >they do not comply with the following guidelines.
> >
> >
> >Guidelines for submission
> >
> >1. Submitters' tests should test only a single
>
> "citable"

Okay.

>
> >requirement in the
> >Specification.
> >
> >The Conformance Test Suite version 1.0 is designed so
> >that failure of a single test identifies non-conformance to
> >a single requirement in the Specification.
>
> "Recommendation citations are in the form of XPath expressions to
> testable
> statements in the XML working group source documents producing
> the HTML W3C
> documents."

Okay.
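
For illustration only, a citation of this kind could be resolved roughly as in
the Python sketch below.  The XML fragment, the element names (div2, p), the
id value and the citation string are all made up for the example and are not
taken from the actual W3C source documents.  Note also that Python's
xml.etree.ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a Recommendation's XML source; not the real W3C markup.
SPEC_SOURCE = """
<spec>
  <div2 id="key">
    <p>Keys provide a way to work with documents.</p>
    <p>It is an error if the value contains a variable reference.</p>
  </div2>
</spec>
"""


def resolve_citation(spec_xml: str, citation: str) -> str:
    """Return the text of the statement that a citation points to."""
    root = ET.fromstring(spec_xml)
    node = root.find(citation)
    if node is None:
        raise LookupError(f"citation matches nothing: {citation}")
    return "".join(node.itertext()).strip()


# Hypothetical citation: second paragraph of the section with id="key".
CITATION = "./div2[@id='key']/p[2]"
cited_sentence = resolve_citation(SPEC_SOURCE, CITATION)
```

A citation that matches nothing raises an error, so a stale citation would
surface early instead of silently pointing at the wrong statement.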

>
> >Many of the
> >guidelines below are designed to enforce this principle.
> >
> >2. The tests submitted for version 1.0 of the test suite
> >should be "atomic" instead of "molecular".
> >
> >An atomic test covers a single specific issue.  Ideally, it
> >should check just one assertion of the Specification. In a
> >comprehensive test suite, each testable assertion in the
> >Specification should be tested independently of each other
> >assertion, to the extent possible.  If more than one
> >assertion is tested at a time (a "molecular" test), the
> >cause of a failure of the test may not easily indicate the
> >nature of what is specifically wrong.  Failure in an atomic
> >test will be much easier to identify and to resolve.
> >
> >Essentially in an atomic test one failure points out one
> >bug.
>
> Instead of "bug" perhaps "problem" or "failure to conform" or something
> better ... I wouldn't want to necessarily claim it is a bug (and it is a
> bit derogatory) if it were a design decision (though not very
> wise) on the
> implementation's part.
>

How about "non-conformance"?

> >The converse is not true: one bug may cause failure of
> >dozens of tests that involve various invocations of the bug
> >situation. (Example: XT doesn't support xsl:key, so XT will
> >be judged non-conformant in that respect. Other processors
> >may implement keys quite well, but have a certain problem
> >exposed in one case. That case may have to be "molecular"
> >if several "atoms" must interact in a certain way to expose
> >the bug.)
>
> Well said!

I'll have to revise David's well-said example using the new terminology.

>
> >3. The tests should target specific language issues, not
> >"composite issues".
> >
> >The tests should be aimed at the language features and
> >versions that are in scope for the Specification.
> >"Composite issues" are those that would cause parser errors
> >or that involve other W3C specifications that are out of
> >scope for the current test suite.
>
> Is this perhaps a bit redundant?  All of 1, 2 and 3 seem to be repeating
> the same principle.

I'll respond to this in a separate email to include David's 2 comments on
this.

>
> >4. The tests should target "must" provisions of the
> >Specification, not "should" provisions.
> >
> >The Specification contains some requirements that are
> >mandatory ("must") and some that are optional ("should").
> >For the version 1.0 of the test suite, the submitter should
> >only submit the former.
>
> "as the latter is subject to an implementation's discretion".

Okay.

>
> Should we necessarily prevent them from being submitted, or only
> claim here
> that such submitted tests may not end up in the final result.

I'll revise the sentence, which was somewhat whimsical [shouldn't send
should, should send must] to: "Although the Submitter is welcome to send
them, Version 1.0 of the test suite may not include tests of 'should'
provisions."

>
> I think we should encourage as many tests as possible and then ascertain
> their applicability through our review policies.
>
> >5. Later versions of the test suite may allow a broader
> >range of tests.
> >
> >Version 1.0 of the test suite will only test atomic tests
> >of specific in-scope language issues. If submitted, the
> >Committee may not run or include tests involving molecular
> >issues, composite issues, parser issues, errata or "should"
> >provisions of the Specification.

Change to: "Version 1.0 of the test suite will focus on atomic tests..."

> >
> >6. The Committee reserves the right to exclude any test
> >submitted.
> >
> >Please see the Review Policy for a full description of how
> >the Committee will judge eligibility of a test.  [url]
> >
> >7. A test should target only explicit processor choices,
> >not unspecified areas of the Specification.

Change to: "7. A test should target explicit ..."

> >
> >There are areas of the Specification that do not specify
> >what a processor needs to do, so it is impossible to test
> >for what they actually do.  In other areas the processor is
> >given a choice regarding how it behaves.  The remaining
> >areas are unconditional required behaviours.
> >
> >The suite will differentiate test cases based on choices
> >made by the submitter.  The Reviewers need to know if a
> >test corresponds to a particular choice made available to
> >the processor.  (These will be enumerated in the
> >information included with the catalogue document model).
> >The completed test suite will test that portion of The
> >Catalog of Discretion that is deemed "testable" and where a
> >question or two can clearly elicit the choice made by the
> >developer.
> >
> >8. The Submitter must provide a unique test identification
> >(ID) for use by the Committee.
>
> Actually, we will set that ourselves.
>
> The submitter is welcome to send any collection of files in any
> subdirectory structure.  The root subdirectories of the submitter's
> directory structure will be subdirectories of the committee's collection.

I'm unclear, due to your comment under #9, whether the statement in #8 is
accurate, or if it should be replaced by your comment above.  Perhaps this
is not resolved?  I'll leave #8 as is for now.

>
> >The scope of uniqueness of the ID is bounded by the
> >submitter's collection and will be unambiguously modified
> >across the set of collections.
>
> I don't think the above needs to be said.

I'll leave it for now, since it isn't clear to me whether your comment under
#9 applies to the statement above.

>
> >9. The Committee will create a directory tree of test
> >cases, based on the Submitter's ID [?right?] where the
> >first level down the tree separates out the tests by
> >Submitter.

I had no comments that #9 is incorrect, so I'll remove the square bracket
question.

> >
> > From this top level directory will descend each Submitter's
> >test file hierarchy. The test file hierarchy is the
> >hierarchy of test files submitted by the submitter.  The
> >presence of a hierarchy assumes that the submitter does not
> >want to collect all test files in a single subdirectory.
> >
> >[this #9 combines two bullets...]
>
> My comments about #8 can be ignored.

See questions under #8.
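
To make the layout in #9 concrete, here is a minimal Python sketch of how a
submitted file might map into the Committee's collection.  The function name,
the submitter ID and the file path are hypothetical; the only thing taken from
the draft is that the first level separates tests by Submitter and everything
below it mirrors the Submitter's own hierarchy.

```python
from pathlib import PurePosixPath


def published_path(submitter_id: str, submitted_file: str) -> PurePosixPath:
    """Map a file in a Submitter's own hierarchy to its place in the collection."""
    relative = PurePosixPath(submitted_file)
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"expected a path inside the submission: {submitted_file}")
    # First level separates tests by Submitter; the rest mirrors the submission.
    return PurePosixPath(submitter_id) / relative


# Hypothetical submitter ID and file name, for illustration only.
example = published_path("submitterA", "axes/following/test01.xsl")
```

With this arrangement two Submitters can use the same file names without
colliding, since each hierarchy hangs off its own top-level directory.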

>
> >10. The Submitter is welcome to arrange its subdirectories
> >as it wishes.
> >
> >The Committee will collect all the files and make them
> >available in the final collection, mirroring the
> >submitters' subdirectories.
> >
> >11. The Committee suggests that the Submitter should give
> >each test a unique identifier as well.
> >
> >This requirement reinforces that all test cases submitted
> >must be uniquely identified.
> >
> >12. The Committee will also assign a unique identifier to
> >each Submitter.
>
> Already stated in #8 and I think not applicable.

I'll remove #12.

>
> >13. The final test ID will be a concatenation of Submitter
> >and test IDs.
>
> I don't think this is necessary from the point of view of a submission
> policy document.

I'll remove it.

>
> >A submitter's submitted test case will be published as the
> >committee's final test case if not rejected based on the
> >review policy criteria.
> >
> >14. The test scope will be identified by the Specification
> >version and date.
> >
> >As the W3C Recommendations evolve, a particular test may
> >not apply to all versions.  The test suite will contain
> >pointers to the parts of the Specification containing the
> >pertinent sentences.  When a processor fails a case, a
> >Submitter will be able to use the citations to find a
> >sentence in the Specification that the processor violated.
> >
> >15. The test scope will also be identified by the modified
> >date of the errata document.
> >
> >W3C Recommendations have associated errata documents that
> >are published to correct misrepresentations in the text of
> >the documents.  Each is a summary of issues identified and
> >resolved by the committee.  Multiple errata documents may
> >be published each with a date.
> >
> >16. The tests will become public. No royalties will be
> >associated with their use.
> >
> >The Committee intends to retain the personal names of
> >Submitters so they may get public credit for their work.
>
> I see you have a combination of policy issues and logistics.  Should we
> split the above into a short set of immutable policies and a
> longer set of
> pliable logistics that we figure out over time?

I agree that we have logistics and policies mixed.  From a submitter's point
of view it would be easier to have one paper saying 'here's what you should
and shouldn't do' (the Submission Policy) and another saying 'here are the
instructions for submission' (a User's Manual?).


>
> >* * * *
> >
> >Review policy
> >
> >Reviewers should refer to the submission guidelines in the Submission
> >Policy.  [url]  The tests in version 1.0 of the Conformance Test Suite
> >should fail when the processor is non-conformant with a single "must"
> >provision (see Submission policy [Guideline #--]) of the Specification in
> >scope.  All accepted tests are intended to test conformance; when a
> >processor can fail the test and still be valid
>
> perhaps say "fail the test and still produce the anticipated result"

Okay.

>
> >, that test should be
> >excluded.  To the extent possible, Committee Reviewers should remove
>
> "tests exhibiting"
>

Okay.

> >interpretive behaviours.  This will result in equal application of review
> >policy criteria by all involved, thus producing a consistent and quality
> >work product.
> >
> >
> >Review Procedures
> >
> >1. At least two Reviewers check off on each test.  Only the
> assessment of a
> >single member is required for the test to be included in the
> draft release.
> >
> >2. Ineligible tests (by definition) should be rejected.
> >
> >Eligibility is the quality by which a candidate test submitted by a
> >submitter is judged to determine whether it ends up in the test suite as
> >published by the committee.
> >
> >
> >3. Eligibility should be judged by the following:
> >
> >         3.1 The accuracy of the test.
> >         Accuracy of a test is determined by a judgement by the reviewer
> > whether the
> >test case actually tests what the submitter states the test
> >case tests.  The accuracy is the extent to which the test
> matches the cited
> >parts of the Specification.  If it does not match, or only partially
> >matches, the test should be considered inaccurate.
> >
> >This determination is made by the Reviewer's interpretation of the
> >Recommendation, and if necessary, the opinion of the Committee
> as a whole,
> >and if necessary, the definitive assessment by the W3C Working Group.
> >
> >         3.2 The scope of the test.
> >         See the Submission Policy for a definition of Scope.
> >
> >         3.3 The clarity of the test.
> >         Clarity of a test is a determination of whether the
> aspect being
> > tested is
> >clearly described with the anticipated results acceptably explained.
> >
> >         3.4 The clarity of the aspect of the Specification being tested.
> >         The Test Suite aims to test parts of the Specification
> and errata
> > that
> >aren't vague.
> >
> >         3.5 Should/shall use in the Specification.
> >         This is the same as "must" and "should", discussed in
> the Submission
> >Policy.  The test must clearly address a requirement in the Specification
> >that is a "shall" requirement and not a "should"
> >requirement.
> >
> >3.6 Determination of whether a test is testing a discretionary item.
> >The Committee has developed a Catalogue of Discretion items,
> which includes
> >a listing of all options given to developers of the technology in the
> >Specification.  See the website for a list of discretionary items. [url]
> >Not all discretionary items are testable.
> >
> >       3.7 The atomic/molecular nature of the test
> >"Atomic" and "molecular" tests are described in the Submission Policy.
> >
> >4. Judge each eligible test through a process
>
> "processor"
>
> >5. Run each test through multiple processors.
> >Although there is no reference implementation, the Committee will form
> >consensus on which of the prominent processors to use.  The baseline is
> >unanimity of their results, as reduced to infoset-equivalence.
>
> How is this different than 4?

I think this is an error; it is supposed to be part of #4.
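
As a rough illustration of the "unanimity of results" baseline, the Python
sketch below compares the outputs of several processors after canonicalizing
them.  It assumes XML output and uses C14N canonicalization (Python 3.8 or
later) as a stand-in for infoset-equivalence, which is an approximation on my
part and not a definition the Committee has adopted.  The processor names and
outputs are invented.

```python
from typing import Mapping
from xml.etree.ElementTree import canonicalize


def unanimous(outputs: Mapping[str, str]) -> bool:
    """True if every processor's output canonicalizes to the same form."""
    # Canonicalization sorts attributes and expands empty-element tags, so
    # purely lexical differences between serializations do not count.
    canonical_forms = {canonicalize(xml_data=text) for text in outputs.values()}
    return len(canonical_forms) == 1


# Hypothetical outputs from two processors for the same test case.
results = {
    "processorA": '<out b="2" a="1"/>',
    "processorB": '<out a="1" b="2"></out>',
}
agree = unanimous(results)
```

Outputs that differ only in attribute order or empty-tag syntax would count as
agreeing here, while any difference in content would send the test to the
Committee for examination.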

>
> >5. Differences between infoset-equivalence of Submitter and
> Reviewer output
> >will trigger examination by the Committee.
> >
> >6. The Committee will then reach consensus opinion to accept the test,
> >reject the test, or defer deciding on the test while the issue
> is forwarded
> >to the W3C for clarification.
> >
> >A test can be rejected by reviewers even if all prominent processors give
> >the same result when the test is not a true conformance
> >test. If reviewers think it's a good test but different reliable XSLT
> >processors give different results, the issue may be Specification verbiage,
> >processor bugs, or unclear.
> >
> >There are several possible (non-exclusive) actions:
> >         6.1 Reject the test and update errata and errata exclusion
> >         [Question whether this is errata or gray area update??]
> >         The test would then be excluded from the published
> > collection.  The Test
> >Suite control file dictating which submitted tests are excluded from the
> >published collection is updated.
> >[??Furthermore, issuance of an erratum actually gives us a way to include
> >the test case, subject to filtering out at the final stage
> >of "rendering" a suite for a particular processor. Cris:
> >call me if you want a longer-winded description. ??]
> >
> >         6.2 Reject the test with advice to go to W3C.
> >In this case, the submitter thinks the test is accurate and the Committee
> >agrees the test is not accurate and the Recommendation is clear
> enough that
> >we needn't bother the W3C with an interpretation issue.  This requires
> >consensus [was that absolute or near consensus?].
> >
> >This scenario begins when the Submitter looks at the Committee report and
> >sees that a particular case submitted was excluded and writes to ask why.
> >The Reviewer will respond to explain. The response includes
> reference to the
> >W3C's mail alias for questions about the spec.
> >
> >6.3 The test case is forwarded to W3C for clarification
> >If the above options do not avail, the Committee can forward the
> test to the
> >W3C for clarification.
> >
> >         6.4 Additionally, the Committee may wish to accommodate
> external
> > comment
> >from the community at large.
> >
> >       6.5 The Committee will publish a consensus opinion of response to
> >comment with justification from Recommendation (not just
> precedence of how a
> >processor has acted).
>
> I had originally thought all of section 6 above would be part of the
> Submissions policy document as an explanation to the submitter,
> but I think
> now I agree it does belong in the Review policy document as a description
> of the review process.  The submission policy refers to the review policy
> so a submitter should make his way to this clause.
>
> Does this clause belong closer to the start of the document as an
> overview?

Since #6 is part of a process description that would be incomplete without
it, I'd rather leave it here.  We could make a summary statement of it at
the beginning, however, and point Submitters down to this section.

>
> >7. During the testing process, Reviewers will do the following:
> >
> >         7.1 A Reviewer will report to the list the hierarchy of tests
> > undertaken
> >for comparison with multiple processors.
> >
> >         7.2 A tally of tests will be tracked on a visible web
> page for the
> >Committee.
> >
> >         7.3 Reviewers report that all tests in a given
> hierarchy have been
> >examined, including a summary of findings of tests not to be
> included in the
> >resulting suite.
> >
> >         7.4 A given hierarchy is not considered complete
>
> "for a final release"

Okay.

>
> >until reports from at
> >least two members have been submitted.
>
> "A given hierarchy may be included in a draft only after at least one
> member's report is submitted".

Okay.
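
Put together, the two rules for 7.4 could be sketched in Python as below.  The
status labels are hypothetical names chosen for the example; only the
thresholds (one report for a draft, two for a final release) come from the
text above.

```python
def hierarchy_status(report_count: int) -> str:
    """Classify a test hierarchy by how many members have reported on it."""
    if report_count < 0:
        raise ValueError("report_count cannot be negative")
    if report_count >= 2:
        return "final"    # complete for a final release
    if report_count == 1:
        return "draft"    # may be included in a draft release
    return "pending"      # not yet reviewed by anyone
```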

>
> >8. During the testing process, the Committee will invite public review:
> >
> >         8.1 An initial suite of a very small set of files will
> be used to
> > test
> >procedures and scripts and stylesheets.
> >
> >         8.2 The Committee will publish draft work periodically,
> starting

Okay.

> > with a very
> >small set.
> >
> >         8.3 The Committee will solicit comments on usability of
> the product.
> >
> >         8.4 The Committee will publish a disposition of comments.
> >
> >         8.5 The Reviewers will continue testing the files until all the
> > hierarchies
> >are covered.