
xslt-conformance message


Subject: Re: Questions on the Submission and Review Policies

I'm skipping questions where I have nothing to add.

1. Submission Policy (bullets with questions)
[Q: What suggestions do you have for an introductory paragraph?
A: In addition to what Ken and Lynda said: We are working toward thorough
coverage by accumulating tests, rather than reading the specs over and
over and trying to generate a complete checklist.

The following ideas were captured regarding the submission process:

  - prefer atomic tests for 1.0
[Q: What are atomic tests?  Who prefers them?  Prefers them to what?
A: Ideally, an atomic test checks just one assertion of the spec. It may
require other functionality to set up the circumstances of the assertion,
but in an assembled suite, the dependent functionality can be treated as
other atoms in other tests. The Committee prefers them. The other end of
the scale is the Mozilla test: one big test that touches everything.
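
The atomic style can be illustrated with a minimal sketch. The stylesheet,
input, and expected output below are hypothetical, and since Python's
standard library includes no XSLT engine, the code only confirms the pieces
are well-formed XML; a real harness would run them through a processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical atomic test: it exercises a single assertion of XSLT 1.0
# (xsl:value-of yields the string-value of the first selected node) and
# leans on as little other machinery as possible.
STYLESHEET = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <out><xsl:value-of select="doc/item"/></out>
  </xsl:template>
</xsl:stylesheet>
"""
INPUT = "<doc><item>first</item><item>second</item></doc>"
EXPECTED = "<out>first</out>"   # one assertion, so one way to fail

# The Mozilla-style alternative would pile many such assertions into one
# big stylesheet; here we just confirm each piece parses as XML.
for piece in (STYLESHEET, INPUT, EXPECTED):
    ET.fromstring(piece)
```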

Q: What does 1.0 refer to?  The test suite?  Is this defined?
A: Our test suite, though it also happens to apply to XSLT and XPath.

    - target specific language issues, not composite issues...
Q: What are composite issues?
A: If I recall correctly, composite issues referred primarily to tests
that would trip up the XML parser, but could also refer to any other
standard (Namespaces, XBase, etc.) that XSLT uses normatively. XPath
is in scope, however, so an XSLT test that primarily exercises XPath
expressions is acceptable.

    - consider others later
[Q: Other what?  Issues?  What other issues?  When should they be considered?
What boundaries exist on what we should consider and when we should consider it?
A: I would just subsume this into the general discussion, saying that later
we will look at tests involving parser issues, XSLT errata, "should"
provisions, etc. but we won't necessarily promise to test all those aspects.

  - committee reserves right to exclude any test submitted
[Q: What possible reasons might the committee have for exclusion?  Is there a
formal process for notifying the submitter of exclusion?
A: To be listed below. At this stage of the document, just say that one's
submitted tests will be accepted/rejected at the individual case level. I
don't think we'd notify the submitter; those who are really concerned
should join the committee! CONTRAST: We do not reserve the right to alter
an individual test case, just its catalog data.

  - prefer no "should" decisions for 1.0 suite of tests
[Q: What does this mean?  How does this impact submitters?
A: Places where the spec says "should" rather than "must" -- many examples
in Chapter 16 (Output) of XSLT.

    - target only explicit processor choices, not unspecified areas of behavior
[Q: Does this refer to W3C Recommendation on XSLT?  Does this mean the test
suite will only target choices made by the XSLT processor vendors?  Why?
(limitation on scope?  other?)
A: This means that we will test that portion of The Catalog of Discretion
that is deemed "testable" and where a question or two can clearly elicit
the choice made by the developer.

GENERAL PRINCIPLE: Everything in version 1.0 of the test suite should be so
clear that when a processor fails any test case, there is no "wiggle room"
for them to claim they really conform.

  - test identification:
[Q: What is a test id?
A: See the Iron Man document for details on this and subsequent questions.
What the Committee ships will have a directory tree of test cases, where
the first level down the tree separates out the tests by submitter (Lotus,
Saxon, NIST, MSFT, etc. -- names that Committee must keep unique), and then
presumably follows each submitter's own directory structure below there.

    ...- each test will have a unique identifier as well
[Q: What is the identifier?  As well as what?  By test, do you mean each
test file or each submission or test performance, or something else?
A: I'm not sure where that came from. I think the only unambiguous
identifier is the one derived from the full file path. I think we should
SUGGEST that the name of each test case be globally unique across all
cases from that submitter, but it won't break anything if they aren't.

  - test scope will be identified by Recommendation and date
[Q: What is test scope?  Does "Recommendation" = W3C xslt Recommendation?
date of what?  Submission?  Other?
A: See "spec-citation" portion of Iron Man. It's not really the "scope"
but rather pointers to the parts of the spec(s) where one could find the
pertinent sentences. Continuing the GENERAL PRINCIPLE from above: when a
processor fails a case, one should be able to use the citations to find a
sentence in the spec that the processor violated.
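
A rough sketch of how such citations could be carried in catalog data and
used when a case fails. The field names here are invented for illustration;
the real schema is whatever the Iron Man document specifies.

```python
# Hypothetical catalog record; field names are illustrative, not the
# actual Iron Man schema. The citation pins the spec, version, and date,
# since errata can change what a sentence says.
case = {
    "id": "Lotus/patterns/pattern01",
    "spec-citation": [
        {"spec": "XSLT", "version": "1.0", "date": "1999-11-16",
         "section": "5.2"},
    ],
}

def violated_text_pointers(case):
    """On failure, point the reader at the sentences the processor
    violated (the GENERAL PRINCIPLE above)."""
    return [f'{c["spec"]} {c["version"]} ({c["date"]}) sec. {c["section"]}'
            for c in case["spec-citation"]]

print(violated_text_pointers(case))  # ['XSLT 1.0 (1999-11-16) sec. 5.2']
```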

[Q: What else should be included in the Submission Policy?
A: Tests become public. We intend to retain names of submitters (personal
names) so people should get public credit for their work. No royalties.

2. Review policy

[Q: What suggestions do you have for an introductory paragraph?
A: All accepted tests are intended to test conformance; when a
processor can fail the test and still be valid, that test should be
excluded. Then qualify that last part by introducing the Discretionary
stuff, which is a late-stage filter applied by the Test Lab.

  1 - judge the eligibility of a test by:...
      - accuracy of test
[Q: What does accuracy mean?  What is the baseline for determining it?
What is the means for measuring it?
A: Test matches the cited parts of the spec.

...- clarity of test
[Q: What is clarity?  How is it measured?
A: Likelihood of no arguments when a processor fails the test.

      - clarity of aspect of recommendation being tested
[Q: What is clarity of aspect?  Does this refer to W3C Recommendation?
A: Yes. We can continue to submit editorial issues to the WG. To
promote acceptance of our test suite, however, we should emphasize
those parts of the spec (and errata) that aren't vague.

      - should/shall use in the recommendation
[Q: What does this mean?  What recommendation?  Who should/shall?
A: Same as should/must answer in Part 1 above.

      - is the test testing a discretionary item?
[Q: What is a discretionary item?  Defined where by whom?
A: Defined by the Committee (unfortunately) in The Catalog of Discretion.
Note that not all discretionary items are "testable" ones.

      - atomic/molecular nature of the test
[Q: What does atomic mean? (asked above) What does molecular mean? What is
meant by nature (specifically)?
A: Essentially that one failure points out one bug. Converse not true: one
bug may cause failure of dozens of tests that involve various invocations
of the bug situation. (Example: XT doesn't support xsl:key, so XT will be
judged non-conformant in that respect. Other processors may implement keys
quite well, but have a certain problem exposed in one case. That case may
have to be "molecular" if several "atoms" must interact in a certain way
to expose the bug.)

  2 - judge each eligible test through a process
[Q: Who is judging?  Who is being judged?  What process?  Where defined?
A: Reviewers on the Committee judge individual tests. The whole project
should deliver a test suite that various "Test Labs" can use to assess
conformance. The Labs will look to the Committee for assurances that only
true conformance tests have survived the process.

      - run through multiple processors
[Q: XSLT processors?  Whose?  Which one is the benchmark or baseline, or is
there one?
A: Several prominent processors. The baseline is unanimity of their
results, as reduced to infoset-equivalence.
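
Reducing results to infoset-equivalence can be approximated by
canonicalizing each processor's serialized output before comparing, as in
this sketch using Python 3.8+'s `xml.etree.ElementTree.canonicalize`; a
real harness would also have to handle non-XML output methods.

```python
from xml.etree.ElementTree import canonicalize

def equivalent(out_a: str, out_b: str) -> bool:
    """Treat two serialized outputs as agreeing when their W3C
    canonical forms match, so attribute order, quote style, and
    empty-element tags do not count as disagreement."""
    return canonicalize(out_a) == canonicalize(out_b)

# Two processors serializing the same result tree differently:
print(equivalent('<doc id="1" lang="en"><p/></doc>',
                 "<doc lang='en' id='1'><p></p></doc>"))  # True
```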

...- consensus opinion to accept the test, reject the test, or defer
deciding on the test while the issue is forwarded to the W3C for clarification
[Q: What does this mean?
A: Part of our process after the 1-2 reviews. A test can be rejected by
reviewers, even if all prominent processors give the same result, when the
test is not a true conformance test. If reviewers think it's a good test
but different reliable XSLT processors give different results, the issue may
be spec verbiage, processor bugs, or simply unclear wording.

        - possible actions:
        - reject test and update errata and errata exclusion
[Q: What does it mean to reject a test?  In what form is rejection communicated?
What is included in the rejection message?  What errata?  What exclusion?
A: I THINK this means that we may generate new "gray areas" for our list of
gray areas as we go along, but "errata" is a misnomer because only the WG
issues those. Furthermore, issuance of an erratum actually gives us a way
to include the test case, subject to filtering out at the final stage of
"rendering" a suite for a particular processor. Cris: call me if you want a
longer-winded description.

          - reject comment with advice to go to W3C if the submitter is not satisfied
[Q: What does this mean, "reject comment"?  Who advises the submitter, and
of what?
A: This scenario begins when the submitter looks at our report and sees
that a particular case s/he submitted was excluded and writes in to ask
why. We (probably in the person of someone who reviewed the particular
case) respond to explain. The response includes a reference to the W3C's
mail alias for questions about the spec.

          - forward to W3C for clarification
[Q: What is forwarded?  By whom?  Clarification of what?
A: We could ask our own questions, or a person whose case was rejected
might want to ask a question.

      - accommodate external comment from the community at large
[Q: Who will make this accommodation?  How?  Comment on what?  Who is the
community-at-large (specifically)?
A: Judging by the indent level of this item, I believe that this refers
to our plan to publish the collected tests at the same time that we START
assigning them to reviewers. Thus, outside observers (lurkers on this
list) will know the starting point of our reviews. If they have any
comments, I suspect it will just be to urge rejection of particular cases.

[Q: What else should be included in the Review Policy?
A: Clear statement that we only use normative material from the W3C.

  3 - game plan for tests
Perhaps add that a person should not review tests they wrote.
.................David Marston
