

Subject: Test Case Markup, Wooden Man edition


In this document, I describe information that should be associated with a
test case for (1) identification, (2) description and mapping to spec
provisions, (3) filtering (choosing whether or not to execute with a given
processor) and discretionary choices, and finally (4) some operational
parameters. I believe it is fair to say that each test case is
represented by a stylesheet file, with the operational parameters then
used to set up all inputs for that particular case. The data described below
can be accumulated into a catalog of test cases, in XML of course, with one
<TestCase> element for each case. However, good code management practices
would probably dictate that the creators of these cases retain the
definitive data in the primary stylesheet file. A catalog file can be
generated from the stylesheets. The catalog file would be the definitive
version as far as the OASIS package is concerned.
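For concreteness, here is a minimal sketch of one catalog entry. The
element names are those proposed below; the wrapper element and all the
values are invented for illustration:

  <TestCatalog>
    <TestCase>
      <FilePath>Lotus/Xpath</FilePath>
      <CaseName>substring01</CaseName>
      <Purpose>Test substring() with a negative start index</Purpose>
      <SpecCitation>
        <Rec>XPath</Rec>
        <Version>1.0</Version>
        <Section>4.2</Section>
      </SpecCitation>
      <Scenario>Standard-XML</Scenario>
    </TestCase>
    <!-- ...one TestCase element per case... -->
  </TestCatalog>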

At this stage, I am not attempting to answer the question of how the data
can be stored in the XSL file, if the provider has chosen to do so. Lotus
has chosen (so far, anyway) to embed each item in a special comment,
because the comments do the least to perturb the test harness. With Xalan,
it is possible to retrieve the values from these comments and perform
transformations on the stylesheets to obtain data about the tests. The
other approach that I could see is for OASIS to designate a namespace URI
for this information, and the values to be stored in top-level elements
associated with that namespace. Any XSLT processor under test would have to
be very conformant about handling of "foreign" top-level elements to be
able to run the tests at all. At our 9/6/2000 meeting, we agreed to
research whether OASIS has a policy on assigning namespaces under their
domain name.
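To make the two approaches concrete, here is a sketch of each; the
comment convention and the namespace URI are invented for illustration,
not Lotus' or OASIS' actual choices:

  <?xml version="1.0"?>
  <!-- Purpose: Test substring() with a negative start index -->
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:tc="http://www.oasis-open.org/xslt-conformance">
    <!-- namespace alternative: a "foreign" top-level element -->
    <tc:Purpose>Test substring() with a negative start index</tc:Purpose>
    <!-- ...templates for the test itself... -->
  </xsl:stylesheet>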

Within the catalog, each test is represented as a <TestCase> element with
numerous sub-elements. The values would be text content of the
sub-elements, and most would be interpreted as strings. Values can be
interpreted numerically, specifically in inequality relations, when they
refer to versions, dates, and the like.

(1) IDENTIFICATION
Following on discussions at our previous meetings, each submitter of a
group of tests should choose a globally-unique "SuiteName", which string
should also be valid as a directory name in all prominent operating
systems. Thus, Lotus would submit a test suite called "Lotus" and the OASIS
procedures would load it into a "Lotus" directory. SuiteName does not have
to be a sub-element in the <TestCase> element. A submitted suite can have
arbitrary directory structure under that top-level directory, captured in
the "FilePath" element for each case, with forward slashes as the directory
delimiters. The actual name of the particular file (and test case) would be
in the "CaseName" element, which should be a valid file name in all
prominent operating systems. The FilePath contains the full path between
the SuiteName and CaseName, inclusive. Note that the test suite may contain
directories that have no test cases, only utility or subsidiary files.

OASIS may bless a particular hierarchical organization of test cases. If we
do, then a separate parameter called "Category" should be used to track
where the test fits in OASIS' scheme of categories. That way, OASIS
categories will not dictate the directory structure or the case names. At
the 9/6/2000 meeting, a question was raised about whether a case might
occupy more than one category. If the purpose is clear, probably not, but
we need to assess the uses to which the categories will be put. We may need
a category named "Mixed" if we don't have a clean partitioning.

Submitters should be encouraged to use the "Author" element to name
contributors at the individual-person level. They may also wish to use an
element called "SubmissionDate" to record, as yyyymmdd, the date stamp on
the test case. That will allow the submitter to match cases with their own
source code management systems, and will likely aid in future updates,
either due to submitter enhancements or W3C changes.
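Pulling section (1) together, the identification portion of one case
might read (all values invented for illustration):

  <TestCase>
    <FilePath>Lotus/Xpath</FilePath>
    <CaseName>substring01</CaseName>
    <Category>XPath-String-Functions</Category>
    <Author>A. Developer</Author>
    <SubmissionDate>20000906</SubmissionDate>
    <!-- description, citations, etc. follow -->
  </TestCase>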

(2) DESCRIPTION AND MAPPING TO SPEC PROVISIONS
Submitters should have a "Purpose" element whose value describes the point
of the test. This string should be limited in length so that the document
generated by the OASIS tools doesn't ramble too extensively. We might also
want an "Elaboration" element whose length is unlimited. Nothing in this
document should be construed as discouraging the use of comments elsewhere
in the stylesheet to clarify it.

There should be one or more "SpecCitation" elements to point at provisions
of the spec that are being tested. The pointing mechanism is the subject of
a separate discussion. The more exact it is, the less need there is for an
"Elaboration" string, and the better the inverse mapping from the spec to
the test cases. The SpecCitation element contains a "Rec" sub-element to say
which spec (XSLT, XPath), a "Version" sub-element to say which version
thereof, and some form of text pointer. To encourage submissions before the
pointer scheme is final, we may need to accept alternative sub-elements of
different names: <Section> for a plain section number, <DocFrag> for use of
fragment identifiers that are already available, and <OASISptr> for the
scheme we ultimately choose.
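For example, a citation of the XPath substring() provision could take
either of these interim forms (the section number and fragment
identifier are illustrative):

  <SpecCitation>
    <Rec>XPath</Rec>
    <Version>1.0</Version>
    <Section>4.2</Section>
  </SpecCitation>

  <SpecCitation>
    <Rec>XPath</Rec>
    <Version>1.0</Version>
    <DocFrag>#function-substring</DocFrag>
  </SpecCitation>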

(3) FILTERING AND DISCRETIONARY CHOICES
The XSLT 1.1 effort is underway, so we need to anticipate it in our test
case organization, even if we're only trying to cover version 1.0 right
now. In addition to being tied to the XSLT spec, the cases rely on a
particular version of XPath and will soon also involve XBase. XML
Internationalization or XInclude may also affect the test suites. Each
pertinent standard should be cited by version number, but also flagged as
to its errata status, if relevant. The Version elements mentioned above are
numeric so that inequality tests may be applied. The XSLT spec version
should always be present, and should be set to 1.0 if the test is really
about XPath or some other associated spec. In other words, any test that is
essentially pure XPath should try to rely on XSLT 1.0 for its XSLT portion
if at all possible. Any test that is essentially about a newer spec, such
as XBase, should specify the lowest practical level of XSLT, which may have
to be higher than 1.0 if XSLT modifications are necessary for the newer
facility to work at all.

We have begun to catalog discretionary choices available to the processor
developer, and these choices have names. These choices should be encoded in
elements which act as excluders when a test suite is assembled. By serving
as excluders, we eliminate the need to specify all 39 or so in every test
case; if a discretionary item is not mentioned, the test case doesn't care
about that item and should be included for any choice made on that item. I
hope that in most cases, the value can be expressed as a keyword from a set
of keywords designated by the committee. For example, the
<signal-comment-non-text-content> sub-element of <Discretionary> contains
a (string) value of either "error" or "ignore" to show that the case
should be excluded when the processor under test made the other choice on
this item. Depending on
the choice, there could be parallel tests (differently named), with
distinct parallel "correct output" files, for different values of the
choice, and only one would be selected in any assembly of a test suite.
Question: does anyone care whether the sub-elements for the individual
discretionary items are under separate <Discretionary> elements or all
lumped in one?
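The two arrangements in question would look like this (the second item
name is an invented placeholder):

  <!-- separate wrappers, one per item -->
  <Discretionary>
    <signal-comment-non-text-content>error</signal-comment-non-text-content>
  </Discretionary>

  <!-- or one wrapper with all pertinent items lumped together -->
  <Discretionary>
    <signal-comment-non-text-content>error</signal-comment-non-text-content>
    <some-other-item>keyword</some-other-item>
  </Discretionary>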

Errata are independent of newer spec versions, and multiple errata could be
issued per version. The flexible approach is to have a SpecCitation
sub-element named "ErrataAdd" that takes on a numeric value like 0 (base
document), 1 (first errata issued on the indicated version of the indicated
spec), 2, etc. "ErrataDrop" ranges from 1 upward and indicates that the
test case is no longer pertinent as of that errata version. The Add and
Drop levels would allow a test case to be marked as being relevant for
errata that later get further clarified. ErrataDrop must always be
numerically greater than ErrataAdd. Spec errata parameters need only be
specified where the test applies to a specific erratum, or the base
document only, because they are used for filtering.
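For instance, a case that becomes relevant with the first errata
document and is superseded by the third would carry (values invented):

  <SpecCitation>
    <Rec>XSLT</Rec>
    <Version>1.0</Version>
    <ErrataAdd>1</ErrataAdd>
    <ErrataDrop>3</ErrataDrop>
  </SpecCitation>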

Vague areas in the spec would be handled in the same manner as the
discretionary items above, with <GrayArea> substituting for the
<Discretionary> and the abbreviated names chosen from The Catalog of Vague.
This is where the errata level is likely to come into play, since errata
should clear up some vague areas. Once again, the tester has to ask the
developer to answer questions about their design decisions, and the answers
should be encoded using keywords which can then be matched to the
<GrayArea> elements. If we're clever, one test case could serve as both a
GrayArea for one choice and as the lone case for ErrataAdd, when that
GrayArea choice is the one that the errata later chose.
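A sketch of that dual-use case, with the vague-area name and its keyword
as invented placeholders:

  <!-- this case embodies the reading of a vague area that the
       first errata document later confirmed -->
  <GrayArea>
    <some-vague-area>interpretation-A</some-vague-area>
  </GrayArea>
  <SpecCitation>
    <Rec>XSLT</Rec>
    <Version>1.0</Version>
    <ErrataAdd>1</ErrataAdd>
  </SpecCitation>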

(4) OPERATIONAL PARAMETERS
At Lotus, we have thought a lot about how comments in the test file can
describe the scenario under which the test is run, though we have not yet
implemented most of the ideas. These parameters describe inputs and
outputs, and a <Scenario> element could describe the whole situation
through its value, which is a keyword. In the three "Standard" scenarios,
one XML file whose name matches the XSL stylesheet file is used as the
input document, and output is expected in a file that could then be
binary-compared to the "correct output" file. The exact names of the three
scenarios are "Standard-XML", "Standard-HTML", and "Standard-Text",
corresponding to the three methods of xsl:output and the three possible
methods of comparison. One or more <InputFile> and <OutputFile> elements
could be used to specify other files needed or created, and the values of
these elements should permit relative paths. A single InputFile element
could be used to specify that one of the heavily-used standard input files
should be retrieved instead of a test-specific XML file. (Lotus has
hundreds of tests where the XML input is just a document-node-only trigger,
and we would benefit from keeping one such file in a Utility directory.)
The implication of the latter rule is that if there exists even one
InputFile element, no inputs are assumed and all must be specified.
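A case that uses the shared trigger document and produces one extra
output might specify (the relative paths are invented):

  <Scenario>Standard-XML</Scenario>
  <InputFile>Utility/trigger.xml</InputFile>
  <OutputFile>secondary-out.xml</OutputFile>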

If the Scenario keyword says "ExtParam", then the processor should be
launched with parameters being set via whatever mechanism the processor
supports. We may want to push responsibility to the processor developer to
provide a script/batch mechanism to take values in a standardized way and
map them to the specific syntax of their processor. We would still need to
define a method, probably involving an extra input file (i.e.,
<CaseName>.ini) but possibly using more parameters in the test case, where
the test case can store the parameter names and values.
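Under that method, a file such as substring01.ini might hold nothing
more than name/value pairs (the format and the names are invented for
illustration):

  inputParm1=someValue
  inputParm2=anotherValue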

If the Scenario keyword says "XMLEmbed", then the XSL stylesheet is not
really wanted, and the test should run as if the XML file alone sufficed. The
stylesheet file should probably do nothing and contain only comments.
Nevertheless, we may again want the processor developer to supply a
mechanism to set this up, since the way in which the stylesheet is marked
inapplicable will vary.

We also want to be able to test that a message was issued (as in
xsl:message) and that an error was issued. I propose that elements
"ConsoleStandardOutput" and "ConsoleErrorOutput" be used to designate
strings that must be present in the respective outputs. The Scenario
keyword "MatchStandardOutput" or "MatchErrorOutput" would be an instruction
that says: when running this test, capture the standard/error output into a
file, and ignore the normal transformation output. The test of correctness
is to grep for the designated string in the captured output file. If a
tester wished, they could get actual error message strings from the
processor developer and refine the test harness to search for those exact
messages in error output. In that case, the string in the
ConsoleErrorOutput element is used as an indirect reference to the actual
string.
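For instance (the message fragment is invented):

  <Scenario>MatchErrorOutput</Scenario>
  <ConsoleErrorOutput>circular variable reference</ConsoleErrorOutput>

The harness would capture the error stream to a file and grep it for
"circular variable reference" to decide pass or fail.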

Additional "Scenario" keywords can be devised as necessary, but OASIS
should control the naming. We might want to allow names beginning with a
specific letter to be local to particular test labs. For example, we would
reserve all names beginning with "O-" and instruct the testlabs that they
should put their name as the next field, then another hyphen, then their
local scenario keywords (e.g., O-NIST-whatever) that allow them to set up
local conditions as needed.

HOW IT WORKS
When generating a specific instance of the test suite, a test case can be
excluded on any one of the following bases:
- A Discretionary item of a given name is set to a different value.
- A GrayArea item of a given name is set to a different value.
- The SpecCitation/Version value on the test case is numerically larger
  than what the processor implements. (This could be for any spec named,
  not just XSLT.)
- There is a SpecCitation for a spec (e.g., XBase) that the processor
  claims not to implement.
- The test lab wishes to test against an errata level that is numerically
  lower than the ErrataAdd or higher than the ErrataDrop for a spec.
Thus, it is the "user" (test lab) who renders a test suite by deciding
which spec version and errata level they wish to test, and by specifying
the settings of the Discretionary and GrayArea items they know. Before
running the specific rendition, they must ascertain how they will handle
those tests that run in ExtParam and possibly other scenarios, taking into
account the operating system where the tests will run and
processor-specific input and output design.
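Because the catalog is itself XML, the rendering step can be an XSLT
transformation over the catalog. Here is a minimal sketch of just the
version filter, assuming the element names proposed above (everything
else, including the parameter name, is invented):

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- highest XSLT version the processor under test implements -->
    <xsl:param name="xslt-version" select="1.0"/>
    <xsl:template match="/">
      <TestCatalog>
        <xsl:apply-templates select="//TestCase"/>
      </TestCatalog>
    </xsl:template>
    <xsl:template match="TestCase">
      <!-- keep the case only if no cited XSLT version exceeds
           what the processor implements -->
      <xsl:if test="not(SpecCitation[Rec='XSLT' and
                    Version &gt; $xslt-version])">
        <xsl:copy-of select="."/>
      </xsl:if>
    </xsl:template>
  </xsl:stylesheet>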

Note that the test suite itself is not filtered by Scenario values. The
test lab may wish to devise a harness that can be configured to exclude
certain scenarios from some runs, but I think we want to encourage testing
and reporting against the full range of scenarios.

When a test case is included, it is run according to the value of the
<Scenario> element. If inputs are specified, they are marshalled as
necessary. If no inputs are specified, a single file named <CaseName>.xml
is assumed to be the input. In some scenarios, special steps must be taken
to capture the output. In the standard scenarios, if no outputs are
designated, the name of the intended output file is generated from the
final part of the scenario name and from the CaseName. (Probably
<CaseName>.xml, <CaseName>.html, and <CaseName>.txt, unless someone has a
better idea.)

Comments on this proposal are encouraged now, so we can start getting
annotations in place on existing test cases.
.................David Marston


