Subject: Thinking about comparing output
Carmelo sent me an email in which he mentioned some dilemmas relating to comparing the output of a test run against the prescribed output. As I was thinking about presenting my take on the issues, I decided to send this to the whole list. (I have also sent this to my colleague Shane Curcuru, who thinks about this a lot on behalf of Lotus. He may send along his thoughts.)

The possible outputs of interest from a test case can be broken down into a tree-like structure. I had earlier proposed that each test be annotated with a "scenario" parameter to describe the framework in which it should be run, which mostly means the inputs and outputs. Additional "output" parameters would name the particular outputs that must be compared, and there could be more than one. This is in Part 4, "Operational Parameters," of my memo entitled "Test Case Markup, Straw Man edition," sent on 9/5/2000.

To some extent, this discussion gets into areas that we may choose to leave up to the discretion of the test labs that use our suite. I don't believe we have complete precision on the location of that dividing line. We should go beyond the line in this analysis to ensure that submitters provide adequate information with their tests.

Here's the tree of output possibilities:

THE INTERESTING OUTPUT IS THE TRANSFORMATION RESULT

In all cases in this group, the easy approach is to send output to a file, but that introduces a certain amount of post-XSLT processing into the test. We could add some questions about "serialization to a file" to our questionnaire for developers, or perhaps have a mini-suite of tests that reveals the details of that serialization.

-1- XML output

We could apply XML Canonicalization (see http://www.w3.org/TR/xml-c14n for the latest draft) and then do a byte-wise comparison. If the processor supports SAX events as a form of output, one could avoid many of the file-output issues by just responding to those events, but how do you introduce the representation of the correct output?
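The canonicalize-then-compare idea for XML output could look something like the sketch below. This is only an illustration, using the C14N support in the Python standard library (3.8+); the file names and the function name are placeholders, not part of any proposed catalog format:

```python
# Minimal sketch: compare a test run's XML output against the prescribed
# output by canonicalizing both documents and then comparing byte-for-byte.
# Canonicalization normalizes details such as attribute order, empty-element
# syntax, and character references that do not affect the infoset.
from xml.etree.ElementTree import canonicalize

def xml_outputs_match(actual_path, expected_path):
    """Return True if the two XML documents are identical after C14N."""
    with open(actual_path, encoding="utf-8") as f:
        actual = canonicalize(f.read())
    with open(expected_path, encoding="utf-8") as f:
        expected = canonicalize(f.read())
    return actual == expected
```

For example, `<doc a="1" b="2"><x/></doc>` and `<doc b="2" a="1"><x></x></doc>` compare equal under this scheme, even though their byte streams differ.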
Similarly, one could try to verify a static in-memory representation of the output.

-2- HTML output

I suspect there are tools that tell you whether two HTML files will produce the exact same appearance when viewed in a browser, though I can't name any. We also need to check correct generation of invisible items like comments, leading back to many of the same issues as with XML.

-3- Text output

This output can vary so much that some form of byte-wise comparison seems inevitable. There could be line-ending issues when the output is a file.

-4- Future: several outputs of mixed type

The W3C Working Group favors formalization of multiple output documents, with varying formats among them. See http://www.w3.org/TR/xslt11req and keep in mind that this is for 1.1, so it's coming soon. Thus, any scheme for designating the type of output (and hence the method of output comparison) should be flexible enough to allow multiple outputs of multiple types.

THE INTERESTING OUTPUT IS ACTIVITY IN THE OPERATING SYSTEM

-5- Processor raises error

The first question is how automatic this should be. Detecting the fact of an error drags in variables such as the operating system and/or the language in which the processor was implemented. Capturing the error message and isolating the interesting portion might also involve operating-system specifics. And do we dare to address the content of the messages? The person who is choosing a processor based on the results of this test battery could very well want to know how well the error messages lead to the problem, but that's not conformance. At a minimum, I think that submitters of deliberate-error test cases should include a sample message text that is as specific as possible.

-6- Processor sends "message" somewhere

This provision is specifically for tests of xsl:message, which is granted wide discretion about where the message goes. Information about where it goes and how it's serialized would have to be solicited on the developer questionnaire.
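As a concrete illustration of what a harness might do for the two cases above, here is a minimal sketch that runs a processor as a child process and captures its exit code, standard output, and standard error. Everything here is an assumption: the spec does not say where error or xsl:message text goes, and the actual channel is exactly the kind of detail the developer questionnaire would have to supply:

```python
# Hypothetical harness fragment: run an XSLT processor command line as a
# child process and capture its exit status and both output streams. A real
# harness would substitute the processor's actual invocation and consult the
# developer questionnaire for which stream carries error/message text.
import subprocess

def run_processor(cmd):
    """Run the given command line; return (exit_code, stdout_bytes, stderr_bytes)."""
    result = subprocess.run(cmd, capture_output=True)
    return result.returncode, result.stdout, result.stderr
```

A nonzero exit code could then serve as the automatic "error was raised" signal for -5-, while the captured stream bytes feed whatever comparison is chosen for -6-.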
But the test case could specify an exact byte stream (except perhaps for line-ending characters) that should be emitted.

-7- Both of the above

This, too, applies to xsl:message, when you are testing the terminate option. This could be a separate scenario or may simply arise from specifying multiple expected outputs as a blend of -5- and -6- above.

THE INTERESTING OUTPUT IS BOTH

At this time, I'm not convinced that we need to explore this too heavily. Given that both of the main branches above have the potential for multiple outputs, we cover this area adequately if we don't create clashes of notation.

ANOTHER ISSUE:

Carmelo speculated about storing the output stream(s) directly in the test catalog. I think that raises numerous operational difficulties, such as when a lab tries to transform the catalog to use its data in a test harness. He also mentioned having a pointer to a file containing the correct output. I think that having the correct output in a file would still allow forms of comparison other than file-to-file, if the harness developer so decides.
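The "exact byte stream, except perhaps for line-ending characters" comparison suggested above for text and xsl:message outputs could be sketched as follows; this is one possible harness-side convention, not a mandated one:

```python
# Minimal sketch: byte-wise comparison of two output streams that treats
# CR, LF, and CRLF line endings as equivalent, so a test passes regardless
# of the platform's line-ending convention.
def bytes_match_ignoring_line_endings(actual: bytes, expected: bytes) -> bool:
    def normalize(data: bytes) -> bytes:
        # Collapse CRLF first so it is not turned into two LFs.
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalize(actual) == normalize(expected)
```

Under this rule, `b"line1\r\nline2"` and `b"line1\nline2"` compare equal, while any other byte difference still fails the test.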