Subject: Re: Halfway to genericizing the test case catalog
About date data:

I'm anticipating that submitters will send updated test suites in the future, possibly for XSLT 2.0. The submitters will have a provision for tracking the revision date of each individual test case. We only care about that when we are accepting later submissions and want to avoid re-reviewing tests that we reviewed before. Therefore, I'm comfortable leaving supplied dates unmodified. Ken suggests we may want to apply a date to the whole submission, such as the date we received it. That's okay with me, but our reasons for wanting to may actually apply to each individual case, as described above.

More about Identifier:

>><!-- Identifier uses forward slashes as separators, begins with the name
>> of a directory that is directly within the top directory named per Title,
>> and ends with the name-part in Source. -->
>><!ELEMENT Identifier ( #PCDATA ) >

GKH>If we remove Title, as I think we should, then the comment above would
GKH>change.

I don't think so. The Identifier does not include the Title, so it doesn't care about the name of the top-level directory of the tree. I picture the submission coming as a single directory tree of all test cases and other input files; it will thus have a directory name at the top, but that doesn't require that we use the top-level name as supplied. However, we must retain the rest of the supplied tree structure, or else we would get into the business of modifying tests to change things like file paths in xsl:import directives. We could relax the requirement for the submitted catalog to have the Title element, and possibly "category" values for each test, yet impose on ourselves the requirement that the catalog we ship has them.

Creator data:

>Extending this to the entire suite, I've added Creator and Date
>children to <test-catalog> to record the information regarding the entire
>collection (again from the submitter's POV).

I discussed Date above. We should be clearer about what would go in a whole-suite Creator field.
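To make the Identifier convention above concrete, here is a minimal sketch of deriving an identifier from a file's path within a submitted tree. The directory and file names are invented examples, and the exact suffix handling is my assumption, not a committee decision:

```python
# Hypothetical sketch only: derive a catalog Identifier from a file's
# path within a submitted test tree. The submitter's top-level directory
# (named per Title) is dropped; the rest of the tree structure is kept
# verbatim, with forward slashes as separators.
from pathlib import PurePosixPath

def make_identifier(path_in_submission):
    parts = PurePosixPath(path_in_submission).parts
    inner = PurePosixPath(*parts[1:])   # drop submitter's top directory
    return str(inner.with_suffix(""))   # drop extension: name-part only

print(make_identifier("SubmittedSuite/axes/ancestor-01.xsl"))
# axes/ancestor-01
```

Whatever the final rule, the point is that the Identifier is stable under renaming of the top-level directory, so we are free to rename it when merging submissions.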
I think it wouldn't really be the creator(s) in the authorship sense, but rather the person who sent it in, in which case "contact" is a better term. Could a submitter wish to keep their contact information (email address) hidden? How about calling it "submitter" and expecting the name of the organization? Or having both "submitter" and "contact"?

Validating the catalog:

>If we decide that we will still need a validating XML
>processor to validate the structure of our submitted catalogues...

What we need is to ensure that the catalogs can be merged and that the Test Lab can perform a rendition specific to the test platform and the processor's discretionary choices. I think that means we have to either check structure or be responsible for fixing catalog bugs.

Why test cases need to specify discretionary behavior:

>Okay, now during my prototyping I don't see why a test file would specify
>behaviour ... the discretionary document describes possible behaviours,
>not the individual test.

The test case is written to assume one particular choice. For example, consider a test case that has <xsl:element name=" bad name!"> and comes with a correct-output file showing the pass-through behavior. It must only be in the rendered suite when testing a processor that chose the pass-through option. There may be a parallel test, or an equivalent test in a different suite, to test the raise-error behavior (the other option), and it would have catalog data indicating that an error occurs. The pass-through option must be implemented in a specific way, so it is possible that a buggy processor will raise an error it didn't intend, or will instantiate the wrong content where the bad element was requested. Therefore, if the processor under test intends to do pass-through, its output can be compared against the correct output, and it can pass or fail that case.
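The rendering step implied above could work roughly like this sketch: keep a test only when the processor under test declared the choice the test assumes. All element and attribute names here are illustrative, since the catalog format was still being prototyped:

```python
# Hedged sketch: render a suite for one processor by filtering on its
# discretionary choices. Element/attribute names are invented for
# illustration, not taken from the actual catalog design.
import xml.etree.ElementTree as ET

CATALOG = """
<test-catalog>
  <test-case id="elem-badname-passthrough">
    <discretionary item="bad-element-name" choice="pass-through"/>
  </test-case>
  <test-case id="elem-badname-error">
    <discretionary item="bad-element-name" choice="raise-error"/>
  </test-case>
  <test-case id="axes-ancestor-01"/>
</test-catalog>
"""

def render_suite(catalog_xml, processor_choices):
    # Keep tests with no discretionary dependency, plus those whose
    # assumed choice matches what the processor under test declared.
    root = ET.fromstring(catalog_xml)
    kept = []
    for case in root.findall("test-case"):
        disc = case.find("discretionary")
        if disc is None or processor_choices.get(disc.get("item")) == disc.get("choice"):
            kept.append(case.get("id"))
    return kept

print(render_suite(CATALOG, {"bad-element-name": "pass-through"}))
# ['elem-badname-passthrough', 'axes-ancestor-01']
```

Note that the two parallel bad-name tests are never both in a rendered suite: each rendition sees only the one matching the processor's declared choice.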
Discretionary contrasted with gray areas:

>Given the possible transient nature of a gray area to a discretionary
>area,

Actually, that very seldom happens. Usually, when the WG resolves a gray area, they specify one behavior as correct. In the xsl:number cases for our prototype, there was originally a gray area about what to do when a number is negative, especially concerning A and I formats. The erratum dictates one behavior. A test case that anticipates this behavior can be encoded both ways: choosing this behavior on the gray area, and then as non-gray once the erratum applies. I'm sure it will be a pain! Read on!

>...does it make sense to just call them all discretionary and the
>verbiage associated with each will acknowledge their status?

No! Gray areas will probably be subject to repeated fine-tuning. The set of discretionary items is fixed (barring errata creating new ones) and represents conscious intent of the WG. Every gray area is a mistake on the part of the WG, so naturally there is no complete list of all of them. The #1 reason to keep them separate is to avoid polluting the discretionary list with all the transient junk in the gray list. We can hope that in future specs, WGs will actually include a normative list of the discretionary choices as an appendix.

More about operations:

>Regarding "operation", this would force a submitter to constrain
>themselves to what the committee expects to be allowed ...
>I guess that is okay ...

Think of it as environmental variations that are acknowledged in the spec. It does take careful brain-work from the Committee and/or the WG writing the spec, but it should be universal. Consider this: in the ten months (!!) since the original Straw Man came out, nobody has come forward to identify any XSLT operational scenarios other than the original three: standard, embedded stylesheet, and parameterized.
Those three can be discerned in the original spec, but we have the uncomfortable situation that the WG did not produce a list of calling scenarios. (The spec also allows the processor to check the locale (internationalization settings) in which it's running, in very narrow circumstances, but I think those are all "should" provisions anyway.) I think that Ken's uncertainty, expressed above, can be tied to a larger question of how far we go in defining Software Quality Assurance or testing practices. If we don't have a list of calling scenarios from the WG, and we don't create one, then the users of our suite face a great hassle in assessing the union of the scenarios that came in, uncoordinated, from the submitters. Isn't it fair for us to side with the Test Labs and push the submitters to conform to a fixed list of scenarios? That encourages them to think about their test cases as processor-independent conformance tests, which is what we need.

Splitting message compares away from others:

I would like to revisit this question when we have more experience with the prototype. (As indicated earlier, we may have to make compare an attribute of each individual output file, rather than of the scenario as a whole. Look at xsl:document as proposed for XSLT 2.0 and you'll see what I'm anticipating.) But our catalog is only at the Iron/Germanium stage now, meaning that it is time to really exercise it. I made message compares look very similar because I expect the console output to be captured into a file, but it could be that other specs tested under this regime have numbered errors or some other precise expression that the WG creates. Another reason to wait is that we would benefit from feedback from test labs about how automated they want such cases to be.
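If compare does move down to each individual output file, the Test Lab's job per scenario might look like this sketch. The element names and compare keywords are my assumptions for illustration, not settled catalog vocabulary:

```python
# Hedged sketch: "compare" as a per-output-file attribute, for cases
# (like the proposed xsl:document) that produce several outputs needing
# different comparison methods. All names here are illustrative.
import xml.etree.ElementTree as ET

SCENARIO = """
<scenario operation="standard">
  <output-file compare="XML">main-out.xml</output-file>
  <output-file compare="text">second-out.txt</output-file>
  <console compare="message">messages.log</console>
</scenario>
"""

def comparisons(scenario_xml):
    # List the (file, compare-method) pairs the Test Lab must perform
    # for this one scenario.
    root = ET.fromstring(scenario_xml)
    return [(el.text, el.get("compare")) for el in root]

print(comparisons(SCENARIO))
# [('main-out.xml', 'XML'), ('second-out.txt', 'text'), ('messages.log', 'message')]
```

Under this shape, a message compare is just one more entry in the list, which is why captured console output files keep it structurally similar to the others.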
Responsibility for environment preparation:

>>The Committee could push responsibility to the processor developer to
>>provide a script/batch mechanism to take values
>>from standardized data and map them to the specific syntax of their
>>processor.

>Can we not leave this to the testing organization? I think it is out of
>scope of our committee work.

The above was written as a generic form of the issues about setting up input. To instantiate it for our committee, think about setting parameters to be passed in as top-level ones. The processor developer is absolutely responsible for stating how this is done with their product. However, certain test cases come with parameter-setting data (format TBD, as we all know). The Test Lab has to take the data that came in our test suite and transform it into API calls, command-line options, or whatever else sets the parameters as needed by the processor under test. If the Test Lab simply relies upon flimsy documentation, they may get it wrong in subtle ways on some processors, and those processors may look worse in the test results because of it. Thus, the developer has an incentive to ensure that all labs understand how parameters are set for their product. We, on the other hand, want to ensure that our tests are repeatable: that different labs working independently will obtain the exact same results for a given processor and environment when using the same version of our suite. I'm suggesting by the above verbiage that a committee can make the vendors aware of their incentives.

Generic terms for OASIS committees:

>>Additional "scenario" keywords can be devised as necessary,
>>but OASIS should control the naming.

>The configuration instance can control that.

We're saying the same thing here. A given Committee using this design for a test suite looks at their spec and develops a master list of testing scenarios, covering both the "operation" and "compare" aspects. They set their configuration instance accordingly.
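A configuration instance acting as the master list could be exercised roughly as in this sketch. The keyword spellings are assumptions drawn from the discussion above, not committee decisions:

```python
# Illustrative sketch: a committee's "configuration instance" as the
# master list of scenario keywords, used to check incoming submissions.
# Keyword spellings are assumptions for illustration.
ALLOWED_OPERATIONS = {"standard", "embedded", "parameterized"}
ALLOWED_COMPARES = {"XML", "text", "message"}

def check_submission(cases):
    # Return (case-id, offending keyword) pairs for any case whose
    # operation or compare keyword is not on the master lists.
    problems = []
    for case_id, operation, compare in cases:
        if operation not in ALLOWED_OPERATIONS:
            problems.append((case_id, operation))
        if compare not in ALLOWED_COMPARES:
            problems.append((case_id, compare))
    return problems

print(check_submission([
    ("axes-ancestor-01", "standard", "XML"),
    ("trace-01", "command-line-tracing", "text"),
]))
# [('trace-01', 'command-line-tracing')]
```

A check like this is how the committee would push submitters toward the fixed scenario list rather than accepting whatever uncoordinated keywords arrive.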
.................David Marston