dita message

Subject: Re: [dita] Testing of RNG / DTD / XSD
From: Eliot Kimber <ekimber@contrext.com>
To: DITA TC <dita@lists.oasis-open.org>
Date: Tue, 15 Apr 2014 11:09:38 -0500
The testing I've set up so far is in SVN under doctypes/test/tools. It
currently consists of the following components:

1. A set of valid map and topic base documents, at least one per shell,
that should be valid against their corresponding shells.

These are intended primarily to test that the shell document types (and,
by extension, the modules they integrate) are parseable, but they also
contain element and attribute instances that test edge cases, markup new
to DITA 1.3, or cases that were incorrect in earlier versions and that I
wanted to regression test.

2. A set of invalid map and topic base documents that should not be valid
against their corresponding shells. This set is not yet complete--I've
been adding to it as I've found that cases I took to be valid were in fact
not valid. I also don't yet have a way to automatically verify that the
documents are correctly flagged as invalid (see below).

3. The Ant script build.xml in the doctypes/test/tools directory that does
the following:

A. Uses the base valid docs to generate grammar-type-specific instances
for each grammar type (DTD, XSD URL-based, XSD-URN-based, RELAX
NG-URL-based, RELAX NG-URN-based, RNC-URL-based and RNC-URN-based).

B. DTD validation using the Ant <xmlvalidate> task.

C. DTD validation using Saxon

I need both forms of DTD validation because the <xmlvalidate> task fails
completely if any DTD is not parseable, while Saxon does not. So between
the two I seem to get complete coverage of parsing issues.

D. XSD validation using the Ant <schemavalidate> task. This seems to
accurately report issues with the XSDs, meaning I don't seem to be getting
any false negatives with the currently-generated XSDs.
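
As a rough illustration of step A, the sketch below shows one way a single
base document could be turned into several grammar-specific instances by
swapping in a different prolog or schema reference. It is not the logic in
build.xml and is written in Python rather than Ant; the function name, the
template table, and every identifier in it (including the "{root}.dtd"
system ID pattern) are placeholders chosen for illustration.

```python
# Hypothetical sketch: wrap one base document body in grammar-specific
# prologs. All identifiers below are illustrative placeholders, not the
# values used by the actual test suite.

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'

# Prolog templates keyed by grammar type. {ref} is the schema reference.
PROLOGS = {
    "dtd": '<!DOCTYPE {root} PUBLIC "{ref}" "{root}.dtd">\n',
    "rng": '<?xml-model href="{ref}" '
           'schematypens="http://relaxng.org/ns/structure/1.0"?>\n',
    "rnc": '<?xml-model href="{ref}" '
           'type="application/relax-ng-compact-syntax"?>\n',
}


def make_instance(root, body, grammar, ref):
    """Return the text of one grammar-specific instance.

    root    -- name of the document element, e.g. "topic"
    body    -- the base document, starting at the root start tag
    grammar -- "dtd", "xsd", "rng", or "rnc"
    ref     -- public ID (DTD) or URL/URN of the schema (others)
    """
    body = body.lstrip()
    if not body.startswith("<" + root):
        raise ValueError("body must start with the <%s> element" % root)
    if grammar == "xsd":
        # XSD is referenced by attributes on the root element, not a prolog.
        attrs = (' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
                 ' xsi:noNamespaceSchemaLocation="%s"' % ref)
        return XML_DECL + body.replace("<" + root, "<" + root + attrs, 1)
    return XML_DECL + PROLOGS[grammar].format(root=root, ref=ref) + body
```

Calling make_instance once per grammar type and reference style (URL or
URN) for each base document would give the full set of instances to
validate.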

For RNG/RNC the Jing tool, which is the only available Java-based RELAX NG
validator that I know of, does not support catalogs, so it's not currently
possible to validate the URN-based RNG/RNC documents, shells, and modules.
I'm actively working on getting URL-based RNG/RNC validation set up
through Ant but it's been a lower priority than XSD generation. It should
just be a matter of calling jing from Ant using the Ant <java> task.
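
A minimal sketch of that kind of invocation, written here in Python rather
than as an Ant <java> task, and assuming Jing is available as a local
jing.jar. The helper names and paths are hypothetical. The command shape
(java -jar jing.jar [-c] schema file..., with -c selecting compact syntax)
is Jing's usual command line; treating a non-zero exit status as failure
is an assumption of this sketch.

```python
import subprocess


def build_jing_command(jing_jar, schema, documents, compact=False):
    """Build the argument list for one Jing command-line run."""
    cmd = ["java", "-jar", jing_jar]
    if compact:
        cmd.append("-c")  # schema is in RELAX NG compact syntax (RNC)
    cmd.append(schema)
    cmd.extend(documents)
    return cmd


def run_jing(jing_jar, schema, documents, compact=False):
    """Run Jing and return (ok, output).

    Assumption: a non-zero exit status means validation problems were
    reported. Output from stdout and stderr is merged for log analysis.
    """
    result = subprocess.run(
        build_jing_command(jing_jar, schema, documents, compact),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode == 0, result.stdout
```

For the URL-based case the schema argument would be a file path or URL to
the shell's .rng or .rnc file, since no catalog resolution is involved.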

I've also been doing "unit testing" via direct validation in Oxygen as I
work through things, but I've been depending on my automated test suite to
validate my work as I do it.

I've also set up a Jenkins server on CloudBees that runs these same tests
automatically any time anything new is committed under the doctypes/
directory in SVN. This automated regression testing is intended to be a
check once the RNG-to-DTD/XSD/RNC generation is generally proven to be correct so
that we can make adjustments to the vocabulary or tools and have them
automatically verified in a publicly-visible place. I've set it up to send
me email for any failure and it can easily be configured to send email to
anyone, including the DITA TC mailing list if appropriate. That is, once
we have all the tests passing for 1.3, I think it makes sense to have the
automation inform the TC if anything breaks since any breakage would be
unexpected and bad.

Right now the test does a simple log analysis looking for any error or
warning message and considers the test to have failed if it finds any
error or warning. This analysis could be more sophisticated but this is
good enough for now.
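
A simple log check of this kind might look like the following sketch. It
is in Python for illustration only (the real check runs inside the Ant
build), and the pattern and function names are assumptions, not the ones
used in build.xml.

```python
import re

# Any line containing a word that starts with "error" or "warning" counts
# as a problem. This is deliberately crude: a summary line such as
# "0 errors" would also be flagged.
PROBLEM = re.compile(r"\b(?:error|warning)", re.IGNORECASE)


def problem_lines(log_text):
    """Return the log lines that mention an error or warning."""
    return [line for line in log_text.splitlines() if PROBLEM.search(line)]


def run_passed(log_text):
    """The run passes only if no line mentions an error or warning."""
    return not problem_lines(log_text)
```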

Additional testing that needs to be implemented includes:

- Testing correct detection of invalid documents. Unfortunately, with
simple parsing plus log analysis for "error" there's no obvious way to
treat invalidation of documents as success rather than failure. The
approach I've been thinking of is to implement a Java class that
essentially inverts the messages from a parser, reporting failure as
success and success as failure. This should be easy enough to do. It might
also be possible to do it directly in Ant by capturing a log using an Ant
log recorder and then applying a different regular expression to the log
or transforming the log via text replacement and injecting the result into
the final log analyzed.

- Adding additional test case documents that exercise more cases and
therefore provide more detailed checks of specific content models. These
documents are tedious to author.

- Implementing schematrons that validate the RNGs themselves--George Bina
implemented some at the start of his work on DITA RNG but the RNG details
have changed since then and I haven't been able to update the schematrons
to match. But the schematrons can check many details, including
specialization requirements (correct specialization hierarchy, @domains
values, etc.).

- Using the OT preprocessing as an additional check where the schematrons
are not sufficient (not sure what that might be).
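
For the first item in the list above, the inversion idea might look like
the following sketch. It is in Python rather than the Java class mentioned
there, and the data shape (a mapping from document name to the list of
validator messages produced for it) and the function names are
assumptions made for illustration.

```python
def check_expected_invalid(messages_by_doc):
    """Invert validation results for documents that must be rejected.

    messages_by_doc maps a document name to the list of error messages
    the validator produced for it. A document that produced no errors
    was wrongly accepted, so it is reported as a failure.
    """
    failures = []
    for doc, messages in sorted(messages_by_doc.items()):
        if not messages:
            failures.append("%s: expected validation errors, got none" % doc)
    return failures


def inverted_log(messages_by_doc):
    """Render a log in which only wrongly accepted documents say ERROR.

    Correctly rejected documents get an "OK" line, so a log analysis
    that fails on the word "error" keeps working on the inverted log.
    """
    lines = []
    for doc, messages in sorted(messages_by_doc.items()):
        if messages:
            lines.append("OK: %s rejected (%d message(s))" % (doc, len(messages)))
        else:
            lines.append("ERROR: %s was accepted but should be invalid" % doc)
    return "\n".join(lines)
```

The original parser messages are dropped from the inverted log on
purpose, since they would otherwise trip the same error check.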

Cheers,

Eliot
—————
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 4/15/14, 9:43 AM, "Kristen James Eberlein"
<kris@eberleinconsulting.com> wrote:

> Subject: Testing of RNG / DTD / XSD
> Date: Tue, 15 Apr 2014 08:25:53 -0500
> From: Robert D Anderson <robander@us.ibm.com>
> To: Eliot Kimber <ekimber@rsicms.com>, chris.nitchie@oberontech.com,
>     dhelfinstine@ptc.com, Scott Hudson <scott.hudson@schneider-electric.com>
> CC: Kristen James Eberlein <kris@eberleinconsulting.com>,
>     Eric Sirois <esirois@ca.ibm.com>
>
> I was supposed to start this thread last week but fell down on that.
> Better late than never.
>
> We need to discuss how we want to handle testing of the RNG -- what
> policy do we want to have in place, who is doing the work, and (if
> needed) how can that work be repeated by others. I think it's important
> that whatever our Official Process becomes, anybody should be able to
> set it up and repeat it with minimal work. That was not the case with
> DITA 1.2, where everything relied on a long series of tools and scripts
> on my own system.
>
> High points - here are the things I did while testing with 1.2:
> * Kept an XML rendering of each doctype
> * For each new feature:
> ** Integrate the new change
> ** Verify that the DTD still parsed (run through my generally very
>    picky Omnimark parser, open in a validating editor)
> ** Verify that the desired new markup was there
> ** Regenerate the XML version of the DTD, and do a diff to ensure no
>    unintended changes
> * Repeat for each new feature
>
> For 1.3 I think Eliot has already been doing some of this - validating
> with parsers, and ensuring that the new markup is available.
>
> Do we want to keep up the "make sure no unintended consequences" test,
> and if so, how? I think this is much more difficult with doctypes that
> are already essentially complete (it's easiest when checking as each
> feature is added).
>
> Do we have tools that can do other DITA based validation -- ensure that
> the specialization is correct, maybe catch a Learning and Training
> element that has an incorrectly constructed class attribute, etc?
>
> Who here wants to sign up for testing of RNG, DTD, or XSD? We don't want
> to be wasting time, but it might be a good thing if we're doing some of
> this testing with different parsers, for example -- I've found things in
> the past that opened OK in Arbortext, while Omnimark threw out an error,
> or vice versa.
>
> Thanks,
>
> Robert D Anderson
> IBM Authoring Tools Development
> Chief Architect, DITA Open Toolkit (http://dita-ot.sourceforge.net/)
Follow-Ups:
- Re: [dita] Testing of RNG / DTD / XSD
  - From: Eric Sirois <esirois@ca.ibm.com>
References:
- Testing of RNG / DTD / XSD
  - From: Kristen James Eberlein <kris@eberleinconsulting.com>