Subject: Re: [docbook-apps] Strip docbook-5 to content only
On Mon, 24 Mar 2014 10:51:55 +0100
Stefan Knorr <sknorr@suse.de> wrote:

> Hi Dave,
>
> On 23/03/14 12:32, davep wrote:
> > I'm playing with a grammar checker that isn't as yet XML friendly.
> > One option is to strip all markup and pass through to the grammar
> > checker having expanded any xincludes.
>
> Interesting -- what checker do you use, if I may ask?

https://languagetool.org/
With a few niggles it is grammar checking DocBook 5 with few problems.

> > Issues:
> > 1. Plain text output. Ideally block -> newline, inlines ->
> >    whitespace separation.
> > 2. Indexing is a special. Null template for <db:indexterm/>
> > 3. Ditto (remove markup) for toc
> >
> > Can anyone think of any other 'specials' that might need
> > processing to obtain a simple text file ready for a spell checker?
>
> Since I am trying to implement some sort of style/terminology checker
> here, here are the rules I use to prepare the text before the
> terminology check:
>
> https://www.gitorious.org/style-checker/style-checker/source/999eb9696fed15e75b01eee2febbb28562fc3144:source/xsl-checks/terminology.xslc
>
> You can see that I try to hide things like literals and keys from the
> style checker. The ##@sth## format is because I am using regular
> expressions and wanted a format that is distinctive but does not
> contain any regular expression characters.

"Style"? I'm not sure I understand what is meant by style, Stefan?
LT is very much a grammar checker in the classical style - they
are trying to integrate it with Open Office and other such tools.

I'll try your stylesheet if I may - a good starting point.

--
regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
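
For illustration, the stripping steps listed in the thread (block -> newline, drop indexterm and toc) could be sketched roughly as below. This is a minimal sketch in Python rather than the XSLT discussed here, and it is not Stefan's stylesheet. The function names and the DROP and BLOCKS sets are assumptions chosen for the example, XIncludes are assumed to be expanded already, and inline text is simply kept in place with no extra whitespace added.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"  # DocBook 5 namespace

# Hypothetical element lists for illustration; the thread does not fix them.
DROP = {DB + "indexterm", DB + "toc"}
BLOCKS = {DB + "para", DB + "simpara", DB + "title", DB + "listitem"}


def strip_markup(elem, out):
    """Append the text content of elem to out, skipping DROP elements."""
    if elem.tag in DROP:
        return
    if elem.text:
        out.append(elem.text)
    for child in elem:
        strip_markup(child, out)
        # Tail text belongs to the parent, so keep it even when the
        # child itself was dropped.
        if child.tail:
            out.append(child.tail)
    if elem.tag in BLOCKS:
        out.append("\n")  # block -> newline


def docbook_to_text(xml_source):
    """Return plain text, one block per line, from a DocBook 5 string."""
    parts = []
    strip_markup(ET.fromstring(xml_source), parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parts).split("\n"))
    return "\n".join(line for line in lines if line)


sample = (
    '<article xmlns="http://docbook.org/ns/docbook">'
    "<title>Intro</title>"
    "<para>Run <command>make</command>"
    "<indexterm><primary>make</primary></indexterm> now.</para>"
    "</article>"
)
print(docbook_to_text(sample))
```

Dropping an element while keeping its tail text matters here: the indexterm sits mid-sentence, and the words after it still belong to the paragraph.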