
docbook-apps message



Subject: Re: [docbook-apps] Refentry, man pages, and DocBook XSL 1.69.0


Hi Peter,

You wrote:

[...]

> That said, for now we'll continue to use the docbook-to-man package for 
> *roff conversion, though I'll be using XSL for HTML conversion. 
> docbook-to-man (originally by Fred Dalrymple and now under the Debian 
> umbrella) is 3-4x faster than the XSL, and provides the same functionality. 
> I've modified the Debian version to work with XML (originally it was SGML 
> only).
> 
> I'd be interested in any ideas about the speed difference; I assume it's 
> because docbook-to-man doesn't use the DOM and just streams (using OpenSP 
> and a stylesheet transformer called Instant).

I do not know much at all about how docbook-to-man works.[1] Maybe
Mr. Borgert (who is its current maintainer, I think) can comment.

You mention that docbook-to-man is 3-4x faster than the current
XSL manpages stylesheet.[2] I know that kind of speed improvement
may or may not make a big difference, depending on how many pages
you process at a time.[3]

I realize the stylesheet is relatively slow, but I'm interested in
knowing just how slow. Can you comment on the following?

  - your total build time with the manpages stylesheet versus your
    build time with docbook-to-man

  - how many Refentry instances you're processing during each build

  - what XSLT engine you use
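
A rough sketch of one way to collect the first number: the small
shell helper below prints the elapsed wall-clock time of any command
in whole seconds. The name elapsed() is hypothetical, and the paths in
the usage comment are placeholders for your own setup. Whole-second
resolution is an assumption that suits a full build better than a
single page.

```shell
#!/bin/sh
# elapsed: run a command and print its wall-clock time in whole seconds.
# Hypothetical helper for illustration; not part of xsltproc or docbook-to-man.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1      # discard the command's own output
    end=$(date +%s)
    echo $((end - start))
}

# Example usage (adjust paths and file names to your environment):
#   elapsed xsltproc /sandbox/xsl/manpages/docbook.xsl patchutils.xml
```

Running the same helper around each tool's build command gives two
figures that can be compared directly.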

Anyway, I should tell you that speed optimization is not a primary
design goal of the manpages stylesheet, or of the DocBook XSL
stylesheets in general. If speed is a high priority for you, I
think that, in general, using XSL is not the best way to go, and
in particular, using the DocBook XSL stylesheets isn't.

That said, I have tried not to make the manpages stylesheet any
slower than it absolutely needs to be.[4] And there are some
parameters you can feed to it to make it run faster. For example:

  xsltproc \
    --stringparam man.output.quietly 1 \
    --stringparam man.charmap.enabled 0 \
    /sandbox/xsl/manpages/docbook.xsl \
    patchutils.xml

That will cause the stylesheet not to emit a message each time it
writes a man page to the filesystem, and cause it to skip using
the new "character map" feature (which isn't always needed).

For many documents, if you run the manpages stylesheet with the
above options, the time it takes is roughly on the same order as
the time needed to process the same doc with the html/docbook.xsl
stylesheet from the DocBook XSL stylesheets distro. And it is
faster than processing it with the html/chunk.xsl stylesheet.[5]

So I do not think the manpages stylesheet is excessively slow
relative to generating a different output format for the same
document via XSLT.

However, I can understand it being much slower than a solution
such as docbook-to-man that does not use XSLT. I very much doubt
that, no matter what speed optimizations I tried to implement, the
manpages stylesheet would ever come anywhere close to generating
output as quickly as a non-XSLT solution.

  --Mike

[1] From what I've seen, docbook-to-man does seem stream-based,
    which, among other things, means (unless I misunderstand)
    that it has limitations that the manpages stylesheet does not.
    For example, I don't think it can reorder any document
    content. If so, that means it cannot be made to handle
    Footnote correctly, and maybe some other things as well.

[2] I actually would not be surprised if docbook-to-man were much
    more than 3 or 4 times faster in some instances -- documents
    that contain a lot of Ulinks, for example, or ones that
    contain a lot of Unicode special characters and symbols.

[3] For example, in the simplest case, if you are processing a
    single man page, and it takes less than 2 seconds to
    process it with docbook-to-man, and 5 to 7 seconds to process
    it with the manpages stylesheet, it's not such a big deal. But
    if you are processing 100 man pages and it takes 3 or 4
    minutes to process them, instead of 1 minute, it matters.

[4] I guess I should mention that the 1.69.0 version takes at
    least 10 or 20 percent longer to process a particular document
    than the 1.68.1 version does. Part of the reason is that it is
    doing a lot more things than the 1.68.1 version. The 1.70.0
    version is likely to be even slower still.

[5] Below are some actual numbers based on running Tim Waugh's
    patchutils.xml file through the HTML & manpages stylesheets.

    Note that patchutils.xml is a 2315-line document containing 14
    separate Refentry instances.

      manpages w/o output.quietly=1 & charmap.enabled=0     3.2s
      manpages w/  output.quietly=1 & charmap.enabled=0     2.3s

      html/chunk.xsl w/o chunk.quietly=1                    3.6s
      html/chunk.xsl w/  chunk.quietly=1                    2.9s
      html/docbook.xsl                                      2.1s
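
    To put the timings above in relative terms, here is a minimal
    sketch that computes the percentage of run time saved by the
    faster settings. The helper name saved() is hypothetical; the
    inputs are the figures listed above, in seconds.

```shell
#!/bin/sh
# saved: percentage of run time saved going from "before" to "after" seconds.
# Hypothetical helper for illustration only.
saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

saved 3.2 2.3   # manpages, with vs. without the two parameters
saved 3.6 2.9   # html/chunk.xsl, with vs. without chunk.quietly=1
```

    By this arithmetic the two parameters save roughly 28 percent for
    the manpages run, and chunk.quietly=1 saves roughly 19 percent
    for html/chunk.xsl.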
