docbook message


Subject: FW: YAML spec request


Hi,

When I asked about EBNF things on the list, it was suggested I talk to the
writers of the YAML spec. I did, and Oren wrote back recently. The following is
his first message, and then his reply to my request to put it on the list. I
thought it would be good to get some "views".

The first message explains how his EBNF mechanism actually works. The second
is a more general description of the changes he's made to DocBook for his
authoring work, and why he wanted them.

Note particularly (from the end): "My main problem with DocBook is that if you
want to exert any control over the presentation, you have to step "outside the
system". My thesis from 1992 still renders perfectly in LaTeX, using a simple
"pdflatex thesis.tex", including all the tricks I pulled there to control BNF
productions etc. I'm certain that 14 years from now, I won't be able to
re-generate the YAML spec on some UNIX platform by simply running "make". Even
assuming I fix the catalog, and install XEP, my patches are probably too
version-dependent."

Ruth

-----Original Message-----
From: Oren Ben-Kiki [mailto:oren@ben-kiki.org] 
Sent: Tuesday, February 15, 2005 8:23 AM
To: Ruth Ivimey-Cook
Subject: Re: YAML spec request

Hi Ruth,

The YAML spec is written in DocBook, but not "vanilla" DocBook. To get an idea,
here's what the Makefile rules for the spec look like:

spec.pdf: spec.dbk \
          preprocess_fo.pl preprocess_ps.sed catalog docbook_xslt \
          $(EPS_IMAGES) Render-X-license.txt
	$(XSLTPROC) preprocess_fo.xsl spec.dbk > tmp1.xml
	$(XSLTPROC) single_fo.xsl tmp1.xml > tmp2.xml
	perl preprocess_fo.pl tmp2.xml > tmp3.xml
	$(XEP) tmp3.xml -ps tmp3.ps
	sed -f preprocess_ps.sed tmp3.ps > spec.ps
	ps2pdf spec.ps
	rm tmp1.xml tmp2.xml tmp3.xml tmp3.ps

spec.html: spec.dbk \
           preprocess_png.sed preprocess_html.pl catalog docbook_xslt \
           $(PNG_IMAGES)
	perl verify_terms.pl
	sed -f preprocess_png.sed spec.dbk > tmp1.xml
	$(XSLTPROC) preprocess_html.xsl tmp1.xml > tmp2.xml
	$(XSLTPROC) single_html.xsl tmp2.xml > tmp3.xml
	perl preprocess_html.pl tmp3.xml > spec.html
	rm tmp1.xml tmp2.xml tmp3.xml

In a word, UGLY. I even patch the PostScript files (to get dot-dashed borders,
which don't exist in DocBook). Don't let this scare you off using DocBook -
you can get pretty good results without such craziness.

For BNF productions, my main problem is aligning continuation lines. To make
this painless, the preprocess_{fo,html}.xsl transformations include the
following magic:

  <!-- Handle line break in productions -->
  <xsl:template match="sbr"
    ><sbr
    /><xsl:if test="ancestor-or-self::rhs"
      ><xsl:call-template name="nbsps"
        ><xsl:with-param name="number"
            select="string-length(../../lhs) + 4" 
      /></xsl:call-template
    ></xsl:if
  ></xsl:template>

  <!-- Emit a number of non-breaking spaces. -->
  <xsl:template name="nbsps"
    ><xsl:param name="number"
    /><xsl:if test="$number &gt; 0"
      ><xsl:text>&#160;</xsl:text
      ><xsl:call-template name="nbsps"
        ><xsl:with-param name="number" select="$number - 1"
      /></xsl:call-template
    ></xsl:if
  ></xsl:template>

This automatically inserts the correct number of &nbsp; at the start of each
continuation line. In the source itself I write something like the
following:

  <production id="s-l+block-simple-value(n)">
    <lhs>s-l+block-simple-value(n)</lhs>
    <rhs>
      &nbsp;&nbsp;<nonterminal def="#s-l+block-node(n,c)"
        >s-l+block-node(n,block-out)</nonterminal><sbr/>
      | <nonterminal def="#s-l-empty-block"/>
    </rhs>
  </production>
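
For readers more at home outside XSLT, the padding rule can be sketched in
Python. This is an illustration only, not code from the spec's toolchain; the
name continuation_padding is made up here. It mirrors the select expression
above: the length of the <lhs> text plus 4, emitted as non-breaking spaces.

```python
NBSP = "\u00a0"  # the same character as &#160; in the XSLT above


def continuation_padding(lhs: str, extra: int = 4) -> str:
    """Return the non-breaking spaces to emit after each <sbr/>.

    Mirrors string-length(../../lhs) + 4 from the XSLT template.
    """
    return NBSP * (len(lhs) + extra)


padding = continuation_padding("s-l+block-simple-value(n)")
print(len(padding))  # 25 characters in the lhs, plus 4
```

The XSLT version has to recurse because XSLT 1.0 has no loop or string
repetition construct; in most other languages the same rule is a one-liner.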

I also patch the HTML results, since DocBook isn't nice about HTML classes.
You'd expect it to tag each HTML element with a class saying which DocBook
element it corresponds to, so CSS modifications of look and feel would be easy.
It doesn't. So I do the following (amongst other hacks):

  $line =~ s/width="3%"/class="productioncounter"/g;
  $line =~ s/width="10%"/class="productionlhs"/g;
  $line =~ s/width="5%"/class="productionseperator"/g;
  $line =~ s/width="52%"/class="productionrhs"/g;
  $line =~ s/width="30%"/class="productioncomment"/g;

This removes the hard-wired column widths in HTML, making them automatic, and
lets me configure each column as I want in CSS. In FO, there's no way to do
automatic-width columns, so I just live with it.
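
The same substitution can be sketched in Python with a single lookup table.
This is an illustrative equivalent of the Perl lines above, not the script used
for the spec; the function name widths_to_classes and the sample <td> markup
are assumptions. The class names are copied verbatim from the Perl, including
the spelling "productionseperator", since any CSS written against them depends
on that exact string.

```python
import re

# Width-to-class mapping copied from the Perl substitutions above.
WIDTH_TO_CLASS = {
    "3%": "productioncounter",
    "10%": "productionlhs",
    "5%": "productionseperator",
    "52%": "productionrhs",
    "30%": "productioncomment",
}

WIDTH_ATTR = re.compile(r'width="(\d+%)"')


def widths_to_classes(line: str) -> str:
    """Replace known hard-wired width attributes with class attributes."""

    def swap(match: re.Match) -> str:
        css_class = WIDTH_TO_CLASS.get(match.group(1))
        if css_class is None:
            return match.group(0)  # unknown width: leave the attribute alone
        return f'class="{css_class}"'

    return WIDTH_ATTR.sub(swap, line)


# Hypothetical sample cell, for illustration only.
print(widths_to_classes('<td width="10%">lhs</td>'))
```

Keying on the width values is fragile: it works only as long as the stylesheet
keeps emitting exactly these percentages.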

I suppose someone with much more time on his hands could achieve most of this
by hacking the DocBook FO and HTML XSLT stylesheets, and submitting his fixes
to the distribution. In fact I do some patches by overriding the default XSLT
templates inside single_fo.xsl and single_html.xsl.

I hope this helps,

 Oren Ben-Kiki

-----Original Message-----
From: Oren Ben-Kiki [mailto:oren@ben-kiki.org] 
Sent: Tuesday, February 15, 2005 5:02 PM
To: Ruth Ivimey-Cook
Subject: Re: YAML spec request

> Thanks for your detailed reply. I must admit I am quite new to DocBook 
> and XSLT, and what you're writing seems rather complicated.

That isn't even the half of it. My various patch scripts handle lots of stuff.
Some of these things are due to missing DocBook functionality, some are because
DocBook's way of doing things is _very_ inconvenient to author, and some are
because the FO stylesheets don't offer enough customization ability.

> Would you mind if I echoed your mail to the docbook mailing list for 
> comments (or, if you prefer, could you do it)?

No, go right ahead. I'm not subscribed to the docbook mailing list. As a
docbook user, here's the list of things I had to patch, and my take on them.
Feel free to post it:

Quoting:

<uquote>stuff</uquote> => <quote><userinput>stuff</userinput></quote>
<q>stuff</q> quotes stuff using left and right single quotes (I guess I should
have called it <squote>). Both of these are very useful. I wish DocBook had
them.
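
As a rough illustration of the <uquote> expansion, here is a Python sketch
using the standard xml.etree.ElementTree module. It is not the preprocessing
stylesheet used for the spec, and the function name expand_uquote and the
sample <para> are made up for the example.

```python
import xml.etree.ElementTree as ET


def expand_uquote(root: ET.Element) -> None:
    """Rewrite each <uquote> under root as <quote><userinput>...</userinput></quote>."""
    # Collect the matches first so the tree is not modified while iterating.
    for uquote in list(root.iter("uquote")):
        inner = ET.Element("userinput")
        inner.text = uquote.text
        children = list(uquote)
        for child in children:
            uquote.remove(child)
        inner.extend(children)
        # Reuse the original element as <quote> so its tail text is preserved.
        uquote.tag = "quote"
        uquote.text = None
        uquote.append(inner)


para = ET.fromstring("<para>Type <uquote>stuff</uquote> here.</para>")
expand_uquote(para)
print(ET.tostring(para, encoding="unicode"))
```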

I add Symbols and ZapfDingbats everywhere the monospace font is used (e.g.,
<uquote>), otherwise I get all sorts of characters as black squares. It is so
much easier to write <uquote>a&rarr;b</uquote> than
<quote><userinput>a</userinput>&rarr;<userinput>b</userinput></quote>...

Indexing. I use:

<defterm primary="term" secondary="optional">stuff</defterm>
<refterm primary="term" secondary="optional">stuff</refterm>

And make sure that:
- The 'defterm' is italicized in the text itself and in the index.
- In the HTML, refterm is a link to the defterm, defterm is a link to the index
entry.

I collapse white space in index term names (they somehow sneak into them and
are preserved, which makes them look horrible).
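
The whitespace collapsing amounts to what XPath's normalize-space() does. A
minimal Python sketch of the idea follows; the function name
collapse_whitespace and the sample term are assumptions, not taken from the
spec's scripts.

```python
def collapse_whitespace(term: str) -> str:
    """Strip leading/trailing whitespace and squeeze inner runs to one space.

    Note: str.split() treats any Unicode whitespace as a separator, including
    non-breaking spaces, which normalize-space() in XPath does not.
    """
    return " ".join(term.split())


# An index term as it might appear when the source is wrapped across lines.
print(collapse_whitespace("  block\n      scalar  "))  # block scalar
```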

I insert some line breaks before specific index entries (in the index
page) to ensure that the name of an index entry isn't on a different column
from the page numbers (for a one-line list of numbers! FO is no TeX, that's for
sure).

The built-in DocBook indexing mechanism drove me nuts. It is SO verbose... and
there's so little functionality (except for ranges, which I don't use). This
took serious hacking, but it was worth it.

BNF:

<sbr> automatically adds leading spaces in BNF productions, to align
continuation lines. In HTML, I also emit each BNF production in a table of its
own, with automatic column widths to limit the absurd space wasting the default
behavior causes. BNF layout is a PITA in every typesetting system I know (TeX
included).

Examples:

I use <screen> for "error" examples and <programlisting> for valid ones. 
There are no <validinput> <validoutput> and <badinput> <badoutput> elements...

I have 4 commands: <hl1> to <hl4> that I can use inside screen/programlisting
to highlight a section of the text (color in HTML, border in PDF). This
requires me to actually patch the PS files to properly control the 4 different
border styles. I also allow one-level nesting of these elements (with extra
padding so the inclusion is clear in the example).

You'd think that being able to highlight a piece of text in an example is an obvious
feature... And no, on-the-side annotations aren't a good enough solution. Not
for describing a syntax, anyway.

Presentation:

I have a <keeptogether>stuff</keeptogether> command that prevents stuff from
being broken between pages. OK, a "presentation" issue, but sometimes you
_really_ want to keep stuff together, and the FO doesn't do a good job at it.
Besides, even TeX, which does an excellent job, allows you to override its
default page break algorithm.

I hack some of the FO output to get control over margins, alignment, and
padding that I can't otherwise control. Ideally every value (padding, margin,
border, etc.) should be accessible as parameters. Yes, this would mean that the
FO parameters list would be about 10 or 100 times longer. So?

In HTML, I had to hack the results to assign different classes to different
elements so that CSS would be able to control them (e.g., the BNF table
columns). Again, DocBook should have emitted all these classes - there's no
reason not to.

Figures:

I change the names of included figures from .eps (when using FO to generate
PDF) to .png (when generating HTML).
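
The spec does this with a sed script (preprocess_png.sed in the Makefile
rules). A comparable rewrite can be sketched in Python, assuming the figures
are referenced through DocBook's fileref attribute; that assumption, the
function name eps_to_png, and the sample file name are not from the original
scripts.

```python
import re

# Match a fileref attribute whose value ends in .eps (assumed source form).
EPS_FILEREF = re.compile(r'(fileref="[^"]*)\.eps"')


def eps_to_png(source: str) -> str:
    """Point every .eps fileref at the .png image of the same name."""
    return EPS_FILEREF.sub(r'\1.png"', source)


# Hypothetical figure reference, for illustration only.
print(eps_to_png('<imagedata fileref="overview.eps"/>'))
```

Restricting the match to the attribute keeps any mention of ".eps" in running
text untouched.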

Toolchain:

I use an XML catalog for locating the docbook stylesheets. A different one for
each platform of course, since they are never in the same place on two different
systems. Sigh.

Also, setting up the toolchain for FO was a PITA. We ended up using XEP (which
provides some PDF creation extensions such as a true TOC).

Other than the above small issues, DocBook is great. I have one source that I
generate high-quality HTML and PDF from. I do wish it was less clunky and that
I could have made my changes within the DocBook environment.

My main problem with DocBook is that if you want to exert any control over the
presentation, you have to step "outside the system". My thesis from 1992 still
renders perfectly in LaTeX, using a simple "pdflatex thesis.tex", including all
the tricks I pulled there to control BNF productions etc. I'm certain that 14
years from now, I won't be able to re-generate the YAML spec on some UNIX
platform by simply running "make". Even assuming I fix the catalog, and install
XEP, my patches are probably too version-dependent.

Have fun,

 Oren Ben-Kiki



