Subject: FW: YAML spec request
Hi,

When I asked about EBNF things on the list, it was suggested I talk to the writers of the YAML spec. I did, and Oren wrote back recently. The following is his first message, and then his reply to my request to put it on the list. I thought it would be good to get some "views".

The first message is aimed at how his EBNF mechanism actually works. The second is a more general description of the changes he's made to DocBook for his authoring work, and why he wanted them. Note particularly (from the end):

"My main problem with DocBook is that if you want to exert any control over the presentation, you have to step "outside the system". My thesis from 1992 still renders perfectly in LaTeX, using a simple "pdflatex thesis.tex", including all the tricks I pulled there to control BNF productions etc. I'm certain that 14 years from now, I won't be able to re-generate the YAML spec on some UNIX platform by simply running "make". Even assuming I fix the catalog, and install XEP, my patches are probably too version-dependent."

Ruth

-----Original Message-----
From: Oren Ben-Kiki [mailto:oren@ben-kiki.org]
Sent: Tuesday, February 15, 2005 8:23 AM
To: Ruth Ivimey-Cook
Subject: Re: YAML spec request

Hi Ruth,

The YAML spec is written in DocBook, but not "vanilla" DocBook.
To get an idea, here's what the Makefile rules for the spec look like:

    spec.pdf: spec.dbk \
            preprocess_fo.pl preprocess_ps.sed catalog docbook_xslt \
            $(EPS_IMAGES) Render-X-license.txt
        $(XSLTPROC) preprocess_fo.xsl spec.dbk > tmp1.xml
        $(XSLTPROC) single_fo.xsl tmp1.xml > tmp2.xml
        perl preprocess_fo.pl tmp2.xml > tmp3.xml
        $(XEP) tmp3.xml -ps tmp3.ps
        sed -f preprocess_ps.sed tmp3.ps > spec.ps
        ps2pdf spec.ps
        rm tmp1.xml tmp2.xml tmp3.xml tmp3.ps

    spec.html: spec.dbk \
            preprocess_png.sed preprocess_html.pl catalog docbook_xslt \
            $(PNG_IMAGES)
        perl verify_terms.pl
        sed -f preprocess_png.sed spec.dbk > tmp1.xml
        $(XSLTPROC) preprocess_html.xsl tmp1.xml > tmp2.xml
        $(XSLTPROC) single_html.xsl tmp2.xml > tmp3.xml
        perl preprocess_html.pl tmp3.xml > spec.html
        rm tmp1.xml tmp2.xml tmp3.xml

In a word, UGLY. I even patch the PostScript files (to get dot-dashed borders, which don't exist in DocBook). Don't let this scare you off using DocBook - you can get pretty good results without such craziness.

For BNF productions, my main problem is aligning continuation lines. To make this painless, the preprocess_{fo,html}.xsl transformations include the following magic:

    <!-- Handle line break in productions -->
    <xsl:template match="sbr"
      ><sbr
      /><xsl:if test="ancestor-or-self::rhs"
        ><xsl:call-template name="nbsps"
          ><xsl:with-param name="number"
                           select="string-length(../../lhs) + 4"
        /></xsl:call-template
      ></xsl:if
    ></xsl:template>

    <!-- Emit a number of non-breaking spaces. -->
    <xsl:template name="nbsps"
      ><xsl:param name="number"
      /><xsl:if test="$number > 0"
        ><xsl:text>&#160;</xsl:text
        ><xsl:call-template name="nbsps"
          ><xsl:with-param name="number"
                           select="$number - 1"
        /></xsl:call-template
      ></xsl:if
    ></xsl:template>

This automatically inserts the correct number of non-breaking spaces at the start of each continuation line.
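For readers who want to experiment with the padding rule, here is a minimal Python sketch of what the two templates above compute: one non-breaking space per character of the production's left-hand side, plus four. It is not part of Oren's toolchain. The function names `continuation_indent` and `pad_continuations` are made up for illustration, and representing the right-hand side as a list of already-split lines is an assumption.

```python
from typing import List

NBSP = "\u00a0"  # the non-breaking space emitted by the "nbsps" template


def continuation_indent(lhs: str, extra: int = 4) -> str:
    """Return the padding placed after each <sbr/>.

    Mirrors select="string-length(../../lhs) + 4" from the stylesheet.
    """
    return NBSP * (len(lhs) + extra)


def pad_continuations(lhs: str, rhs_lines: List[str]) -> List[str]:
    """Prefix every line after the first with the continuation padding.

    rhs_lines is a hypothetical pre-split form of the <rhs> content,
    one entry per line, split where the source has <sbr/>.
    """
    if not rhs_lines:
        return []
    indent = continuation_indent(lhs)
    return [rhs_lines[0]] + [indent + line for line in rhs_lines[1:]]


if __name__ == "__main__":
    lhs = "s-l+block-simple-value(n)"
    lines = ["s-l+block-node(n,block-out)", "| s-l-empty-block"]
    for line in pad_continuations(lhs, lines):
        # Show the padding as ordinary spaces so it is visible in a terminal.
        print(line.replace(NBSP, " "))
```

The XSLT version has to recurse because XSLT 1.0 has no built-in way to repeat a string; in a language with string repetition the whole `nbsps` template collapses to one expression.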
In the source itself I write something like the following:

    <production id="s-l+block-simple-value(n)">
      <lhs>s-l+block-simple-value(n)</lhs>
      <rhs>
        <nonterminal def="#s-l+block-node(n,c)"
          >s-l+block-node(n,block-out)</nonterminal><sbr/>
        | <nonterminal def="#s-l-empty-block"/>
      </rhs>
    </production>

I also patch the HTML results, since DocBook isn't nice about HTML classes. You'd expect it to tag each HTML element with a class saying which DocBook element it corresponds to, so CSS modifications of look and feel would be easy. It doesn't. So I do the following (amongst other hacks):

    $line =~ s/width="3%"/class="productioncounter"/g;
    $line =~ s/width="10%"/class="productionlhs"/g;
    $line =~ s/width="5%"/class="productionseperator"/g;
    $line =~ s/width="52%"/class="productionrhs"/g;
    $line =~ s/width="30%"/class="productioncomment"/g;

This removes the hard-wired column widths in HTML, making them automatic, and lets me configure each column as I want in CSS. In FO, there's no way to do automatic-width columns, so I just live with it.

I suppose someone with much more time on his hands could achieve most of this by hacking the DocBook FO and HTML XSLT stylesheets, and submitting his fixes to the distribution. In fact I do some patches by overriding the default XSLT templates inside single_fo.xsl and single_html.xsl.

I hope this helps,

Oren Ben-Kiki

-----Original Message-----
From: Oren Ben-Kiki [mailto:oren@ben-kiki.org]
Sent: Tuesday, February 15, 2005 5:02 PM
To: Ruth Ivimey-Cook
Subject: Re: YAML spec request

> Thanks for your detailed reply. I must admit I am quite new to DocBook
> and XSLT, and what you're writing seems rather complicated.

That isn't even the half of it. My various patch scripts handle lots of stuff. Some of these things are due to missing DocBook functionality, some are because DocBook's way of doing things is _very_ inconvenient to author, and some are because the FO stylesheets don't offer enough customization ability.
> Would you mind if I echoed your mail to the docbook mailing list for
> comments (or, if you prefer, could you do it)?

No, go right ahead. I'm not subscribed to the docbook mailing list. As a docbook user, here's the list of things I had to patch, and my take on them. Feel free to post it:

Quoting:

    <uquote>stuff</uquote> => <quote><userinput>stuff</userinput></quote>

    <q>stuff</q> quotes stuff using left and right single quotes (I guess I should have called it <squote>).

Both of these are very useful. I wish DocBook had them.

I add Symbols and ZapfDingbats to everywhere the monospace font is used (e.g., <uquote>), otherwise I get all sorts of characters as black squares.

It is so much easier to write <uquote>a→b</uquote> than <quote><userinput>a</userinput>→<userinput>b</userinput></quote>...

Indexing:

I use:

    <defterm primary="term" secondary="optional">stuff</defterm>
    <refterm primary="term" secondary="optional">stuff</refterm>

And make sure that:

- The 'defterm' is italicized in the text itself and in the index.
- In the HTML, refterm is a link to the defterm, defterm is a link to the index entry.

I collapse white space in index term names (they somehow sneak into them and are preserved, which makes them look horrible).

I insert some line breaks before specific index entries (in the index page) to ensure that the name of an index entry isn't on a different column from the page numbers (for a one-line list of numbers! FO is no TeX, that's for sure).

The built-in DocBook indexing mechanism drove me nuts. It is SO verbose... and there's so little functionality (except for ranges, which I don't use). This took serious hacking, but it was worth it.

BNF:

<sbr> automatically adds leading spaces in BNF productions, to align continuation lines.

In HTML, I also emit each BNF production in a table of its own, with automatic column widths to limit the absurd space wasting the default behavior causes.

BNF layout is a PITA in every typesetting system I know (TeX included).
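The automatic column widths mentioned above come from the width-to-class substitutions quoted in the first message. As a rough illustration only, the same rewrite can be sketched in Python. The five width/class pairs are copied from the Perl lines; the function name `widths_to_classes`, the single-regex approach, and the sample HTML are assumptions, not Oren's script.

```python
import re

# Width-to-class pairs copied from the Perl substitutions in the first message.
WIDTH_TO_CLASS = {
    "3%": "productioncounter",
    "10%": "productionlhs",
    "5%": "productionseperator",  # spelled as in the original script
    "52%": "productionrhs",
    "30%": "productioncomment",
}

# Matches a hard-wired percentage width attribute such as width="10%".
_WIDTH_ATTR = re.compile(r'width="(\d+%)"')


def widths_to_classes(line: str) -> str:
    """Replace known width attributes with class attributes.

    Widths that are not in the table are left untouched, which is also
    what the five separate Perl substitutions would do.
    """
    def swap(match):
        css_class = WIDTH_TO_CLASS.get(match.group(1))
        if css_class is None:
            return match.group(0)
        return 'class="{}"'.format(css_class)

    return _WIDTH_ATTR.sub(swap, line)


if __name__ == "__main__":
    # Hypothetical table row; the real stylesheet output may differ.
    row = '<td width="10%">lhs</td><td width="5%">::=</td><td width="52%">rhs</td>'
    print(widths_to_classes(row))
```

Once the fixed widths are gone, the browser sizes each column from its content, and a CSS rule per class can set widths or padding where a column needs them.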
Examples:

I use <screen> for "error" examples and <programlisting> for valid ones. There are no <validinput> <validoutput> and <badinput> <badoutput> elements...

I have 4 commands: <hl1> to <hl4> that I can use inside screen/programlisting to highlight a section of the text (color in HTML, border in PDF). This requires me to actually patch the PS files to properly control the 4 different border styles. I also allow one-level nesting of these elements (with extra padding so the inclusion is clear in the example).

You'd think that being able to highlight a piece of text in an example is an obvious feature... And no, on-the-side annotations aren't a good enough solution. Not for describing a syntax, anyway.

Presentation:

I have a <keeptogether>stuff</keeptogether> command that prevents stuff from being broken between pages. OK, a "presentation" issue, but sometimes you _really_ want to keep stuff together, and the FO doesn't do a good job at it. Besides, even TeX, which does an excellent job, allows you to override its default page break algorithm.

I hack some of the FO output to get control over margins, alignment, and padding that I can't otherwise control. Ideally every value (padding, margin, border, etc.) should be accessible as parameters. Yes, this would mean that the FO parameters list would be about 10 or 100 times longer. So?

In HTML, I had to hack the results to assign different classes to different elements so that CSS would be able to control them (e.g., the BNF table columns). Again, DocBook should have emitted all these classes - there's no reason not to.

Figures:

I change the names of included figures from .eps (when using FO to generate PDF) to .png (when generating HTML).

Toolchain:

I use an XML catalog for locating the docbook stylesheets. A different one for each platform of course, since they are never in the same place on two different systems. Sigh.

Also, setting up the toolchain for FO was a PITA.
We ended up using XEP (which provides some PDF creation extensions such as a true TOC).

Other than the above small issues, DocBook is great. I have one source that I generate high-quality HTML and PDF from. I do wish it was less clunky and that I could have made my changes within the DocBook environment.

My main problem with DocBook is that if you want to exert any control over the presentation, you have to step "outside the system". My thesis from 1992 still renders perfectly in LaTeX, using a simple "pdflatex thesis.tex", including all the tricks I pulled there to control BNF productions etc. I'm certain that 14 years from now, I won't be able to re-generate the YAML spec on some UNIX platform by simply running "make". Even assuming I fix the catalog, and install XEP, my patches are probably too version-dependent.

Have fun,

Oren Ben-Kiki