docbook-apps message


Subject: Re: [docbook-apps] Webhelp: My adventures therein


Hi Mary,

thanks a lot for sharing your experiences! It was a really interesting
read for me, especially since we also ran into some of the same issues
when customizing the webhelp output.

If you can, please share your customizations; I'd be really interested
in learning from them.

(Also, a huge thanks to Maxime Bégnis for sharing the fast-chunking
webhelp stylesheets! They were a real life-saver for us.)

Kind Regards,

Robert Fekete

On Tue, Sep 9, 2014 at 2:45 AM, Mary Tabasko <tabasko@telerama.com> wrote:
> Hi, all.
>
> Here is the promised write-up of my adventures with Webhelp. It is long,
> so if you don't care, don't bother reading any further! But I hope some
> of you find it helpful. I apologize for the length, but this way, at least
> it's one big message for those who don't care, not a bunch of irrelevant
> little ones.
>
> -- Mary
>
>
> Background:
>
>   For one of our products, we have a doc set that currently consists of
>   three PDFs and two Microsoft Help CHM files.
>
>   PDFs:
>     Admin Guide: 679 pages (13.1 MB)
>     User Guide:  792 pages (61.6 MB)
>     What's New: 36 pages (2 MB)
>
>   CHMs:
>     Admin Helpset: 72.5 MB (includes content of
>          Admin and User Guides, and What's New)
>     User Helpset: 62.9 MB (includes content of User Guide)
>
>   We have had issues with the ancient MSHelp compiler
>   over the ages, and have been getting increasingly worried about
>   its continued viability. It does some strange things on 64-bit
>   systems. So we have been looking to replace it.
>
>   These documents (and many more) are all built using a homegrown
>   toolchain. The documents are mostly written in DocBook (v. 4.4) and
>   converted into various formats using the DocBook stylesheets and
>   customizations. (Some are written in other XML that we convert to
>   DocBook 4.4 using some combination of Perl and XSL.)
>
>   We use Ant, XSLTproc, XEP, Perl, and various other tools to build our
>   docs on both local "development" systems (desktops) and on our
>   build system, with nightly and on-demand builds. We have an entire
>   set of XSL stylesheets that customize the DocBook stylesheets for
>   our "corporate" and "product" styles, and then each project may have
>   a project-specific stylesheet that tweaks the corporate ones. So a
>   project's stylesheet may import a corporate stylesheet, which in turn
>   imports the DocBook ones. Or a project sheet may go straight to DocBook XSL.
>
>   Due to corporate restrictions, it is generally not easy to upgrade
>   things, so we tend to not bother unless we really have to. As a
>   result, we had been using DocBook 4.4 and DocBook XSL 1.74.3 for
>   ages.
>
>   While researching options to replace the MSHelp format, we found
>   nothing that was both suitable and corporately allowable until we
>   noticed that Oxygen (one of the XML editors we have in-house)
>   had a "help" format that looked intriguing. After digging into
>   it, we discovered that it was based on the webhelp transforms
>   in DocBook XSL 1.76.0. Based on some experiments with the stylesheets
>   in Oxygen, we bit the bullet to get the latest and greatest
>   DocBook XSL release. The format looked like it would do a lot
>   of what we wanted, and it was based on the already-established
>   toolchain, so we wouldn't have corporate issues. Could we
>   make it do what we wanted?
>
>   We were eventually able to create webhelp docsets that we could
>   use to replace our CHM archives, but it was non-trivial. The
>   rest of this describes some of the issues we encountered and how
>   we addressed, or didn't address, them. But without the DocBook XSL,
>   we would have been SOL. :) So thank you all again for this wonderful
>   resource!
>
>   Dramatis Personae
>     DocBook 4.4 DTD
>     DocBook XSL 1.78.1
>     XSLTProc using libxml2 2.7.3; libxslt 1.1.24
>     Xalan (for indexing): Xalan-J 2.7.1
>     Perl, Ant, homegrown XSL, and other supporting players
>
>
> Issue with the "Content-Type" meta element.
>
>   A "meta" element for "Content-Type" is written into each
>   of our HTML documents; it has the form of an "open tag":
>   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">,
>   but it is stand-alone.
>
>   The search indexer balks at this (and any other unclosed tags), and
>   indexing fails. Changing the element to <meta ... /> solves
>   the problem. I haven't been able to figure out where
>   this comes from in the XSL transforms, so I was not able
>   to use XSLT to fix it. (This may be an artifact of some of our
>   out-of-date tools.)
>
>   I ended up writing a trivial Perl script that would be
>   run on all the generated HTML files before the search-indexing
>   step, to change <meta ...> into <meta .../>. Inelegant, but
>   effective. This turned out to be really useful later....
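
The clean-up pass described above can be sketched as follows. This is a minimal Python illustration, not the Perl script from the message; the function name `close_meta_tags` is hypothetical, and the sketch assumes that attribute values never contain a ">" character.

```python
import re

# Matches any <meta ...> tag, self-closed or not.
# Assumption: attribute values contain no ">" character.
_META_TAG = re.compile(r"<meta\b([^>]*)>", re.IGNORECASE)


def _close(match: re.Match) -> str:
    attrs = match.group(1).rstrip()
    if attrs.endswith("/"):
        # Already written as <meta ... />; leave it untouched.
        return match.group(0)
    return f"<meta{attrs} />"


def close_meta_tags(html: str) -> str:
    """Rewrite every open-form <meta ...> as <meta ... />."""
    return _META_TAG.sub(_close, html)
```

Such a function would be applied to each generated HTML file before the search-indexing step. Other unclosed elements would need their own patterns.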
>
> Issues with the sidebar TOC.
>
>   The generation of the sidebar TOC for each HTML page bogs down
>   the processing on large documents.
>
>   Generating the HTML for our old HTMLHelp format takes less than
>   2 minutes on our largest doc. When I ran that doc through the
>   Webhelp transform, it OOMed after 6 hours. I noticed that the
>   default chunking level was much higher than what we used for
>   HTMLHelp and wondered if that might be part of the problem. When
>   I changed it, the processing completed successfully in about 2 hours.
>   (That would still be a show-stopper for our nightly builds.)
>
>   But the fact that the processing time was so strongly related to
>   the number of files being created made me suspect the sidebar
>   TOC was the culprit. (I have to admit that it never occurred to
>   me to look for a bug report. I didn't find that until much later!)
>
>   It took some investigation to determine that the TOC generation
>   was indeed the problem, but once I narrowed it down, I split the
>   HTML generation into two steps. I lifted the template that
>   generates the sidebar TOC into a separate stylesheet, and
>   pre-generated a single file containing the sidebar TOC
>   (the <ul id="filetree"> list) as a preliminary step.
>
>   When generating the chunked HTML, instead of regenerating the
>   TOC for each file, we simply read in the pre-generated file.
>
>   Two issues with this:
>
>   1. I needed to use the "generate.consistent.ids"
>      parameter to keep the generated IDs in sync between
>      generating the sidebar TOC and the standard HTML. I had
>      never encountered that parameter before; I was worried I would
>      have to solve this myself, so yay again for the stylesheets!
>      (These generated IDs caused another issue, though, described later.)
>
>   2. Since the TOC was pre-generated once, we lost the insertion
>      of the "webhelp-currentid" attribute for each file. We were
>      willing to take that loss if necessary, especially given
>      that the TOC doesn't "stick" (bug 1226, which we did not attempt
>      to address). But it wasn't necessary.
>
>      Since I already had a Perl script that would be run on all the
>      generated HTML (to fix the "meta" element mentioned above),
>      it was trivial to add a step to reinstate the "webhelp-currentid"
>      attribute at the right place in each file.
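
The post-processing step in item 2 might look roughly like the sketch below. It is written in Python rather than Perl and rests on assumptions the message does not state: that "webhelp-currentid" is written as an `id` attribute on a TOC `<li>`, that each sidebar entry has the shape `<li><a href="FILE">`, and that the entry for the page itself comes before the entries for its subsections. The function name `mark_current_entry` is hypothetical.

```python
import re


def mark_current_entry(page_html: str, page_name: str) -> str:
    """Add id="webhelp-currentid" to the first TOC <li> that links to page_name.

    Assumes entries look like <li><a href="FILE"> or <li><a href="FILE#frag">
    and that no TOC <li> already carries an id attribute.
    """
    pattern = re.compile(
        r'<li>(\s*<a\s+href="' + re.escape(page_name) + r'(?:#[^"]*)?")'
    )
    # count=1 keeps the marker unique within the page.
    return pattern.sub(r'<li id="webhelp-currentid">\1', page_html, count=1)
```

Each chunked file would be passed through this function with its own file name, after the shared pre-generated TOC has been read into it.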
>
>   Handling the sidebar TOC this way kept the processing time to
>   under 2 minutes with no loss of functionality. I realize that this
>   is NOT a general solution and probably not suitable for everyone, but
>   given our build environment and the tools we have available, this was
>   expedient and fit into our "ecosystem" just fine.
>
>   This doesn't address the issue of embedding this TOC in every file.
>   (I hadn't seen the proposed solution noted in bug 1259 before implementing
>   my solution, and I'm not sure I'd be allowed to just download it,
>   given corporate policy, esp. since it includes more JavaScript.)
>
>   We are seeing some issues with the "expand/collapse" indicators on the
>   sidebar TOC. The "treeview" JavaScript inserts "class" attributes
>   with values like "collapsable" and "expandable" to indicate the
>   state of the TOC entry (embedded lists). We often see expanded
>   lists given the attribute "expandable" rather than "collapsable",
>   which means that the "rollup" indicators are incorrect. This seems to
>   happen mostly with pointers to sections inside pages, so I suspect
>   that this is an interplay between the chunking level and the
>   "treeview" JavaScript. (I suspect that it doesn't happen if each link
>   goes to a separate page, or at least if no page contains more than
>   one level of expandable sections.) I tried to run this down
>   to the source (the stylesheets only provide the minified JS library),
>   but it looks like this library went out of support in 2010 and is
>   no longer being maintained. (Because of corporate policies, I can't
>   casually download the original JS library.) Since this affects only
>   the visual collapse/expand indicators, not the functionality, we are
>   willing to live with it for now.
>
> Issues with links to local (within-page) IDs.
>
>   We noted that within-page links did not work. We found the
>   messages on the docbook-apps list about this, and tried
>   commenting out the salient block in the "main.js" file. This
>   fixed the problem for most links within a page (those within "content").
>   (We tried using the fix in the later snapshot, but we didn't see any
>   difference.)
>
>   We also noted another problem with generated links from the sidebar TOC.
>   If you were on a page like, say, "bk01.html" and tried to navigate
>   to "bk02ch01s04#id-4.1.3.4.6" (a totally made-up id value, but
>   the format is what we got), the correct page and local link would load
>   (that is, the new page would be scrolled to the local link), but the
>   sidebar disappeared, and the sidebar toggle would not bring it back.
>
>   (Clicking the Next link followed by the Previous link would restore
>   it, but the direct navigation from the sidebar TOC always clobbered the
>   sidebar.)
>
>   The problem only occurred with generated IDs. Navigating from the
>   sidebar TOC on "bk01.html" to "bk02ch01s04#using-passwords" worked
>   fine. Looking at the gross structure of the links in the sidebar
>   TOC revealed no differences. The difference had to be in the structure
>   of the values of the IDs.
>
>   By default, the "object.id" template with "generate.consistent.ids"
>   set makes values like "id-4.2.6.3". I played around with these values
>   a bit and determined that changing the "dots" to "dashes" solved the
>   problem. That is, links with id values like "id-4-2-6-3" worked just
>   fine. (The original ids work fine within the content block; it's only
>   using them from the sidebar TOC that causes the problem.)
>
>   I could find no way to tell the "generate-id" function to alter this
>   structure, so I had to override "object.id" and do it myself. (The
>   problem appears to be in some piece of JavaScript, but I have not
>   attempted to find it. The browser follows the links fine.)
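
The dots-to-dashes rewrite can be illustrated with a small Python sketch. Note that it is a post-processing alternative applied to generated HTML, not the "object.id" override described above, and the function name `dash_generated_ids` is hypothetical. The message does not identify the cause of the failure. One possibility, unconfirmed, is that a script passes the fragment to a CSS-style selector, where an unescaped "." introduces a class name.

```python
import re

# Generated ids as described above: "id-" followed by dot-separated numbers.
# The leading \b keeps the pattern from matching inside longer words.
_GENERATED_ID = re.compile(r"\bid-\d+(?:\.\d+)+")


def dash_generated_ids(text: str) -> str:
    """Replace the dots in generated ids with dashes (id-4.2.6.3 becomes id-4-2-6-3)."""
    return _GENERATED_ID.sub(lambda m: m.group(0).replace(".", "-"), text)
```

Because the same function rewrites both `id` attributes and the `href` fragments that point to them, links and targets stay in sync as long as every file is processed. Explicitly assigned ids such as "using-passwords" are left alone.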
>
>   For completeness, I put "." characters into a couple of our explicitly
>   provided IDs and the links to them. They then exhibit the same problem:
>   the sidebar does not appear when you traverse to such an ID. (This
>   was not a browser-specific problem, either.)
>
>   Note: Unless you have "." in your explicit IDs or have set
>   "generate.consistent.ids" for some other reason, this issue wouldn't affect
>   anyone who didn't generate the sidebar TOC separately like we did.
>
> Issues with styling and layout.
>
>   The webhelp XSL templates provide some customization mechanisms, but
>   we found that we often needed to override pieces that provided no
>   handy hooks. And having our CSS file as the first one in the doc
>   header meant that it was constantly fighting with the "built-in"
>   stylesheets ("positioning.css", the jQuery stylesheets, and the
>   CSS elements embedded right into the pages). There were some CSS
>   items we could not figure out how to override using just our stylesheet.
>
>   We spent a lot of hours simply trying to figure out where some bit of
>   styling was coming from, and then more time trying to figure out how
>   to override it. I eventually decided that trying to work around that
>   huge block of CSS imports, JavaScript, and embedded CSS in every page
>   wasn't worth the effort.
>
>   In the end, I took apart the "user.head.content" template
>   in "webhelp-common.xsl" and refactoring it. I tried to use only the
>   customization hooks that were provided, but I just couldn't do it. :)
>
>   I broke "user.head.content" into several smaller templates (one to insert
>   CSS imports, one to insert JavaScript, etc) and reimplemented the
>   original template simply to call the other templates. That way, I could
>   selectively override the parts I wanted/needed to. I could then easily
>   import our stylesheet last, which let me move all of the CSS elements that
>   were being embedded into each page into our CSS instead (and change
>   them).
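
The refactoring pattern described above (one large template split into small parts, with the original reduced to calling them) is shown here as a Python analogy. It is not XSLT and not the actual customization; the class names, method names, and file paths are all illustrative assumptions.

```python
class HeadContent:
    """Stand-in for one large head template split into small, overridable parts."""

    def css_links(self) -> list:
        # Illustrative path; the real stylesheets emit their own set of links.
        return ['<link rel="stylesheet" href="common/css/positioning.css"/>']

    def scripts(self) -> list:
        return ['<script src="common/main.js"></script>']

    def render(self) -> str:
        # The original entry point now only calls the smaller parts.
        return "\n".join(self.css_links() + self.scripts())


class ProjectHeadContent(HeadContent):
    """Project layer that overrides a single part and inherits the rest."""

    def css_links(self) -> list:
        # Import the project stylesheet last so that, at equal specificity,
        # its rules take precedence over the built-in ones.
        return super().css_links() + ['<link rel="stylesheet" href="project.css"/>']
```

The customization layer only touches the part it cares about, which is the role that overriding one of the smaller named templates plays in the XSL version.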
>
>   This made styling the documents MUCH easier. It also meant that
>   I didn't have a big blob of CSS repeated in every HTML page.
>
>   We wanted to change the layout of the items in the header, like the
>   nav bar, to be consistent with other collateral we have. More overrides.
>   I also found it necessary to parameterize some of the other templates
>   (like "user.header.content") called from "chunk-element-content".
>   I ended up overriding a LOT of stuff. Again, these changes are probably
>   not ideal as general approaches (though I think breaking up some of the
>   big templates and refactoring them, and maybe adding more parameterized
>   customization hooks, are), but most of my fixes were geared toward solving
>   my specific problem in my specific environment.
>
>   I also found that we had to alter some of the colors embedded into
>   "main.js" to get the effects we wanted. I really didn't want to have
>   to change "main.js", but we couldn't find any other way to
>   get the changes we wanted. (This was before we discovered the
>   local-link issue that required us to change the file anyway.)
>
>   There was no elegant way to override some of the jQuery styling,
>   particularly replacing images. We simply had to replace their image
>   files (keeping the names) with our images, since the JavaScript
>   is responsible for getting the images in. We created a "customization
>   template" (a directory with the same structure as the template, but with
>   our project-specific variants (images, main.js) in it), and we simply
>   slapped this on top of the template from the stylesheets when building
>   the docs.
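
The overlay step can be sketched in Python as two directory copies. This is a minimal illustration, assuming Python 3.8 or later for `dirs_exist_ok`. The function and parameter names are hypothetical, and the original build used Ant and other tools rather than this script.

```python
import shutil
from pathlib import Path


def build_output_tree(stock_template: Path, customization: Path, output: Path) -> None:
    """Copy the stock template, then overlay same-named project files on top."""
    shutil.copytree(stock_template, output, dirs_exist_ok=True)
    # Files in the customization tree replace stock files that have the same
    # relative path; stock files without a counterpart are kept as they are.
    shutil.copytree(customization, output, dirs_exist_ok=True)
```

Because replacement is by relative path, the customization directory has to mirror the template's layout and keep the original file names, as described above.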
>
>   The one thing that really drove us crazy was the fact that we could not
>   figure out how to change the size of that header. We tried a bunch
>   of different things, and in the end, we just dealt with what we had.
>   I'm sure it's in that jQuery UI Layout stuff, but none of us was familiar
>   with that package, and we just didn't have the time to try to sort it out.
>
> I would be more than happy to share the customizations we made to the
> stylesheets, my Perl script and so on, if anyone is interested in seeing
> them. Like I've said, my solutions are probably NOT general-purpose
> solutions, but they worked for us and may be helpful to some of you.
> I can also send a screen-shot of what our final output looks like.
> I don't want to send this out generally, since I suspect most readers
> of this list are not interested.
>
>                                      -- 30 --
