Subject: Re: [docbook-apps] A little XML-to-XML handholding?
Thanks for your help, everyone.

I need to brush up on my DocBook before I reply in real detail. It's been eons. I did know DocBook quite well back in the day, but at the time I was not happy with the available tools. DocBook itself I think is just dandy, but the tools I was using then were a real PITA.

Camille, I'm afraid mine is quite a low-budget operation. However, I'm contemplating using a Kickstarter campaign to finance an initial print run of at least one of my books. If I do that, I expect I could afford to pay for a license for the proprietary version of your tool.

It's been a long time, but I was at one time intimately familiar with the Apache Xerces-C (actually C++) XML DOM API. One approach I could conceivably take would be to write a C++ program that uses Xerces-C to read my essays one at a time, each into its own DOM, then copy the contents of the XHTML elements into the corresponding DocBook 5 XML elements. For <p> to <para> that would be straightforward, but I haven't looked into the other kinds of elements yet, or attributes.

I just now installed some of the DocBook packages on my Mountain Lion MacBook Pro with MacPorts; however, the docbook-utils package would not install, no doubt due to some configuration bug in its port file. I'll report that via the MacPorts trouble ticket procedure.

Best,

Mike Crawford
firstname.lastname@example.org
http://www.warplife.com/

On Mon, Jul 29, 2013 at 6:25 PM, Richard Hamilton <email@example.com> wrote:
> Hi Mike,
>
> I have had very good luck with Herold (http://www.michael-a-fuchs.de).
>
> I'm usually not fortunate enough to have strict xhtml, so we do some pre-processing (usually on well-behaved, but idiosyncratic, html), tidy it up into xhtml, then run Herold.
>
> You may find that you need to do some light pre- or post-processing, but for us it has never been more than a short XSL stylesheet to do things like remove empty paragraphs from the initial XHTML or change the root element in the resulting DocBook (the latter can probably be handled by Herold using Groovy scripts, but I've learned all the scripting languages I need for the time being, so I stick with XSL or Perl :-).
>
> When we build a book, like you're doing, rather than concatenate pieces, we keep each file separate, then create a "book" file that uses XInclude to pull in the chapters. That simplifies the scripting and makes it easier to move parts around in the book.
>
> Regarding the killer feature, if you use the right option (I don't remember it offhand, but it's in Bob Stayton's book (http://sagehill.net)), you can get exactly what you want for links in the hard copy.
>
> Best Regards,
> Dick Hamilton
> -------
> XML Press
> XML for Technical Communicators
> http://xmlpress.net
> firstname.lastname@example.org
>
> On Jul 27, 2013, at 6:18 PM, Michael Crawford wrote:
>
>> Greetings, Earthlings,
>>
>> I have some articles and essays, all marked up as valid XHTML 1.0 Strict with CSS, that I would like to publish as bound, dead-tree books, and possibly also as eBooks.
>>
>> It seems to me that the best way to do that would be to convert each collection of essays into a single DocBook XML document. Can you give me some tips on how to get started? I'm happy to Read The Fine Manual, but there are so many.
>>
>> One such volume, when printed double-sided on US Letter paper, is ~250 pages. The essays range from two to fifty pages.
>>
>> What I _think_ I need to do is to use some manner of XML-to-XML transformation, to strip everything from the beginning of each document, up to and including the opening <body>, then from the closing </body>, to the end of each document....
>>
>> ... then concatenate them all together, with each present XHTML document being a single chapter in the resulting DocBook document...
>>
>> ... then replace HTML-style tags and attributes with DocBook-style: <p> to <para>, for example...
>>
>> ... what would be for me A Killer Feature would be to convert each HTML <a href="..."> hyperlink into a DocBook footnote. So where I have this:
>>
>> ===========
>> a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a far-off corner of the World-Wide Web...
>> ===========
>>
>> it would look something like this in hardcopy form:
>>
>> ===========
>> a long-forgotten cesspool in a far-off corner of the World-Wide Web...
>> ----
>> 1. http://www.kuro5hin.org/
>> ===========
>>
>> I'd also like to design my own custom stylesheets. I'll ask about that later though. I have a copy of "Android Programming: The Big Nerd Ranch Guide" by Bill Phillips and Brian Hardy. In the Acknowledgements, the authors credit Chris Loper of http://www.intelligentenglish.com/ for his DocBook toolchain.
>>
>> That volume is exquisite. I'd like to design my own volume, not to look the same, but to look as good, with my own personal style.
>>
>> Thanks for any advice you can give me.
>>
>> Mike Crawford
>> email@example.com
>> http://www.warplife.com/
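
For illustration only, the steps described in the original question (keep just the <body> content, map <p> to <para>, and turn each <a href="..."> into a footnote) could be sketched with Python's standard xml.etree.ElementTree module. This is not what Herold or the DocBook stylesheets do. The function names (convert_essay, convert_para, append_text) and the sample input are hypothetical, and the sketch assumes well-formed XHTML with no undefined entities such as &nbsp;. It handles only paragraphs and links, and flattens any other inline markup to plain text.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
DOCBOOK = "http://docbook.org/ns/docbook"

# Serialize DocBook elements in the default namespace (no prefix).
ET.register_namespace("", DOCBOOK)


def xh(tag):
    """Qualified name of an XHTML element."""
    return f"{{{XHTML}}}{tag}"


def db(tag):
    """Qualified name of a DocBook 5 element."""
    return f"{{{DOCBOOK}}}{tag}"


def append_text(parent, text):
    """Append text after the last child of parent, or to parent.text if it has none."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def convert_para(p):
    """Turn one XHTML <p> into a DocBook <para>, with links as footnotes."""
    para = ET.Element(db("para"))
    append_text(para, p.text)
    for child in p:
        # Keep the visible text of every inline element (markup is flattened).
        append_text(para, "".join(child.itertext()))
        if child.tag == xh("a") and child.get("href"):
            # The link target becomes a footnote right after the link text.
            note = ET.SubElement(para, db("footnote"))
            ET.SubElement(note, db("para")).text = child.get("href")
        append_text(para, child.tail)
    return para


def convert_essay(xhtml, title):
    """Build a DocBook <chapter> from the paragraphs inside one essay's <body>."""
    body = ET.fromstring(xhtml).find(xh("body"))
    chapter = ET.Element(db("chapter"), {"version": "5.0"})
    ET.SubElement(chapter, db("title")).text = title
    for p in body.iter(xh("p")):
        chapter.append(convert_para(p))
    return chapter


# Hypothetical sample input, based on the example in the original question.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body>
<p>a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a far-off corner of the World-Wide Web...</p>
</body>
</html>"""

if __name__ == "__main__":
    print(ET.tostring(convert_essay(SAMPLE, "Example"), encoding="unicode"))
```

Each essay converted this way could be written to its own chapter file and pulled into a "book" file with XInclude, as suggested above, rather than concatenated. The same mapping could equally be written as a short XSL stylesheet.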