Subject: A little XML-to-XML handholding?

Greetings, Earthlings,

I have some articles and essays, all marked up as valid XHTML 1.0 Strict with CSS, that I would like to publish as bound, dead-tree books, and possibly also as eBooks.

It seems to me that the best way to do that would be to convert each collection of essays into a single DocBook XML document.  Can you give me some tips on how to get started?  I'm happy to Read The Fine Manual, but there are so many.

One such volume, when printed double-sided on US Letter paper, is ~250 pages.  The essays range from two to fifty pages.

What I _think_ I need to do is to use some manner of XML-to-XML transformation to strip everything from the beginning of each document up to and including the opening <body>, and everything from the closing </body> to the end of each document...

... then concatenate them all together, with each present XHTML document being a single chapter in the resulting DocBook document...

... then replace HTML-style tags and attributes with DocBook-style: <p> to <para>, for example...
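
To make those three steps concrete, here is a rough sketch of what I imagine, written in Python with only the standard library. It is an illustration under assumptions, not a tested toolchain: the TAG_MAP table, the function names and the SAMPLE essay are all made up for this example, the mapping covers only two elements, attributes are dropped, and anything not in the table passes through under its HTML name (so the result would not yet be valid DocBook). It also assumes the input parses without a DTD, so named entities such as &nbsp; would need handling first. I gather XSLT is the more usual tool for this job.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical, deliberately tiny mapping; a real one needs many more entries.
TAG_MAP = {"p": "para", "em": "emphasis"}

# Made-up stand-in for one essay.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>First Essay</title></head>
<body>
<p>A <em>very</em> short essay.</p>
<p>Second paragraph.</p>
</body>
</html>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on element names."""
    return tag.split("}", 1)[-1]


def convert(node):
    """Recursively copy an XHTML element, renaming tags found in TAG_MAP."""
    name = local_name(node.tag)
    out = ET.Element(TAG_MAP.get(name, name))  # unmapped tags keep their name
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(convert(child))
    return out


def essay_to_chapter(xhtml_source):
    """Keep only the contents of <body>, wrapped in a <chapter>."""
    root = ET.fromstring(xhtml_source)
    title = root.findtext(f"{XHTML_NS}head/{XHTML_NS}title", default="Untitled")
    body = root.find(f"{XHTML_NS}body")
    chapter = ET.Element("chapter")
    ET.SubElement(chapter, "title").text = title
    for child in body:
        chapter.append(convert(child))
    return chapter


def essays_to_book(sources, book_title):
    """Concatenate the essays, one <chapter> each, under a single <book>."""
    book = ET.Element("book")
    ET.SubElement(book, "title").text = book_title
    for source in sources:
        book.append(essay_to_chapter(source))
    return book


if __name__ == "__main__":
    book = essays_to_book([SAMPLE], "Collected Essays")
    print(ET.tostring(book, encoding="unicode"))
```

Using each essay's <title> as the chapter title is my own guess at a sensible default.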

... what would be, for me, A Killer Feature would be to convert each HTML <a href="..."> hyperlink into a DocBook footnote.  So where I have this:

a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a far-off corner of the World-Wide Web...

would look something like this in hardcopy form:

a long-forgotten cesspool[1] in a far-off corner of the World-Wide Web...
1. http://www.kuro5hin.org/
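
In case it helps to see the hyperlink-to-footnote idea spelled out, here is a minimal sketch in Python (standard library only). The function name links_to_footnotes and the SAMPLE string are hypothetical, and I am assuming that a DocBook <footnote> wrapping a <para> that holds the bare URL is an acceptable target, with the numbering left to whatever stylesheet renders the book. Any markup nested inside a link is flattened to plain text.

```python
import xml.etree.ElementTree as ET

XHTML_A = "{http://www.w3.org/1999/xhtml}a"

# The example paragraph from above, as a standalone XHTML fragment.
SAMPLE = (
    '<p xmlns="http://www.w3.org/1999/xhtml">a long-forgotten '
    '<a href="http://www.kuro5hin.org/">cesspool</a>'
    " in a far-off corner of the World-Wide Web...</p>"
)


def links_to_footnotes(parent):
    """Replace each <a href> below parent, in place, with text plus a footnote."""
    for index, child in enumerate(list(parent)):
        links_to_footnotes(child)
        if child.tag != XHTML_A or not child.get("href"):
            continue  # not a hyperlink, leave it alone
        link_text = "".join(child.itertext())
        # The link text becomes ordinary text just before the footnote.
        if index == 0:
            parent.text = (parent.text or "") + link_text
        else:
            previous = parent[index - 1]
            previous.tail = (previous.tail or "") + link_text
        footnote = ET.Element("footnote")
        ET.SubElement(footnote, "para").text = child.get("href")
        footnote.tail = child.tail  # text that followed the link
        parent[index] = footnote  # one-for-one swap keeps indexes stable


if __name__ == "__main__":
    paragraph = ET.fromstring(SAMPLE)
    links_to_footnotes(paragraph)
    print(paragraph.text)
    print(paragraph[0][0].text)
    print(paragraph[0].tail)
```

The footnote is placed right after the link text, so the reference mark would land where the "[1]" sits in my mock-up.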


I'd also like to design my own custom stylesheets.  I'll ask about that later though.  I have a copy of "Android Programming: The Big Nerd Ranch Guide" by Bill Phillips and Brian Hardy.  In the Acknowledgements, the authors credit Chris Loper of http://www.intelligentenglish.com/ for his DocBook toolchain.

That volume is exquisite.  I'd like to design my own volume, not to look the same, but to look as good, with my own personal style.

Thanks for any advice you can give me.

Mike Crawford

