OASIS Mailing List Archives

docbook-apps message


Subject: RE: [docbook-apps] HTML to Docbook

A few links:

Converting HTML to Docbook SGML/XML Using html2db
"html2db is a small utility to convert HTML to Docbook SGML/XML. It uses
TidyLib for parsing the HTML."

"html2db.xsl converts an XHTML source document into a Docbook output
document. It provides features for customizing the generation of the
output, so that the output can be tuned by annotating the source, rather
than hand-editing the output."

"This project was created by JeffBeal. He has been working with DocBook
since November, 2001, and has so far converted three sets of project
documentation from HTML to DocBook. Due to inconsistencies in HTML
coding and the often many-to-one relationship between DocBook elements
and HTML elements, there has always been a need to review and re-tag
manually, but the following process does minimize that effort somewhat."
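To illustrate why a manual review pass is still needed, here is a minimal Python sketch of a table-driven XHTML-to-DocBook mapping. It is not how html2db or html2db.xsl work; the names `MAPPING`, `convert`, and `html_to_docbook` are hypothetical, the input is assumed to be well-formed XHTML without namespaces, and the output is not guaranteed to be valid DocBook. HTML tags whose DocBook counterpart is a guess are marked with a `role` attribute and collected for re-tagging by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping table: HTML tag -> (DocBook tag, is the choice a guess?)
MAPPING = {
    "p": ("para", False),
    "ul": ("itemizedlist", False),
    "ol": ("orderedlist", False),
    "pre": ("programlisting", False),
    "em": ("emphasis", False),
    "code": ("literal", True),  # might really be command, filename, ...
    "tt": ("literal", True),
}


def convert(src, review):
    """Return a DocBook element for the XHTML element src.

    Tags that were guessed (or not in MAPPING at all) are appended to review.
    """
    if src.tag == "li":
        # DocBook list items hold block content, so wrap the text in a para.
        out = ET.Element("listitem")
        target = ET.SubElement(out, "para")
    else:
        name, guessed = MAPPING.get(src.tag, ("phrase", True))
        out = target = ET.Element(name)
        if guessed:
            # Leave a marker so a reviewer can find and re-tag the element.
            out.set("role", "html-" + src.tag)
            review.append(src.tag)
    target.text = src.text
    for child in src:
        converted = convert(child, review)
        converted.tail = child.tail  # keep the text that follows the child
        target.append(converted)
    return out


def html_to_docbook(xhtml):
    """Convert a well-formed XHTML fragment; return (DocBook string, review list)."""
    review = []
    root = convert(ET.fromstring(xhtml), review)
    return ET.tostring(root, encoding="unicode"), review


if __name__ == "__main__":
    docbook, review = html_to_docbook(
        "<p>Run <code>make</code> for <em>all</em> targets.</p>"
    )
    print(docbook)
    print("Review by hand:", review)
```

Real-world HTML would first have to be cleaned up into well-formed XHTML (for example with Tidy) before a mapping like this could be applied.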

The Tidy patch has not been maintained for a while, but it worked quite
well for me some years ago. If you decide to go this route, start with
the Tidy source from March 2003.

Please report your findings!

Kind regards
Peter Ring

For DocBook tools in general, always start here:


and here

  Docbook tools

> -----Original Message-----
> From: Thomas Jones [mailto:admin@buddhalinux.org]
> Sent: 4 March 2005 04:16
> To: docbook-apps@lists.oasis-open.org
> Subject: [docbook-apps] HTML to Docbook
> Does anyone know of a utility to convert HTML to Docbook?
> I found a few utilities mentioned via Google, but none are available,
> and/or they have not been maintained since 2000.
> I thought of building a stylesheet; but what an undertaking!
> ;)
> Thanks,
> Thomas
