OASIS Mailing List Archives

 



docbook-apps message



Subject: RE: DOCBOOK-APPS: How to translate HTML to DocBook


Dave Brooks wrote...
>At 12:53 19/03/2002 +1100, Andrew Westcombe wrote:
>>At 05:00 PM 12/03/2002 -0600, Patrick Hartling wrote:
>>
>>>  It also helps if the source is "good" HTML.  Having closing tags such 
>>> as </li>, </p>, and </br> helps immensely.
>>
>>I've used DocParse myself, it's not bad, and very good value. As for 
>>having "good" HTML, Dreamweaver has a very nice command for stripping out 
>>junk, esp. from former MSWord files.
>
>HTML Tidy (see http://www.w3.org/People/Raggett/tidy/) is very good for 
>cleaning up HTML.

What I like about HTML Tidy is that it can replace <FONT...> and similar
presentational markup with more standard elements that carry CSS classes.
It can also produce XML output, so the differences between the original
HTML and the desired DocBook XML become even smaller.  After that, with a
good text editor of your choice ;-), getting to the final result is much
easier.

I tried to XMLize the HTML from MS Word 97 earlier, before I knew about
HTML Tidy.  It was painful even with Perl at hand.  The (free) HTML Tidy
can really save a lot of work.
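
The clean-up step described above could be scripted roughly as follows.
This is only a sketch, assuming a `tidy` executable is installed and on
the PATH; the helper names build_tidy_command and tidy_html are
hypothetical and not part of Tidy itself. The flags used are Tidy's
-clean (replace FONT and similar tags with CSS-styled elements) and
-asxml (write well-formed XHTML).

```python
import shutil
import subprocess


def build_tidy_command(clean_presentational=True, as_xml=True):
    """Return the argument list for an HTML Tidy run (stdin to stdout)."""
    cmd = ["tidy", "-quiet"]
    if clean_presentational:
        # -clean: replace FONT, CENTER, NOBR with CSS-styled elements
        cmd.append("-clean")
    if as_xml:
        # -asxml: emit well-formed XHTML rather than HTML
        cmd.append("-asxml")
    return cmd


def tidy_html(html):
    """Run Tidy on an HTML string.

    Returns the cleaned markup, or None when no tidy executable is found.
    """
    if shutil.which("tidy") is None:
        return None
    result = subprocess.run(
        build_tidy_command(),
        input=html,
        capture_output=True,
        text=True,
    )
    # Tidy exits with 0 (no problems), 1 (warnings) or 2 (errors).
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return result.stdout
```

The XHTML returned by tidy_html is still HTML vocabulary, not DocBook;
mapping it to DocBook elements remains a separate step done by hand or
with other tools.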

HTH, Petr
-- 
Petr Prikryl, Skil, spol. s r.o., (prikrylp@skil.cz)




