Subject: Re: [docbook-apps] html2docbook: issue with <h2/>
Hi Bob,

Thanks for the info. Indeed, herold seems at first to give slightly better results (at least it parses <h2> as I expected). However, it turns all links into href, instead of doing what I think is smarter:

"Absolute links (starting with "http://") remain absolute and become <ulink>s. Other links become <xref>s."

Ref: http://wiki.docbook.org/topic/Html2DocBook#head-f915b9937f1226e0abb325e6a6335e12d20be0c4

There does not seem to be a best solution here. But the original html2docbook is open source, so I can always fix it.

Thanks

On Wed, Jan 27, 2010 at 6:39 PM, Bob Stayton <bobs@sagehill.net> wrote:
> Looking at the stylesheet, it appears not to be able to generate nested
> sections from <h1>, <h2>, etc. headings. It only generates a single
> <section> element for the entire HTML file.
>
> The basic challenge is that plain HTML has a linear structure, in which
> <h2> is just another block element like <para>, while DocBook has a
> nested structure where <section> contains <title> and <para> and other
> <section> elements. Converting a linear structure to a nested structure
> requires a more complex stylesheet than this one.
>
> I would suggest you try the standalone HTML-to-DocBook conversion tool
> "herold", which used to be part of the dbdoclet Java app but is now a
> standalone app. It can be downloaded from:
>
> http://www.dbdoclet.org/
>
> Bob Stayton
> Sagehill Enterprises
> bobs@sagehill.net
>
>
> ----- Original Message -----
> From: "Mathieu Malaterre" <mathieu.malaterre@gmail.com>
> To: <docbook-apps@lists.oasis-open.org>
> Sent: Tuesday, January 26, 2010 8:03 AM
> Subject: [docbook-apps] html2docbook: issue with <h2/>
>
>
>> Hi there,
>>
>> I'd like to know if anyone is using the script from the page:
>> http://wiki.docbook.org/topic/Html2DocBook
>>
>> I tried it on a very tidy example:
>>
>> $ cat input.xhtml
>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html xmlns="http://www.w3.org/1999/xhtml">
>> <body>
>> <h1>Title1</h1>
>> <p>bla 1</p>
>> <h2>Title2</h2>
>> <p>bla 2</p>
>> </body>
>> </html>
>>
>>
>> Here is what I get as output:
>>
>> $ cat output.xml
>> <?xml version="1.0"?>
>> <section>
>> <title>Title1</title>
>> <para>bla 1</para>
>> <para>bla 2</para>
>> </section>
>>
>> The title in the <h2> element is lost during the conversion.
>>
>> Any idea how to fix that?
>>
>> Thanks,
>> --
>> Mathieu

--
Mathieu
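
For reference, the linear-to-nested conversion Bob describes in the quoted message can be sketched in a few lines of Python. This is only an illustration of the idea, not the html2docbook stylesheet or herold: the function name nest_sections, the SAMPLE string, and the <article> wrapper are assumptions made for the example. It handles only headings and <p> elements that are direct children of <body>, and the sample omits the DOCTYPE line from input.xhtml.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Same body as input.xhtml from the thread, minus the DOCTYPE line.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<h1>Title1</h1>
<p>bla 1</p>
<h2>Title2</h2>
<p>bla 2</p>
</body>
</html>"""


def nest_sections(xhtml_text):
    """Turn a flat run of <h1>..<h6> and <p> elements into nested <section>s.

    Hypothetical helper: only direct children of <body> are considered,
    and inline markup inside headings and paragraphs is ignored.
    """
    body = ET.fromstring(xhtml_text).find(XHTML_NS + "body")
    root = ET.Element("article")  # assumed wrapper element
    # Stack of (heading level, element); the root acts as level 0.
    stack = [(0, root)]
    for child in body:
        tag = child.tag.replace(XHTML_NS, "")
        heading = re.fullmatch(r"h([1-6])", tag)
        if heading:
            level = int(heading.group(1))
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = child.text
            stack.append((level, section))
        elif tag == "p":
            # Paragraphs go into the innermost open section.
            ET.SubElement(stack[-1][1], "para").text = child.text
    return root


print(ET.tostring(nest_sections(SAMPLE), encoding="unicode"))
```

The stack is what turns the flat sequence into a tree: each heading closes any open sections at its own level or deeper, then opens a new <section> inside whatever remains open. With the sample above, the <h2> title ends up in a <section> nested inside the <h1> section instead of being dropped.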