OASIS Mailing List Archives

docbook-apps message



Subject: Re: [docbook-apps] html2docbook: issue with <h2/>


Hi Bob,

  Thanks for the info. Indeed, herold seems at first to give
slightly better results (at least it parses <h2> as I expected).
However, it turns all links into hrefs, instead of doing what I think
is smarter:

"Absolute links (starting with "http://") remain absolute and become
<ulink>s. Other links become <xref>s."
Ref: http://wiki.docbook.org/topic/Html2DocBook#head-f915b9937f1226e0abb325e6a6335e12d20be0c4
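To make the rule quoted above concrete, here is a rough Python sketch of
it. This is not the actual html2docbook stylesheet (which is XSLT): the
function name convert_link is hypothetical, treating "https://" like
"http://" is my own assumption, and so is taking the part after "#" as
the <xref> linkend.

```python
import xml.etree.ElementTree as ET


def convert_link(a):
    """Map an HTML <a> element to a DocBook <ulink> or <xref> (sketch)."""
    href = a.get("href", "")
    # Absolute links stay absolute and become <ulink>s.
    # Assumption: "https://" is treated the same way as "http://".
    if href.startswith(("http://", "https://")):
        ulink = ET.Element("ulink", {"url": href})
        ulink.text = a.text
        return ulink
    # Other links become <xref>s.
    # Assumption: the text after the last "#" (or the whole href when
    # there is no "#") is used as the linkend.
    linkend = href.rsplit("#", 1)[-1]
    return ET.Element("xref", {"linkend": linkend})


absolute = ET.fromstring('<a href="http://www.dbdoclet.org/">herold</a>')
internal = ET.fromstring('<a href="#intro">Introduction</a>')
print(ET.tostring(convert_link(absolute), encoding="unicode"))
print(ET.tostring(convert_link(internal), encoding="unicode"))
```

A real converter would also have to check that the linkend matches an id
that exists in the generated document.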

There does not seem to be a single best solution here. But the original
html2docbook is open source, so I can always fix it.
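
For the record, the kind of fix I have in mind has to turn the flat run
of headings and paragraphs into nested sections. A minimal sketch in
Python, assuming only <h1>..<h6> and <p> children of <body>; the name
nest_sections and the "root" wrapper element are hypothetical, and the
real fix would have to be done in the XSLT stylesheet itself:

```python
import re
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def nest_sections(body):
    """Turn the flat children of an XHTML <body> into nested <section>s."""
    root = ET.Element("root")   # hypothetical wrapper for top-level sections
    stack = [(0, root)]         # (heading level, element) of each open section
    for child in body:
        tag = child.tag.replace(XHTML, "")
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            level = int(match.group(1))
            # A heading closes every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = child.text
            stack.append((level, section))
        elif tag == "p":
            # Paragraphs go into the innermost open section.
            ET.SubElement(stack[-1][1], "para").text = child.text
    return root


body = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    "<h1>Title1</h1><p>bla 1</p><h2>Title2</h2><p>bla 2</p></body>"
)
print(ET.tostring(nest_sections(body)[0], encoding="unicode"))
```

With the input from my first mail, the <h2> title ends up in an inner
<section> inside the "Title1" section instead of being dropped.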

Thanks

On Wed, Jan 27, 2010 at 6:39 PM, Bob Stayton <bobs@sagehill.net> wrote:
> Looking at the stylesheet, it appears to not be able to generate nested
> sections from <h1>, <h2>, etc. headings.  It only generates a single
> <section> element for the entire HTML file.
>
> The basic challenge is that plain HTML has a linear structure, in which
> <h2> is just another block element like <para>, while DocBook has a
> nested structure where <section> contains <title> and <para> and other
> <section> elements.  Converting a linear structure to a nested structure
> requires a more complex stylesheet than this one.
>
> I would suggest you try the standalone HTML-to-DocBook conversion tool
> "herold", which used to be part of the dbdoclet Java app but is now a
> standalone app.  It can be downloaded from:
>
> http://www.dbdoclet.org/
>
> Bob Stayton
> Sagehill Enterprises
> bobs@sagehill.net
>
>
> ----- Original Message ----- From: "Mathieu Malaterre"
> <mathieu.malaterre@gmail.com>
> To: <docbook-apps@lists.oasis-open.org>
> Sent: Tuesday, January 26, 2010 8:03 AM
> Subject: [docbook-apps] html2docbook: issue with <h2/>
>
>
>> Hi there,
>>
>>  I'd like to know if anyone is using the script from the page:
>> http://wiki.docbook.org/topic/Html2DocBook
>>
>>  I tried on a very tidy example:
>>
>> $ cat input.xhtml
>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>> <html xmlns="http://www.w3.org/1999/xhtml">
>> <body>
>> <h1>Title1</h1>
>> <p>bla 1</p>
>> <h2>Title2</h2>
>> <p>bla 2</p>
>> </body>
>> </html>
>>
>>
>> Here is what I get as output:
>>
>> $ cat output.xml
>> <?xml version="1.0"?>
>> <section>
>>  <title>Title1</title>
>>  <para>bla 1</para>
>>  <para>bla 2</para>
>> </section>
>>
>> The title in <h2> element is lost during the conversion.
>>
>> Any idea on how to fix that?
>>
>> Thanks,
>> --
>> Mathieu
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
>> For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org
>>
>>
>>
>
>



-- 
Mathieu

