docbook-apps message


Subject: Re: [docbook-apps]PDF downconversion to docBook XML

Hi Kurt,

  From my very limited experience, I found that KWord did a pretty good
job of importing PDF. I also used OpenOffice to write out -poor-
  You should be able to import your PDF file directly into KWord and
write out an (X)HTML file. Be aware that all your formatting will be
lost (no more titles, sections...).

  I used the following script to convert HTML to DocBook:


  But in my case, my input HTML was -somewhat- organized.
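
  For illustration only, and not the script referred to above: a minimal
Python sketch of what an HTML-to-DocBook step can look like when the input
is reasonably organized. It assumes headings (h1 to h6) mark the section
hierarchy and that body text sits in p elements; everything else is
ignored. The names HtmlToDocBook and html_to_docbook are hypothetical, and
the output has no namespace or DOCTYPE and is not validated against the
DocBook schema.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HtmlToDocBook(HTMLParser):
    """Map headings to nested <section>s and <p> to <para>; drop other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("article")
        # Stack of (heading level, element); the article counts as level 0.
        self.stack = [(0, self.root)]
        self.current = None  # block tag whose text is being collected
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS or tag == "p":
            self._flush()  # tolerate a previous block that was never closed
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            self._flush()

    def _flush(self):
        tag, self.current = self.current, None
        if tag is None:
            return
        # Collapse runs of whitespace left over from the HTML layout.
        text = " ".join("".join(self.buffer).split())
        if tag == "p":
            if text:
                ET.SubElement(self.stack[-1][1], "para").text = text
            return
        level = HEADINGS[tag]
        # Leave sections at the same or a deeper level before opening a new one.
        while self.stack[-1][0] >= level:
            self.stack.pop()
        section = ET.SubElement(self.stack[-1][1], "section")
        ET.SubElement(section, "title").text = text
        self.stack.append((level, section))


def html_to_docbook(html):
    """Return a DocBook-like <article> string for the given HTML text."""
    parser = HtmlToDocBook()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit a trailing block that had no closing tag
    return ET.tostring(parser.root, encoding="unicode")
```

  Calling html_to_docbook() on the text of an exported (X)HTML file gives
one article element with a nested section per heading. HTML that has lost
its headings in the PDF import would need them restored by hand first,
since the nesting depends entirely on them.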

Good luck

On Fri, Jun 11, 2010 at 11:31 PM, Kurt A Richardson
<kurt@iscepublishing.com> wrote:
> Hi list
> I am new to DocBook, and XML-based publishing in general.   I run a small
> publishing company (30 titles), that specializes in complexity theory and I
> have been looking for ways to not only improve my little doc flow
> methodology, but also make our content available to our readers in a variety
> of new modes and formats.  I have been drawn to DocBook and the possibility
> of using XSLT as a means to realize these goals.  I have little trouble
> figuring out how to prepare new content and am hoping to produce our next
> two titles purely from DocBook XML.  However, I also have about 6000 pages
> of PDFs (not all having the same format) that I'd like to 'down convert' to
> DocBook XML.  I am making SLOW progress and wondered if anyone here had any
> bright ideas about how to approach this task... e.g., is PDF to html the
> best first step?  Or does anyone know of any affordable services that
> would do the down conversion for me?
> Many thanks in advance for any guidance you can provide.
> I'm really rather excited about the possibilities that arise once I move our
> publishing from Adobe CS to XML-based!
> Kind regards, Kurt

