OASIS Mailing List Archives

 



docbook message



Subject: RE: DOCBOOK: converting to docbook


Hi David,

You're right, it's not pretty. Actually it's quite ugly if you use Adobe's
export to RTF or export to text: you lose a lot of information. This is
why we went through the Adobe PDF API to get at the information contained
within a PDF document.

It is possible to extract information from PDF files and get useful XML
output that can be used for populating DocBook. But you have to make some
assumptions, which vary depending on the type of document you are processing
(resume, chapter, article, web page, etc.). If you know the document type,
you can write those assumptions down as sets of rules and drive the
transformations from them.
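
To illustrate the rule-set idea, here is a minimal Python sketch. It is not
the code we use and it does not touch the Adobe PDF API. It assumes the text
has already been extracted into blocks carrying a font size and a bold flag;
the Block class, the RULES table, the thresholds, and the function names are
all hypothetical, chosen only to show one rule set per document type.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Block:
    """One extracted run of text (hypothetical shape, not a real PDF API type)."""
    text: str
    font_size: float
    bold: bool = False


# One ordered rule list per document type. Each rule is (predicate, kind).
# The thresholds are illustrative assumptions, not measured values.
RULES = {
    "article": [
        (lambda b: b.font_size >= 18, "title"),
        (lambda b: b.bold and b.font_size >= 12, "section"),
        (lambda b: True, "para"),
    ],
    "chapter": [
        (lambda b: b.font_size >= 16, "title"),
        (lambda b: b.bold, "section"),
        (lambda b: True, "para"),
    ],
}


def classify(block, doc_type):
    """Return the kind named by the first rule that matches the block."""
    for predicate, kind in RULES[doc_type]:
        if predicate(block):
            return kind
    return "para"


def to_docbook(blocks, doc_type):
    """Build a small DocBook-style tree from blocks using the chosen rule set."""
    root = ET.Element(doc_type)
    current = root  # paragraphs attach to the most recent section, if any
    for block in blocks:
        kind = classify(block, doc_type)
        if kind == "title":
            ET.SubElement(root, "title").text = block.text
        elif kind == "section":
            current = ET.SubElement(root, "section")
            ET.SubElement(current, "title").text = block.text
        else:
            ET.SubElement(current, "para").text = block.text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [
        Block("Converting PDF", 20),
        Block("Background", 12, bold=True),
        Block("Some text.", 10),
    ]
    print(to_docbook(sample, "article"))
```

Handling another document type (a resume, say) would mean adding another
entry to RULES rather than changing the transformation code.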


Thanks,
-Riz


------------------------------
Riz Virk, (617) 905-3518
riz@xyztechnologies.com, riz@alum.mit.edu
http://www.xyztechnologies.com


-----Original Message-----
From: David Cramer [mailto:dcramer@broadjump.com]
Sent: Tuesday, August 13, 2002 1:22 PM
To: docbook@lists.oasis-open.org
Subject: RE: DOCBOOK: converting to docbook


As a last resort, if the source files for the PDFs aren't available,
recent versions of Acrobat can save/export to RTF and text. Not a pretty
sight though.

David

> -----Original Message-----
> From: Bob Stayton [mailto:bobs@caldera.com]
> Sent: Tuesday, August 13, 2002 11:38 AM
> To: jonathon; docbook@lists.oasis-open.org
> Subject: Re: DOCBOOK: converting to docbook

> For your PDF documents, I'd look for the source document
> that generated the PDF.  It is tough (impossible?)
> to convert PDF.
>




