OASIS Mailing List Archives

 



docbook-apps message



Subject: Re: [docbook-apps] [SPAM] What is best practize to convert MS Word into PDF through DocBook XSL?


If the Word document is not too fancy, I sometimes have success saving it 
as HTML in Word and then using dbdoclet to convert the HTML to DocBook. 
You can get dbdoclet from here:

http://www.dbdoclet.org/

Using it is a two-step process:

1.  In Word, use Save As and select Web Page, Filtered as the type.

2.  Run dbdoclet like this:

java -jar <path-to-dbdoclet>/doclet/lib/html2db.jar -i input.html -o output.xml
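
If you have many files to convert, the command above can be scripted. The following is only a minimal sketch in Python, assuming java is on your PATH and that html2db.jar is run once per input file with the -i and -o options shown above. The function names and the dbdoclet_home argument are made up for illustration and are not part of dbdoclet.

```python
import subprocess
from pathlib import Path


def build_html2db_command(dbdoclet_home, html_file, xml_file):
    """Return the argument list for the html2db.jar call shown above.

    dbdoclet_home stands in for <path-to-dbdoclet>; it is an assumption
    about where dbdoclet was unpacked on your machine.
    """
    jar = Path(dbdoclet_home) / "doclet" / "lib" / "html2db.jar"
    return ["java", "-jar", str(jar), "-i", str(html_file), "-o", str(xml_file)]


def convert_directory(dbdoclet_home, html_dir, run=subprocess.run):
    """Convert every *.html file in html_dir, writing a .xml file next to it.

    The run parameter exists only so the external call can be replaced
    when trying the script without dbdoclet installed.
    """
    outputs = []
    for html_file in sorted(Path(html_dir).glob("*.html")):
        xml_file = html_file.with_suffix(".xml")
        # check=True raises CalledProcessError if java exits with an error.
        run(build_html2db_command(dbdoclet_home, html_file, xml_file), check=True)
        outputs.append(xml_file)
    return outputs
```

Calling convert_directory("/opt/dbdoclet", "word-export") would then run the converter once for each filtered HTML file that Word saved into the word-export directory; both paths are placeholders.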

Any time you are converting from Word, expect to do some cleanup.  Word is 
totally unstructured, and converters have to infer structure from the 
typography.
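
To get a rough idea of how much cleanup a converted file needs, you can count which DocBook elements the converter actually produced. This is just a sketch using Python's standard xml.etree.ElementTree module; the function names are made up for illustration, and it assumes the output is well-formed XML without undefined entities. A file in which nearly everything is a para element probably lost most of its structure.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_counts(docbook_xml):
    """Count element names in a DocBook document passed as a string."""
    root = ET.fromstring(docbook_xml)
    # Drop a namespace prefix of the form {uri}name, if the file uses one.
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())


def para_share(docbook_xml):
    """Return the fraction of all elements that are <para> elements."""
    counts = element_counts(docbook_xml)
    return counts["para"] / sum(counts.values())


sample = "<article><title>T</title><para>a</para><para>b</para></article>"
print(element_counts(sample).most_common())
print(para_share(sample))
```

For a real file, read it first (for example with open("output.xml", encoding="utf-8").read()) and pass the text to the same functions.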

Bob Stayton
Sagehill Enterprises
DocBook Consulting
bobs@sagehill.net


----- Original Message ----- 
From: "Darya Said-Akbari" <darya_akbari@yahoo.com>
To: <docbook-apps@lists.oasis-open.org>
Sent: Monday, August 28, 2006 12:39 AM
Subject: [docbook-apps] [SPAM] Re: [docbook-apps] [SPAM] What is best 
practize to convert MS Word into PDF through DocBook XSL?


> Hi,
>
> I took a look at AbiWord. Unfortunately the export
> from MS Word to AbiWord is not usable (at least not
> for my corporate documents).
>
> Any more hints on how to fill the gap between MS Word and
> DocBook XML?
>
> Regards,
> Darya
>
> --- Andreas Reuleaux <reuleaux@web.de> wrote:
>
>> On Fri, Aug 25, 2006 at 05:11:35PM +0200, Darya Said-Akbari wrote:
>> > Hi,
>> >
>> > I have spent the last few days experimenting with DocBook XML and
>> > XSL, and I must say that with Bob Stayton's guide (I have ordered
>> > a copy for our company) it is really fun.
>> >
>> > However, I must admit that I created clean DocBook XML from Norm
>> > Walsh's examples, and DocBook XML and DocBook XSL work wonderfully
>> > with each other.
>> >
>> > But reality looks different. You don't have that superb DocBook
>> > XML document at first. What you have instead is, e.g., an MS Word
>> > document which you first need to transform into a DocBook XML
>> > document.
>> >
>> > And this is a challenging task. I tried OpenOffice 2.0.3 Writer
>> > (OOo) as my front end, which allows MS Word documents to be
>> > exported to DocBook XML. The result, however, is very poor.
>> > Extracting graphics was a problem. But a far bigger problem with
>> > OOo (and I haven't solved that one yet) is that it uses only 10%
>> > of the DocBook XML elements. Everything goes into <para>, and
>> > hence I can't use all the DocBook XSL goodies.
>> >
>> > Has anyone else had the same experience? How do you get around
>> > it? Is there another way to fill the gap between MS Word and
>> > DocBook XML instead of using OOo?
>> >
>> > Of course only a non-commercial solution is acceptable :)
>>
>> AbiWord can read MS Word documents and has a DocBook export filter.
>> According to the release notes of the newest version (v2.4.5), this
>> filter has been completely rewritten
>> (http://www.abiword.com/release-notes/2.4.5.phtml):
>>
>>   The changes from v2.4.4 to v2.4.5 include, amongst others:
>>
>>   * Almost completely rewrote the DocBook export filter, gaining
>>     substantial more functionality in the process
>>
>> I have given it a try on Debian sid (this filter is in the
>> abiword-plugins package):
>>
>>   # apt-get install abiword abiword-plugins
>>
>> When choosing "Save As" I get the option to save my document as
>> DocBook. I can't comment on the quality of the export output, though
>> (especially in comparison to OOo), as I haven't used it for further
>> processing yet.
>>
>> Hope this helps.
>>
>> -Andreas
>>
>> >
>> > Regards,
>> > Darya
>> >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org