OASIS Mailing List Archives

docbook-apps message


Subject: DOCBOOK-APPS: RE: DOCBOOK: MS files included with elements?

(First, sorry to Norman Walsh -- this should have gone to the list,
not directly to you ;-)

/ Galen Boyer <galenboyer@yahoo.com> was heard to say:
| Oh God, I'll probably get killed for this question.
| Is there some tag which can be used to include a word doc or
| excel file or other element?

I suppose that this would be extremely difficult.  I guess that you
would want to convert the document into XML instead.  The following
may help you only if this is a one-time conversion of a Word document.

I am very new to XML/SGML and DocBook, but I have converted a Word
document of about 150 pages into XML.  I did it by exporting the
document to HTML and then doing a lot of Perl fiddling... Now I have
well-formed XML, but not DocBook markup yet.

The process was rather painful -- because I did not know about the
HTML Tidy program at the time!  (My thanks to Dave Raggett,
who wrote it, and to Jirka Kosek, who mentioned it in his book.)

So, if I were forced to do it again, I would do it this way:

  1. Export the Word document to HTML (manually).
  2. Use HTML Tidy (off line) to convert <font ...> and similar
     tags into markup that uses CSS (automatically) and to
     output the result as XML.
  3. Use ImageMagick to convert the images into the desired
     format (off line).
  4. Use some XSLT processor and write an XSL file that describes
     the conversion of that XML to DocBook XML (off line).
  5. Perl may still be needed.
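
The off-line steps 2 to 4 above could be scripted roughly as follows.
This is only a sketch, not something I have run on a real document.
It assumes the command-line tools tidy, convert (ImageMagick; newer
releases call it magick) and xsltproc are installed, and the function
name and all file names are made up for illustration.  The function
only builds the command lines; it does not execute anything.

```python
from pathlib import Path


def build_commands(html_file, images, stylesheet, image_ext=".png"):
    """Return the argument lists for steps 2-4 (nothing is executed)."""
    html = Path(html_file)
    xhtml = html.with_suffix(".xhtml")                # output of step 2
    docbook = html.with_name(html.stem + "-docbook.xml")  # output of step 4

    commands = [
        # Step 2: -clean replaces <font> and similar tags by CSS,
        # -asxml writes well-formed XHTML, -numeric emits numeric entities.
        ["tidy", "-quiet", "-clean", "-asxml", "-numeric",
         "-o", str(xhtml), str(html)],
    ]

    # Step 3: ImageMagick picks the target format from the file extension.
    for image in images:
        src = Path(image)
        commands.append(["convert", str(src), str(src.with_suffix(image_ext))])

    # Step 4: apply the (still to be written) stylesheet to the XHTML.
    commands.append(["xsltproc", "-o", str(docbook), str(stylesheet), str(xhtml)])
    return commands


if __name__ == "__main__":
    for cmd in build_commands("report.html", ["fig1.gif"], "html2docbook.xsl"):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run() one after another.  Note
that Tidy exits with status 1 when it only issued warnings, so a
non-zero status from step 2 is not necessarily a failure.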

Well, I never did the fourth step (being very new to XSL), nor do I
know whether this is the best approach.  I guess that there could be
some easier way.  Anyway, I think that "Word to HTML" is the first step
to take, and I do not think that it can be done off line.
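
To give an idea of what the XSL file in step 4 would have to express,
here is a minimal sketch of the element mapping, written in Python
with the standard xml.etree module rather than in XSLT.  The mapping
table is hypothetical and far from complete, attributes are simply
dropped, and real Word output would need many more rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny mapping of XHTML names to DocBook names.
TAG_MAP = {
    "body": "section",
    "h1": "title",
    "p": "para",
    "em": "emphasis",
    "ul": "itemizedlist",
    "ol": "orderedlist",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def to_docbook(node):
    """Copy an XHTML element tree, renaming elements according to TAG_MAP."""
    name = local_name(node.tag)
    new = ET.Element(TAG_MAP.get(name, name))  # unknown names pass through
    new.text = node.text
    new.tail = node.tail
    for child in node:
        if local_name(child.tag) == "li":
            # DocBook list items hold block content: <listitem><para>...
            item = ET.SubElement(new, "listitem")
            item.tail = child.tail
            para = to_docbook(child)
            para.tag = "para"
            para.tail = None
            item.append(para)
        else:
            new.append(to_docbook(child))
    return new


def convert(xhtml_text):
    """Parse an XHTML fragment and return the DocBook-like result as text."""
    return ET.tostring(to_docbook(ET.fromstring(xhtml_text)), encoding="unicode")
```

Calling convert() on the <body> element of the Tidy output returns a
<section> with <title>, <para> and list elements.  An XSLT stylesheet
would state the same rules as one template per element.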

Any comments?  (I want to learn something better ;-)


Petr Prikryl, SKIL, spol. s r.o., prikrylp@skil.cz
