Subject: Re: [docbook-apps] Word 2007+ to DocBook
Greg,

Here are my 2 cents. I also used OO.org to convert from .doc (yes, the
old binary format) to DocBook. However, even though I submitted bug
reports with patches:

http://www.openoffice.org/issues/show_bug.cgi?id=110762
http://www.openoffice.org/issues/show_bug.cgi?id=110872

there hasn't been any change in OO.org. I am not too sure about DocBook
support in OO.org. I also used Steve Ball's roundtrip stylesheets at
some point, but again none of the bugs I reported have been fixed:

http://sourceforge.net/tracker/?limit=25&func=&group_id=21935&atid=373747&assignee=&status=&category=&artgroup=&keyword=&submitter=malat&artifact_id=&assignee=&status=&category=&artgroup=&submitter=malat&keyword=roundtrip&artifact_id=&submit=Filter

So I would not expect too much in those directions. I have heard good
things about majix. I would suggest you maintain the documentation in
DocBook and only generate (one-way) RTF for the WinWord people.

HTH

On Fri, May 7, 2010 at 3:20 PM, <gpevaco@aol.com> wrote:
> Steve:
>
> Thanks for the reply. I think I was a little unclear regarding the
> roundtrip aspect.
> I am not so interested in round trip (but it would be nice to have);
> primarily I want to go one way into word.
> However, the stylesheets I found happened to be in the
> 1.75.2/roundtrip directory.
>
> I do appreciate the info on the docx files. I had known that the
> Microsoft Word .docx format is essentially a zip file with a whole
> bunch of XML files in there, but was not aware that the main document
> content was in document.xml.
> I have looked at that file; it is still really ugly straight out of
> Word, but I am going to concentrate on refining that document down
> and see if that yields any better results. Investigation ongoing.
>
> Another option I had tried was to use OpenOffice.org Writer, as it
> has "save as docbook.xml". It seemed promising, but I am not really
> impressed with the resulting XML.
> I tried a couple of approaches. One was to save the document in Word
> as .odt, then in OpenOffice Writer save as docbook.xml.
> The other approach was to just open the Word doc in OpenOffice, let
> OO convert it, and then save as DocBook.
> That approach generated an XML document that validated, but there are
> still some things (really ugly!) that I would like to see improved.
> I wrote a few custom XSLT stylesheets to clean up some of it, and
> that is also a work in progress.
> I would love to hear some other suggestions/options if anyone has
> already gone through all this!
> If you don't mind sharing it when you do have a decent round trip
> scenario worked out, just e-mail me directly, or post a link here.
> As I said initially, I got lots of advice from old discussions here
> in the archives; who knows, perhaps this can help someone else down
> the line.
> It would be very nice to have.
>
> Thank you very much for your kind reply!
> /Greg
>
>
>
> -----Original Message-----
> From: Steve Ball <Steve.Ball@explain.com.au>
> To: gpevaco@aol.com
> Cc: docbook-apps@lists.oasis-open.org
> Sent: Thu, May 6, 2010 6:49 pm
> Subject: Re: [docbook-apps] Word 2007+ to DocBook
>
> Hi,
> The stylesheet structure was rationalised for the 1.75.2 release so
> that Word, Pages and OpenOffice formats could all be supported. There
> is a stylesheet for each of those formats that normalises the
> document to a common format, and then the other stylesheets take the
> document through to structured DocBook.
> Office 2007 basically uses WordML under-the-hood, and a .docx "file"
> is really just a Zip file containing the XML documents. The one with
> the document content is word/document.xml. It wouldn't be too much
> work to upgrade the roundtrip stylesheets to handle this document;
> basically it is just the XML Namespace URIs that have changed.
> I'm working on libxslt at the moment (implementing XSLT 2.0), so
> haven't really got time to look at the roundtripping stuff.
> However, email me directly if you have any further questions.
> Cheers,
> Steve Ball
> On 07/05/2010, at 6:29 AM, gpevaco@aol.com wrote:
>
> Howdy DocBook Community:
>
> I am new to DocBook, and also new to this forum. I have been going
> through the archives, and found some very interesting discussions.
> Primarily I am interested in converting some documents from Word,
> in which they were authored, to DocBook.
> I have been looking at several tools to help in this process, and
> found some very good information here in the archives.
> One method which seems very promising is docbook-xsl/roundtrip.
> The discussion of it was from a few years ago, so I am thinking that
> some of the style sheets may have changed in the docbook-xsl-1.75.2
> distro that I have. The suggested conversions were:
>
> wordml-normalise.xsl, wordml-sections.xsl, wordml-blocks.xsl,
> wordml-final.xsl
>
> none of which I found in 1.75.2.
> Instead I have xsl such as:
> normalise-common.xsl, normalise2sections.xsl, sections2blocks.xsl,
> and blocks2dbk.xsl
>
> It seems to me that this is just the logical evolution of the same
> xsl style sheets referenced in the archives from years ago. Does
> anyone know if this is indeed the case?
>
> Further, there has been little to no discussion, or even apparently
> any new tools, regarding converting Microsoft Word to DocBook for
> quite a while, corresponding roughly to the time when Microsoft Word
> started implementing XML, or w:xml as I like to call it. It is still
> very ugly XML, and even though the new docx format is apparently
> valid XML it is still cumbersome to work with, at least in my opinion.
> Are there any newer tools designed primarily to work with the latest
> incarnation of w:xml, or any techniques that could help the effort to
> get these docs into DocBook?
> I greatly appreciate any response!
>
> Thanks,
> /GregP

--
Mathieu
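
Steve's point that a .docx is a Zip archive whose content lives in
word/document.xml can be checked with a few lines of Python. This is
only a sketch: the function names are made up for illustration, and it
uses nothing beyond the standard library. It pulls out the main
document part and reports the namespace URI of its root element, which
is the thing Steve says changed between WordML and Office 2007.

```python
import zipfile
import xml.etree.ElementTree as ET

# Path of the main document part inside the archive, as named in the thread.
MAIN_PART = "word/document.xml"


def read_main_part(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (Zip) file."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read(MAIN_PART)


def root_namespace(xml_bytes):
    """Return the namespace URI of the root element, or '' if it has none."""
    # ElementTree spells a namespaced tag as "{uri}localname".
    tag = ET.fromstring(xml_bytes).tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
```

An Office 2003 WordML file is plain XML rather than a Zip, so
root_namespace can be applied to its bytes directly; comparing that
result with the one from a .docx should show which namespace URIs a
stylesheet would have to match.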