Subject: Re: [opendocument-users] extracting the text from an opendocument file
Vincenzo Morgante <enzom83@yahoo.it> wrote on 01/08/2010 06:39:15 PM:

> Hi,
> I'm developing a Java class which has to be able to read an
> OpenDocument text file (with odt extension) in order to extract all
> the text contained in it.
> Some years ago I made a VB.NET library following the OpenDocument 1.0
> specification. This library still works fine, but I'd like to
> be sure that there are no substantial changes in the newer versions of
> the standard (1.1 and 1.2).
> Could I follow the old OpenDocument 1.0 specification without any
> problems, or would it be expedient to follow the newer specifications?
> In other words, if I follow the old OpenDocument 1.0 specification,
> could I run into problems reading a file of the newer versions
> with regard to the text extraction?
>

It depends on what you mean by "the text" in a document. Although the general text model remains the same through ODF 1.1 and 1.2, there are some enhancements. For example, ODF 1.1 gives the ability to add an alternative text description to an OLE embedding. Similarly, ODF 1.2 enhances the metadata model. I'm not sure if these are relevant to your text extraction problem, but those are the kinds of changes I would expect.

In any case, if you look at the back of the ODF 1.1 spec, and the end of the draft of ODF 1.2 Part 1, you'll see a summary list of changes in each revision. You can decide based on that whether any are substantial for your purposes.

Regards,

-Rob
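
For the extraction itself, here is a minimal sketch in Java of reading paragraph and heading text from the `content.xml` stream of an .odt package. The class and method names (`OdtTextExtractor`, `extractText`, `extractFromContentXml`) are illustrative and do not come from this thread. The sketch assumes an unencrypted package and Java 9 or later (for `readAllBytes`). It keeps only character data inside `text:p` and `text:h` elements, and it ignores `text:s`, `text:tab`, `text:line-break`, metadata, and alternative text descriptions, so it covers only the common text model and none of the enhancements mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of any ODF library.
final class OdtTextExtractor {
    // Namespace URIs used by ODF for the text: and office: prefixes.
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";

    // Reads an .odt package (a ZIP archive) and extracts text from its content.xml entry.
    static String extractText(InputStream odt) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(odt)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("content.xml".equals(entry.getName())) {
                    // Buffer the entry so the XML parser cannot close the ZIP stream.
                    byte[] xml = zip.readAllBytes();
                    return extractFromContentXml(new ByteArrayInputStream(xml));
                }
            }
        }
        throw new IOException("content.xml not found in package");
    }

    // Parses content.xml and collects the text below every office:body element.
    static String extractFromContentXml(InputStream contentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // match on namespace URI, not on prefix
        Document doc = factory.newDocumentBuilder().parse(contentXml);
        NodeList bodies = doc.getElementsByTagNameNS(OFFICE_NS, "body");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bodies.getLength(); i++) {
            collect(bodies.item(i), out, false);
        }
        return out.toString();
    }

    // Depth-first walk: append text nodes found inside text:p or text:h,
    // and end each paragraph or heading with a newline.
    private static void collect(Node node, StringBuilder out, boolean inParagraph) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (inParagraph) {
                    out.append(child.getNodeValue());
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                boolean isParagraph = TEXT_NS.equals(child.getNamespaceURI())
                        && ("p".equals(child.getLocalName()) || "h".equals(child.getLocalName()));
                collect(child, out, inParagraph || isParagraph);
                if (isParagraph) {
                    out.append('\n');
                }
            }
        }
    }
}
```

Because elements are matched by namespace URI and local name rather than by prefix, the same walk applies to any file that uses these `text:p` and `text:h` elements. Whether it captures everything you consider "the text" still depends on the change lists in the specifications.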