
office message



Subject: Re: [office] Archives of ODF files?


On Monday 28 December 2015 16:30:29 Patrick Durusau wrote:
> Greetings!
> 
> I have been extracting the content.xml files from ODF files in an
> effort to gauge the variety of the XML being produced by ODF
> applications.
> 
> Since Google has lacked the foresight to enable limiting search
> results by ODF file types, :-(, I thought members of the TC might know
> of existing ODF file collections that I could repurpose?
> 
> I want to avoid canned example documents because they are unlikely to
> reflect the experience in the "wild" as it were with ODF documents.
> 

Here is a script to download many ODF files at once.
The original is here:
  https://quickgit.kde.org/?p=calligra.git&a=tree&f=devtools%2Fscripts

Despite the name it also works for odt, ods and odp.
You need to have the Perl modules LWP::Protocol::https and LWP::UserAgent
installed.

To download 100 odp files about protein, run

perl downloadMSOfficeDocuments.pl 100 protein odp
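
Once the files are downloaded, the content.xml extraction mentioned in the
original question could be done along the following lines. This is only a
rough sketch in Python, not part of the attached script. It relies on ODF
files being zip packages with a content.xml entry; the function names, the
directory arguments and the ".content.xml" output naming are made up for
illustration. Downloads that turn out not to be valid zip packages are skipped.

```python
import sys
import zipfile
from pathlib import Path

# File extensions treated as ODF packages (assumption: only these three).
ODF_SUFFIXES = {".odt", ".ods", ".odp"}


def extract_content_xml(odf_path, out_dir):
    """Copy content.xml out of one ODF package.

    Returns the path written, or None if the file is not a zip
    package or has no content.xml entry.
    """
    odf_path = Path(odf_path)
    out_dir = Path(out_dir)
    try:
        with zipfile.ZipFile(odf_path) as package:
            data = package.read("content.xml")
    except (zipfile.BadZipFile, KeyError):
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <original file name>.content.xml
    target = out_dir / (odf_path.name + ".content.xml")
    target.write_bytes(data)
    return target


def extract_all(src_dir, out_dir):
    """Extract content.xml from every ODF file directly inside src_dir."""
    written = []
    for path in sorted(Path(src_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in ODF_SUFFIXES:
            target = extract_content_xml(path, out_dir)
            if target is not None:
                written.append(target)
    return written


if __name__ == "__main__":
    # Usage: python extract_content.py <download dir> <output dir>
    if len(sys.argv) == 3:
        for target in extract_all(sys.argv[1], sys.argv[2]):
            print(target)
```

Keeping the original file name in the output name makes it easy to trace
each content.xml back to the document it came from.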

Cheers,
Jos

Attachment: downloadMSOfficeDocuments.pl
Description: Perl program


