
Subject: Re: [office] Archives of ODF files?

On Monday 28 December 2015 16:30:29 Patrick Durusau wrote:
> Greetings!
> I have been extracting the content.xml files from ODF files in an
> effort to gauge the variety of the XML being produced by ODF
> applications.
> Since Google has lacked the foresight to enable limiting search
> results by ODF file types, :-(, I thought members of the TC might know
> of existing ODF file collections that I could repurpose?
> I want to avoid canned example documents because they are unlikely to
> reflect the experience in the "wild" as it were with ODF documents.

Here is a script to download many ODF files at once.
The original is here:

Despite the name it also works for odt, ods and odp.
You need to have the Perl modules LWP::Protocol::https and LWP::UserAgent installed.

To download 100 odp files about protein, run

perl downloadMSOfficeDocuments.pl 100 protein odp
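
Once the files are downloaded, the content.xml extraction mentioned in the quoted message could be done along these lines. This is only a rough sketch in Python, not part of the attached script: the function names and the output naming scheme are made up for illustration, and it assumes the downloads are ordinary zipped ODF packages (flat XML files such as .fodt are not handled). Files that turn out not to be valid ZIP archives, or that have no content.xml, are skipped.

```python
import zipfile
from pathlib import Path


def extract_content_xml(odf_path, out_dir):
    """Copy content.xml out of one ODF package.

    Returns the path of the written file, or None if the file is not a
    ZIP archive or does not contain content.xml.
    """
    odf_path = Path(odf_path)
    out_dir = Path(out_dir)
    try:
        with zipfile.ZipFile(odf_path) as package:
            data = package.read("content.xml")
    except (zipfile.BadZipFile, KeyError):
        # Search results often include error pages or truncated files.
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: keep the original name as a prefix.
    target = out_dir / (odf_path.name + ".content.xml")
    target.write_bytes(data)
    return target


def extract_all(src_dir, out_dir, extensions=(".odt", ".ods", ".odp")):
    """Extract content.xml from every matching file in src_dir."""
    results = []
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() in extensions:
            target = extract_content_xml(path, out_dir)
            if target is not None:
                results.append(target)
    return results
```

Calling extract_all("downloads", "content") would then leave one .content.xml file per usable document in the content directory; both directory names are placeholders.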


Attachment: downloadMSOfficeDocuments.pl
Description: Perl program
