OASIS Mailing List Archives

 




Subject: Re: [oic] xml element/attribute coverage/analysis


"Hanssens Bart" <Bart.Hanssens@fedict.be> wrote on 02/27/2009 04:02:56 AM:

>
> 
> 
> 1) a tool that would process a directory of ODF-documents, counting the
> elements, attributes and attribute values being used in these documents
> 
> Using a high-level ODF library like jOpenDocument, it should be fairly
> easy to generate a report, for example in ODS, saying something like
> 
> * text:p, used 716547 times, found in 685 documents (out of 687 docs)
> * meta:user-defined, used 53 times, found in 43 documents (out of 687)
> 


I have some Python code that I should be able to adapt to do this.  But 
note that there are probably two metrics of interest:

1) Which ODF "features" (elements/attributes/attribute values) are used 
most frequently by raw count?
2) Which features are used in the most documents?

It is possible that a particular element is used only once per document, 
but is used in every ODF document.  If so it is important, since everyone 
will need to understand it.  In other cases, a particular element will be 
repeated a thousand times when it is used, but is used only in rare 
documents.  So there are two different kinds of "popularity" here.
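
To make the two metrics concrete, here is a minimal sketch of how both could be collected in one pass. It is not the script mentioned above. It assumes each ODF document is a zip package containing XML streams such as content.xml, and the function names (count_features, survey) are made up for illustration. Names come back in ElementTree's "{namespace-uri}local-name" form rather than as prefixed names like text:p, and attribute values are left out for brevity.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# XML streams commonly present in an ODF package (assumed list)
ODF_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def count_features(odf_path):
    """Return a Counter of element and attribute names in one ODF package."""
    counts = Counter()
    with zipfile.ZipFile(odf_path) as package:
        names = set(package.namelist())
        for part in ODF_PARTS:
            if part not in names:
                continue
            with package.open(part) as stream:
                # Names are reported as "{namespace-uri}local-name"
                for _, elem in ET.iterparse(stream, events=("end",)):
                    counts[("element", elem.tag)] += 1
                    for attr in elem.attrib:
                        counts[("attribute", attr)] += 1
                    elem.clear()  # release children that were already counted
    return counts


def survey(directory):
    """Aggregate both metrics over every *.od? file under a directory."""
    raw_counts = Counter()  # metric 1: total number of occurrences
    doc_counts = Counter()  # metric 2: number of documents using the feature
    n_docs = 0
    for path in sorted(Path(directory).rglob("*.od?")):
        try:
            counts = count_features(path)
        except (zipfile.BadZipFile, ET.ParseError):
            continue  # skip files that are not readable ODF packages
        n_docs += 1
        raw_counts.update(counts)         # adds the per-document totals
        doc_counts.update(counts.keys())  # adds 1 per feature per document
    return raw_counts, doc_counts, n_docs
```

Keeping two separate counters is the whole point: an element that appears once in every document scores low in raw_counts but high in doc_counts, and the reverse holds for an element repeated a thousand times in a handful of documents.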

Also, does anyone have a large collection of ODF documents that could be 
tested in this way?  I suppose, even if you had such documents 
internally, you could use the Python script to do the study and just upload 
the results.  This is easier than uploading the documents themselves.
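
If only results are shared, a small aggregate table is all that has to be uploaded. A rough sketch of writing one as CSV follows. It assumes the counts have already been gathered into two dictionaries keyed by feature name; write_report and the column names are hypothetical, and the sample figures are the illustrative ones from the quoted message, not real measurements.

```python
import csv
import io


def write_report(raw_counts, doc_counts, n_docs, out):
    """Write one CSV row per feature, most widely used first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["feature", "times_used", "docs_using", "docs_total"])
    # Sort by document coverage, then raw count, then name for stable output
    ordered = sorted(raw_counts, key=lambda f: (-doc_counts[f], -raw_counts[f], f))
    for feature in ordered:
        writer.writerow([feature, raw_counts[feature], doc_counts[feature], n_docs])


# Illustrative figures taken from the quoted message
buffer = io.StringIO()
write_report(
    {"text:p": 716547, "meta:user-defined": 53},
    {"text:p": 685, "meta:user-defined": 43},
    687,
    buffer,
)
print(buffer.getvalue())
```

A table like this contains no document text, only feature names and counts, so it can be shared even when the documents themselves cannot.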

-Rob

