
Subject: Towards a more modular ODF
From: robert_weir@us.ibm.com
To: office@lists.oasis-open.org
Date: Sat, 23 Jul 2011 13:39:07 -0400
At the recent Plugfest in Berlin, I heard a number of people lament the 
pace of standardization for ODF 1.2.  We started ODF 1.2 in 2007 and are 
just now finishing it up in 2011.  Although this is not an outlier in the 
standardization world (ISO C++ took 6 years!) I think we can agree that it 
is far from ideal. 

One idea that was brought up at the Plugfest was making ODF more modular: 
defining formal modules at the schema and specification level, and 
standardizing these modules independently, at whatever pace they 
naturally evolve.  We're partially down that road 
already with the three "parts" of ODF 1.2.  But since these are part of 
the same OASIS standard, we cannot evolve them at different paces.  The 
rigidity of this monolithic approach impacts our work in OASIS and in ISO.

The alternative would be to have a finer grained approach.  For example, 
without thinking too deeply on what the modules would be, something like 
this:

ODF Part 1 "Text"
ODF Part 2 "Drawing"
ODF Part 3 "Chart"
ODF Part 4 "OpenFormula"
ODF Part 5 "Packaging"
ODF Part 6 "Metadata"
ODF Part 7 "Word Processor Application"
ODF Part 8 "Spreadsheet Application"
ODF Part 9 "Presentation Application"

The idea would be to factor the schema such that the reusable portions, 
for example those currently used by more than one application type, are 
pulled into their own specifications, in a logical fashion.  So we would 
seek high intra-module cohesion and low inter-module coupling.

Some of this factoring could be automated by examining the schema and 
noting the dependencies.  For example if element A contains element B and 
has attribute C, then A depends on B and C.  These dependencies could be 
extracted and sorted topologically or subjected to other forms of graph 
analysis to find the optimal way to define cohesive components/modules.
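
As a rough illustration of the dependency idea above, here is a minimal 
sketch in Python.  The SCHEMA table is a made-up toy using the A/B/C 
example from the paragraph (it is not taken from the ODF schema), and the 
function names are hypothetical.  It only shows the topological-sort step; 
a real schema probably contains recursive content models, in which case 
graphlib raises CycleError and the cycles would have to be collapsed into 
components first.

```python
from graphlib import TopologicalSorter

# Hypothetical toy schema: name -> (child elements, attributes).
# Element A contains element B and has attribute C; B also has attribute C.
SCHEMA = {
    "A": (["B"], ["C"]),
    "B": ([], ["C"]),
}


def dependencies(schema):
    """Map every name to the set of names it depends on."""
    deps = {}
    for element, (children, attributes) in schema.items():
        deps.setdefault(element, set()).update(children, attributes)
        # Make sure leaf names also appear as nodes in the graph.
        for name in (*children, *attributes):
            deps.setdefault(name, set())
    return deps


def definition_order(schema):
    """Return the names ordered so that dependencies come first."""
    return list(TopologicalSorter(dependencies(schema)).static_order())


if __name__ == "__main__":
    print(definition_order(SCHEMA))
```

Names that come early in this order are the ones everything else builds 
on, so they are natural candidates for shared foundational modules.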

Another automation would be to ignore the schema and instead process a 
large collection of documents and to define the modules based on existing 
practice.  In other words, nothing in ODF 1.2 will tell us exactly what 
"features"  a word process has that are distinct from a spreadsheet.  But 
processing a few hundred sample documents will get us rather close, 
especially if it is a mix of documents from different implementations.
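
The corpus-based approach could be sketched along these lines.  The CORPUS 
data below is invented for illustration (three tiny "documents", each 
reduced to an application type and the set of element names seen in it), 
and the function names are hypothetical.  A real run would first have to 
extract the element names from the content of each document package, which 
is not shown here.

```python
from collections import defaultdict

# Invented sample data: (application type, element names seen in a document).
CORPUS = [
    ("text", {"text:p", "text:h", "draw:frame"}),
    ("spreadsheet", {"table:table", "text:p", "draw:frame"}),
    ("presentation", {"draw:page", "draw:frame", "text:p"}),
]


def elements_by_type(corpus):
    """Union of element names observed for each application type."""
    seen = defaultdict(set)
    for app_type, elements in corpus:
        seen[app_type] |= elements
    return seen


def split_shared_and_specific(corpus):
    """Separate elements used by several application types from the rest."""
    seen = elements_by_type(corpus)
    usage = defaultdict(set)  # element name -> application types using it
    for app_type, elements in seen.items():
        for name in elements:
            usage[name].add(app_type)
    shared = {name for name, types in usage.items() if len(types) > 1}
    specific = {
        app_type: {name for name in elements if len(usage[name]) == 1}
        for app_type, elements in seen.items()
    }
    return shared, specific


if __name__ == "__main__":
    shared, specific = split_shared_and_specific(CORPUS)
    print(sorted(shared))
    print({t: sorted(names) for t, names in specific.items()})
```

Elements in the shared set would be candidates for reusable modules, while 
the per-type sets hint at what belongs in an application-specific part.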

What I'm trying to avoid is the situation we have now, where good work is 
being done on change tracking, or previously on OpenFormula, but we all 
need to wait until everything is done for a big coordinated release. 
If we had finer grained modules, each its own standard, then we 
would have a lot more flexibility. 

Another benefit is that we'd make it clearer how other apps could reuse 
our modules to create new specifications.  For example, I've heard 
interest in using ODF for purposes ranging from project management 
software to mindmaps to outliners.  Having a more modular set of 
foundational specifications would allow this.

Finally, we should keep in mind that a poor decomposition, where 
everything depends on everything else, would make our work much harder. So 
the real question is: how much modularity could we achieve?  And are the 
benefits of  that level of modularity worth the effort?

I'd be interested in the TC's thoughts on this.  Is this worth aiming for? 
 Is it doable?  Or is it "boiling the ocean"? 


-Rob