OASIS Mailing List Archives

office-comment message



Subject: Re: [office-comment] Implicit conversion and medical data


Patrick Durusau:
> Your mentioning of implicit conversion reminded me of a use case where 
> conversions resulted in *unrecoverable* changes to gene names.
> 
> Or, in the summary by the authors:
> 
> > A little detective work traced the problem to default date format 
> > conversions and floating-point format conversions in the very useful 
> > Excel program package. The date conversions affect at least 30 gene 
> > names; the floating-point conversions affect at least 2,000 if Riken 
> > identifiers are included. These conversions are irreversible; the 
> > original gene names cannot be recovered.
> You can see the whole article at: 
> http://www.biomedcentral.com/1471-2105/5/80
> 
> Is this an application level issue, i.e., should come with default 
> conversions disabled, or something that the standard should address?

It doesn't have anything to do with Formulas. It really doesn't have
anything to do with the OpenDocument format, either.

You can see the basic issue here:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q214233

Here's the issue:
* When you type text into Excel, it tries to figure out automatically what
   kind of data it is (date, number, ordinary string, etc.). By itself, that's fine;
   there are ways to override that when you type data directly into Excel, or
   when you manually invoke an "Import" of data (by identifying specific columns
   as "text").  BUT....
* If a second program invokes Excel and gives it an "import" command,
   telling Excel to read in data from other data formats, Excel applies
   those same automatic conversions, and there's no easy way to disable them.
   The result is corrupted data.
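
To make the two failure modes concrete, here is a rough sketch that flags
identifiers likely to be misread as dates or floating-point numbers. The
patterns and the name at_risk are my own assumptions for illustration;
Excel's real detection rules are broader and depend on locale, so treat
this as a heuristic, not a description of what Excel does.

```python
import re

# Hypothetical patterns, not Excel's actual rules: a month name or
# abbreviation followed by one or two digits, or digits-E-digits.
MONTHS = "JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC"
DATE_LIKE = re.compile(rf"^(?:{MONTHS})-?\d{{1,2}}$", re.IGNORECASE)
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def at_risk(value: str) -> bool:
    """Return True if value resembles a date or a number in E notation."""
    v = value.strip()
    return bool(DATE_LIKE.match(v) or FLOAT_LIKE.match(v))
```

With these patterns, names shaped like SEPT2 or DEC1 and identifiers shaped
like 2310009E13 are flagged, while a name such as TP53 is not.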

The only reasonable solution is to pre-process the data BEFORE it's loaded into
Excel (e.g., by inserting a space before the data).
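
A minimal sketch of that kind of pre-processing, assuming the input is
tab-delimited text and that you know which columns hold identifiers. The
function name protect_columns and its parameters are made up for this
example, and I have not verified how every Excel version treats a leading
space on import.

```python
import csv
import io


def protect_columns(text: str, columns: set[int], delimiter: str = "\t") -> str:
    """Prefix a space to each non-empty field in the given 0-based columns."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Leave other columns untouched so real numbers still import as numbers.
        writer.writerow(
            [" " + f if i in columns and f else f for i, f in enumerate(row)]
        )
    return out.getvalue()
```

For example, protect_columns("SEPT2\t1.5\n", {0}) returns " SEPT2\t1.5\n".
Whatever reads the data back afterwards has to strip the added space again.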

I don't see how OpenDocument's spec can control the functionality of an
app when reading in _NON_ OpenDocument files.  We could give a warning
about the issue somewhere, but it's really about the functionality of loading in
NON-OpenDocument files.

--- David A. Wheeler

