OASIS Mailing List Archives

office-comment message


Subject: Re: [office-comment] Implicit conversion and medical data


David A. Wheeler wrote:
> Patrick Durusau:
>> Your mention of implicit conversion reminded me of a use case where 
>> conversions resulted in *unrecoverable* changes to gene names.
>> Or, in the summary by the authors:
>>> A little detective work traced the problem to default date format 
>>> conversions and floating-point format conversions in the very useful 
>>> Excel program package. The date conversions affect at least 30 gene 
>>> names; the floating-point conversions affect at least 2,000 if Riken 
>>> identifiers are included. These conversions are irreversible; the 
>>> original gene names cannot be recovered.
>> You can see the whole article at: 
>> http://www.biomedcentral.com/1471-2105/5/80
>> Is this an application-level issue, i.e., should applications ship with 
>> default conversions disabled, or something that the standard should address?
> It doesn't have anything to do with Formulas. It really doesn't have
> anything to do with the OpenDocument format, either.
Well, but so far as I can tell, nothing in OpenDocument prohibits 
someone from constructing an application that behaves as you describe 
even though it uses the OpenDocument format. Yes?
> You can see the basic issue here:
> http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q214233
> Here's the issue:
> * When you type text into Excel, it tries to automatically figure out what
>    kind of data it is (date, number, ordinary string, etc.). By itself, that's fine;
>    there are ways to override that when you type in data directly into Excel, or
>    when you manually invoke an "Import" of data (by identifying specific columns
>    as "text").  BUT....
> * If a second program invokes Excel and gives it an "import" command,
>    telling Excel to read in data from other data formats, Excel applies
>    those same automatic conversions, and there's no easy way to disable them.
>    The result is corrupted data.
> The only reasonable solution is to pre-process the data BEFORE it's loaded into
> Excel (e.g., by inserting a space before the data).
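A minimal sketch of the pre-processing workaround described above, assuming the data arrives as CSV text. The function names (`at_risk`, `protect`, `protect_csv`) and the two patterns are hypothetical and far from exhaustive; they only illustrate flagging values that look like dates or floating-point numbers and prefixing them with a space, as suggested, before a spreadsheet import sees them. Whether a leading space suppresses conversion in a given application version is not verified here.

```python
import csv
import io
import re

# Hypothetical, non-exhaustive patterns for values a spreadsheet may reinterpret.
# Month-like gene names (e.g. SEPT2, DEC1) can be read as dates.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MARCH|MAR|APR|MAY|JUN|JUL|AUG|SEPT|SEP|OCT|NOV|DEC)-?\d{1,2}$",
    re.IGNORECASE,
)
# Identifiers of the form digits-E-digits (e.g. 2310009E13) can be read as floats.
FLOAT_LIKE = re.compile(r"^\d+(\.\d+)?E[+-]?\d+$", re.IGNORECASE)


def at_risk(value):
    """Return True if the value resembles a date or a floating-point number."""
    return bool(DATE_LIKE.match(value) or FLOAT_LIKE.match(value))


def protect(value):
    """Prefix at-risk values with a space; leave all other values unchanged."""
    return " " + value if at_risk(value) else value


def protect_csv(text):
    """Apply protect() to every cell of a CSV document given as a string."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow([protect(cell) for cell in row])
    return out.getvalue()
```

For example, `protect_csv("gene,id\nSEPT2,2310009E13\n")` leaves the header row alone and prefixes both data cells with a space. Any program that later reads the file back must strip that space to recover the original names.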
> I don't see how OpenDocument's spec can control the functionality of an
> app when reading in _NON_ OpenDocument files.  We could give a warning
> about the issue somewhere, but it's really about the functionality of loading in
> NON-OpenDocument files.
I think a warning would be a good idea; whether it belongs in the 
standard is another issue. I would assume data corruption would be of 
interest to the soon-to-be-formed implementation/interoperability TC.

Hope you are having a great day!

> --- David A. Wheeler

Patrick Durusau
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
