unitsml message

Subject: Re: [unitsml] "Float64" in UnitsML
From: "Peter J. Linstrom" <peter.linstrom@nist.gov>
To: <unitsml@lists.oasis-open.org>
Date: Tue, 30 Sep 2008 13:42:11 -0400
All,

   This message is also intended to be part of the
asynchronous portion of the September 2008 UnitsML
TC meeting.

   Please note that this TC discussed conversion
factor representation during three meetings in 2007:
March 28, April 11, and May 9. As a result of these
discussions the current "Float64ConversionFrom" element
was adopted. Part of the rationale for the use of
xsd:double was based on input from groups that would
likely be using UnitsML. Prior to these meetings
UnitsML used a string based storage method for conversion
factors. There were several arguments in favor of the
use of double precision numbers, but as I recall
the most persuasive was that this approach would be
more developer friendly (and hence ease adoption
of UnitsML).

   I am not sure there is a strong case for
representations greater than double precision.
Conversion factors are typically of two types:

1.) Conversions set by standards or rules. These
    conversions are adopted based on conventions
    and are typically fixed decimal point numbers
    (e.g. an international foot is 0.3048 meters).
    I am not aware of any of these conversions
    which would come close to using the number
    of digits available in double precision numbers.
2.) Conversions based on experimental results.
    These conversions are for units based on physical
    properties and thus are based on the values of
    physical properties in other unit systems (e.g.
    measuring the electron volt in joules). Experiments
    which are reliable to more than a dozen digits
    are extremely rare.
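
The digit budget discussed above can be checked directly. The following is
a minimal Python sketch, not part of UnitsML; the helper name round_trips
is made up for illustration. It relies on Python's float being an IEEE-754
double and on repr() printing the shortest string that parses back to the
same double.

```python
import sys

# Number of decimal digits guaranteed to survive a round trip
# through a 64-bit float on this platform.
GUARANTEED_DIGITS = sys.float_info.dig


def round_trips(text: str) -> bool:
    """Return True if parsing `text` as a double and printing the
    shortest round-tripping form gives back exactly `text`.

    Hypothetical helper for illustration only.
    """
    return repr(float(text)) == text


# Defined conversion factors use far fewer digits than a double offers.
print(GUARANTEED_DIGITS)
print(round_trips("0.3048"))                 # international foot in meters
print(round_trips("0.45359237"))             # avoirdupois pound in kilograms
print(round_trips("0.1234567890123456789"))  # 19 digits: too many for a double
```

A factor that survives this round trip can be written to and read back from
an xsd:double without its decimal digits changing.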

   In addition I would like to note that UnitsML
was designed to support multiple semantics for
conversions; it currently supports two:
"Float64ConversionFrom" and "SpecialConversionFrom".
Addition of a new type does not require the removal
of an existing type.

Peter Linstrom

======================================================
Peter J. Linstrom
NIST, Chemical and Biochemical Reference Data Division
Phone: (301) 975-5422
======================================================

----- Original Message ----- 
From: "Martin S. Weber" <martin.weber@nist.gov>
To: <unitsml@lists.oasis-open.org>
Sent: Monday, September 29, 2008 11:55 AM
Subject: [unitsml] "Float64" in UnitsML


Heyas...

This message is intended to be part of the "asynchronous" part of the
September UnitsML discussion.

This is a suggestion to rename "Float64ConversionFrom" to either  
"ConversionFrom" or to "LinearConversionFrom" (latter preferred) and  
to change the datatypes of involved values from xsd:double to  
xsd:decimal (but see below).

Rationale:

- Using Float64 as part of the name implies the use of 64-bit floating
point numbers; to programmers it even implies that IEEE-754 is, or must
be, used with that data. While this may be the widest and most precise
floating point type commonly available in programming languages today,
even the "big" languages like Java now support "BigDecimals", that is,
numbers that can exceed not only 64 bits but, if need be, 64 megabytes
or whatever limit you set yourself. So I think that using Float64 as
part of the name gives the wrong picture of the values involved.

- What we store inside Float64ConversionFrom really is a linear
equation, whose domain and range aren't restricted to the set of
valid IEEE-754 numbers, but are actually larger than that.

- Even if we say this is a floating point conversion (or that the
results will be, or ...), what we really want to say is that real
numbers are involved here. Today's computers already offer floating
point registers wider than 64 bits (think SSE, e.g.), and it's just a
matter of time until e.g. 128-bit floats make it into the mainstream
language standards (they're there already: does "long double" ring a
bell for anybody? Quadruple precision on PowerPC e.g., 80 to 96 bits on
x86). So again the implication of the name just seems wrong to me; and
once a bigger float type comes out, will we have to rename this
element?

So in sum I think a "LinearConversionFrom" (or "ConversionFrom")  
transports better what really is described inside the element.
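
To make "linear conversion" concrete, here is a minimal Python sketch of
such an element's content evaluated with arbitrary-precision decimals
instead of 64-bit floats. The class and field names are illustrative
assumptions and are not taken from the UnitsML schema; the formula used is
y = (x + initial_addend) * multiplicand / divisor + final_addend.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LinearConversion:
    """Hypothetical container for a linear unit conversion.

    Field names are assumptions made for this sketch.
    """
    initial_addend: Decimal = Decimal(0)
    multiplicand: Decimal = Decimal(1)
    divisor: Decimal = Decimal(1)
    final_addend: Decimal = Decimal(0)

    def apply(self, x: Decimal) -> Decimal:
        # y = (x + initial_addend) * multiplicand / divisor + final_addend
        shifted = x + self.initial_addend
        return shifted * self.multiplicand / self.divisor + self.final_addend


# A pure scale factor: feet to meters.
foot_to_meter = LinearConversion(multiplicand=Decimal("0.3048"))

# A conversion that also needs an offset: degrees Fahrenheit to Celsius.
fahrenheit_to_celsius = LinearConversion(
    initial_addend=Decimal(-32),
    multiplicand=Decimal(5),
    divisor=Decimal(9),
)

print(foot_to_meter.apply(Decimal(10)))
print(fahrenheit_to_celsius.apply(Decimal(212)))
```

Nothing in the sketch depends on the width of a floating point type, which
is the point of the proposed rename.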

Rationale for preferring LinearConversionFrom over ConversionFrom:
- we're storing a linear equation in there, not any kind of equation
- Mirroring the other prefixes "Special" and "WSDL". If there exists a  
"Special" conversion from something, I'd kind of expect there to be a  
normal one, too (this might be interpreted as an argument pro no  
prefix), BUT:
- we cannot express *all* conversions within that element (see above  
[nudge Peter, we're not using MathML:)]) so I'd prefer a reminder that  
this is only a linear conversion expressed inside.

Rationale for going from xsd:double[1] to xsd:decimal[2]:

- Actually it's just the arguments continued for going from  
Float64ConversionFrom to (Linear)ConversionFrom. We shouldn't be  
restricting the possible numbers to the IEEE-754 floats, which is what  
we do right now with using xsd:double as the datatype of these values.
- I didn't check whether all values currently specified in the UnitsDB
can be precisely represented as (normalized) IEEE-754 64-bit floating
point numbers. Were I to bet blindly, I'd put my money on "no" being
the answer. Too many decimal fractions turn into repeating (periodic)
expansions in binary. This problem might not exist with existing
"BigDecimal" implementations.
- As you can see at [2], the value domain of xsd:decimal includes
every finite xsd:double (only the special values INF, -INF and NaN
have no decimal counterpart), so we're not restricting the information.
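
Whether a given decimal literal is exactly representable as a double can be
tested mechanically. The sketch below is in Python and the helper name
is_exact_double is made up for illustration; it compares the exact rational
value of the parsed double with the exact rational value of the literal. It
also shows the opposite direction: any finite double converts to a decimal
without loss.

```python
from decimal import Decimal
from fractions import Fraction


def is_exact_double(text: str) -> bool:
    """True if the decimal literal `text` is exactly representable
    as an IEEE-754 64-bit float (hypothetical helper)."""
    # Fraction(float) is the exact value stored in the double;
    # Fraction(str) is the exact value of the decimal literal.
    return Fraction(float(text)) == Fraction(text)


print(is_exact_double("0.5"))     # 1/2 is a finite binary fraction
print(is_exact_double("0.3048"))  # repeating expansion in binary
print(is_exact_double("2.54"))    # likewise

# The other direction is lossless: Decimal(float) is the exact stored value.
stored = Decimal(0.3048)
print(stored)
print(float(stored) == 0.3048)
```

Running such a check over every value in the UnitsDB would settle the bet
above; it has not been done here.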

BUT:

- xsd:decimal does not allow the 'e' notation of numbers. So even if
the value domain is wider, the allowed lexical representation is
narrower. If we want to fix that, we probably should be using an
xsd:union of xsd:decimal with a restriction of xsd:string matching
something like [+-]?[0-9]+(\.[0-9]+)?[eE][+-]?[0-9]+ . At some point
we'll probably lose the support of existing data-binding tools. If we
instead use a custom union type, the above points of course also apply
to that custom type.
- xsd:double probably has wider support in terms of data-binding
tools (but don't pin me down on that).
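
A minimal Python sketch of what a consumer of such a union type might do:
accept either a plain decimal or an 'e' notation literal and keep full
precision. The function name and the simplified plain-decimal pattern are
assumptions made for this example. The 'e' pattern follows the one given
above, with the decimal point escaped and an optional sign on the exponent.

```python
import re
from decimal import Decimal

# 'e' notation; with an unescaped "." the pattern would also accept
# junk such as "1x5e3", because "." matches any character.
SCI_NOTATION = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?[eE][+-]?[0-9]+")

# Simplified stand-in for the xsd:decimal lexical form (assumption).
PLAIN_DECIMAL = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def parse_conversion_value(text: str) -> Decimal:
    """Parse a conversion factor without going through a 64-bit float.

    Hypothetical helper; raises ValueError for anything else,
    including INF and NaN.
    """
    if SCI_NOTATION.fullmatch(text) or PLAIN_DECIMAL.fullmatch(text):
        return Decimal(text)
    raise ValueError(f"not a decimal or 'e' notation literal: {text!r}")


print(parse_conversion_value("0.3048"))
print(parse_conversion_value("1.602e-19"))
```

Data-binding tools would not generate this kind of code for a custom union
type, which is the trade-off mentioned above.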

So, what do you think?

- Martin S. Weber

[1]: http://www.w3.org/TR/xmlschema-2/#double
[2]: http://www.w3.org/TR/xmlschema-2/#decimal
