
Subject: Re: [office] Proposal for Spreadsheets: New sort option "natural sort"


Peter,

we discussed this in the TC work call yesterday, and the most 
reasonable solution seems to be to provide both ways of natural sort, 
since there are use cases for both options.
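
As a rough illustration of the two options (a sketch only, not proposed
specification text), the difference can be expressed as two ways of building
a sort key. The function name natural_key and the mode values "integer" and
"float" are made up for this example, loosely following the
"natural-integer" / "natural-float" naming quoted below. The sketch assumes
"." is the decimal separator and ignores locale and case handling.

```python
import re

# Runs of digits only: "." is treated as an ordinary separator character.
_INT_CHUNKS = re.compile(r"(\d+)")
# Digits with an optional fractional part: "." is read as a decimal point.
_FLOAT_CHUNKS = re.compile(r"(\d+(?:\.\d+)?)")


def natural_key(text, mode="integer"):
    """Build a sort key that compares embedded numbers by value.

    mode="integer": "1.20" is the integers 1 and 20 separated by ".".
    mode="float":   "1.20" is the single number 1.2.
    """
    pattern = _FLOAT_CHUNKS if mode == "float" else _INT_CHUNKS
    key = []
    # re.split with one capturing group alternates text, number, text, ...
    for index, chunk in enumerate(pattern.split(text)):
        if index % 2:  # odd positions hold the numeric matches
            value = float(chunk) if mode == "float" else int(chunk)
            key.append((0, value, ""))  # numbers sort before text
        elif chunk:
            key.append((1, 0, chunk))
    return key


versions = ["1.3", "1.20", "1.20.2.2"]
as_integers = sorted(versions, key=lambda s: natural_key(s, "integer"))
as_floats = sorted(versions, key=lambda s: natural_key(s, "float"))
print(as_integers)  # ['1.3', '1.20', '1.20.2.2']
print(as_floats)    # ['1.20', '1.20.2.2', '1.3']
```

The integer variant gives the ordering people expect for version strings,
while the float variant gives the ordering expected for values such as
"AUD$1.3" and "AUD$1.21".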

I hope this helps

Michael


Martin Pool wrote:
> On  5 Feb 2007, "David A. Wheeler" <dwheeler@dwheeler.com> wrote:
>> Martin Pool:
>>>> If I'm reading this correctly, that means that "1.3" > "1.20", in a
>>>> locale where "." is the decimal separator.  In typical software version
>>>> strings that's not correct, and that was the case I was originally
>>>> trying to handle, and also apparently the case Robert Weir describes.
>>>> Obviously sometimes sorting as floats is best but I suggest that when
>>>> numbers are intermixed with non-digits the other algorithm is better.
>>>> That is, to basically follow this algorithm but just treat the decimal
>>>> separator as non-numeric.
>>
>> Michael Brauer:
>>> I think we have some kind of conflicting requirements here: You either 
>>> want to be able to sort floating point numbers. You then need to 
>>> interpret the decimal delimiter. Or you want to be able to sort version 
>>> numbers. You then must not interpret the decimal delimiter.
>>>
>>> What about resolving this conflict by having two options (or three, if 
>>> we include the default character code based sorting) instead of one, 
>>> "natural-integer" and "natural-float", where the first one sorts only 
>>> integer values, while the 2nd one sorts floats?
>> In all the "Natural sort" algorithms I've seen, the data is expected to include
>> trailing digits after a decimal point. So "1.30" > "1.20".
> 
> Sure, I can't think of why anyone would not want "1.30" > "1.20" --
> whether you treat them as plain strings, floats, or sequences of
> decimals.
> 
> The question is how "1.3" compares to "1.20" or to "1.20.2.2".
> 
> The whole point of this is not to expect humans to include leading or
> trailing zeros, because you will be disappointed.
> 
> I think it is fairly rare to have decimals embedded within an
> alphanumeric string, though I'm sure you can find examples.  It is
> common to have dots as separators.
> 
> Actually I can think of a case: "AUD$1.3" > "AUD$1.21".  In a spreadsheet of
> course you would expect them to be numbers with currency formatting, and
> this is one of the few cases where people reliably do add trailing
> zeros, so perhaps it is not very persuasive.
> 
>> If you truly want a "version sort", that's a different animal.  Perl, for example, includes separate routines for doing version sorts that are DIFFERENT from its natural sort implementation.
> 
> "Version sort" tends to imply special handling of conventions like
> "1.2beta2".
> 
> The first example I found of Perl natural sort works as my code does,
> and does not treat dots as decimals.
> 
> http://search.cpan.org/src/SALVA/Sort-Key-1.27/lib/Sort/Key/Natural.pm
> 
>> If you're doing something significantly different than the usual "natural sort", it should have a new name.
>> If we ARE going to include a "natural sort", is Michael's proposed definition the same as used elsewhere?
> 
> Googling for "natural sort" finds my code, plus adaptations of it into
> Ruby, PHP, etc., so I think this is the usual definition.  That does not
> necessarily mean it's the best term to present to users.
> 


-- 
Michael Brauer, Technical Architect Software Engineering
StarOffice/OpenOffice.org
Sun Microsystems GmbH             Nagelsweg 55
D-20097 Hamburg, Germany          michael.brauer@sun.com
http://sun.com/staroffice         +49 40 23646 500
http://blogs.sun.com/GullFOSS


