OASIS Mailing List Archives

office-formula message



Subject: Re: Very Weak String Support in ODF


Leonard Mada posted the following in the public comments section:
>The current Formula Specification and OOo implementation thereof have very weak support for strings. Even fundamental string functions are lacking. This makes the use of ODF a poor choice for many research fields, where a lot of the data is in text format.

>I especially miss the following functions:
>1.) an extension to FIND() and SEARCH() that returns '0' IF string is NOT found, instead of the '#NA!' (greatly eases work with complex searches)
>2.) count number of occurrences of a substring  within a string, e.g.
>       COUNTSUBSTR("I11.0; E11.5;  I25.5", "I") = 2
>3.) SPLIT string into substrings => perform operations on the substrings
>       SPLIT("I11.0; E11.5;  I25.5", ";", IGNORE = " \n\t", SORT = TRUE)
>       returns an array("E11.5", "I11.0", "I25.5")
>       SPLIT("I11.0; E11.5;  I25.5", ";", IGNORE = " \n\t", SORT = TRUE, pos = 1)
>       returns "E11.5"
>       SPLIT("I11.0; E11.5;  I25.5", ";", IGNORE = " \n\t",  RETURN_NUMBER = TRUE)
>       returns (int) "3"
>...
>
>For many more examples, see any text-oriented programming language.

Thanks for your comment!

I'm not fundamentally opposed to adding a few string functions (primarily to the "Large" group), but generally we've only been including functions which are ALREADY implemented in at least ONE spreadsheet application (Excel, Gnumeric, Corel WordPerfect Quattro Pro, KSpread, Lotus 1-2-3, etc.).  There's no end to the number of functions that COULD exist, and the absence of a function from EVERY implementation is a sign that perhaps this isn't as widespread a need as you'd think. Spreadsheets tend not to be used for a lot of text processing (though which is the cause and which is the effect could be argued, I guess).

Also, there's always a risk that "committee invention" will have implementation or usage problems.  Standards are generally better when they stick with what's already in use, and make sure that they're extensible for future experimentation.  Yes, such functions certainly exist in other languages (so the risk is lower), but there's still a risk that a definition would have a hidden problem unless at least ONE supplier has implemented it with the other functions that ARE defined.  #3 in particular presupposes support for arrays of variable size, which certainly NOT all implementations support (nor do they need to for many use cases).  #1 and #2 are easy enough, though I'd like to know what to name #1.
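
To make the three requests concrete, here is a minimal Python sketch of the semantics they seem to ask for. It is an illustration only, not a proposed definition: the names find_or_zero, count_substr, and split_string are hypothetical, and details such as non-overlapping counting and treating IGNORE as characters to strip from each piece are assumptions that a real definition would have to settle.

```python
def find_or_zero(needle: str, haystack: str, start: int = 1) -> int:
    """Return the 1-based position of needle in haystack, or 0 if absent.

    Hypothetical variant of FIND() for request #1; the name is an assumption.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    # str.find returns -1 when nothing matches, so adding 1 yields 0.
    return haystack.find(needle, start - 1) + 1


def count_substr(text: str, sub: str) -> int:
    """Count non-overlapping occurrences of sub in text (request #2)."""
    if not sub:
        raise ValueError("sub must be non-empty")
    return text.count(sub)


def split_string(text: str, sep: str, ignore: str = "", sort: bool = False) -> list[str]:
    """Split text on sep and strip the characters in ignore from each piece.

    Sketch of request #3. The result has variable length, which is exactly
    the array support that not every implementation provides.
    """
    parts = [piece.strip(ignore) for piece in text.split(sep)]
    return sorted(parts) if sort else parts


codes = "I11.0; E11.5;  I25.5"
sorted_codes = split_string(codes, ";", ignore=" \n\t", sort=True)
```

With the example string from the comment, count_substr(codes, "I") gives 2 and sorted_codes holds the three codes in sorted order. The "pos" and "RETURN_NUMBER" variants then reduce to indexing into that list and taking its length.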

It's certainly a reasonable idea.  Is there anyone else who believes we should add these functions at this time?  If so, can you point to a reasonable definition for them?  If there's significant support, let's do it... but if not, let's wait.

Actively-used standards, like OpenDocument, are typically revised occasionally.  So there's reason to believe that there will be a future opportunity, if they aren't included at this time.  In particular, OpenFormula includes a naming scheme that will allow implementations to add new functions that don't interfere with others, and then the ones that become popular can be added to the standard space.
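
As a rough sketch of how such a naming scheme keeps extensions out of the standard space, consider the Python below. It assumes that extension names carry a dotted vendor prefix; the ORG.EXAMPLE prefix, the FunctionRegistry class, and the tiny STANDARD_NAMES set are all hypothetical and are not taken from the specification.

```python
from typing import Callable

# Tiny illustrative subset of standard function names (assumption).
STANDARD_NAMES = {"FIND", "SEARCH", "LEN"}


class FunctionRegistry:
    """Hypothetical registry that accepts only prefixed extension functions."""

    def __init__(self) -> None:
        self._functions: dict[str, Callable] = {}

    def register(self, name: str, func: Callable) -> None:
        key = name.upper()
        # Standard names never contain a dot here, so requiring a dotted
        # prefix means an extension cannot shadow a standard function.
        if "." not in key:
            raise ValueError(f"extension name {name!r} needs a vendor prefix")
        if key in self._functions:
            raise ValueError(f"{name!r} is already registered")
        self._functions[key] = func

    def call(self, name: str, *args):
        return self._functions[name.upper()](*args)


registry = FunctionRegistry()
# An implementation experiments with a new function under its own prefix.
registry.register("ORG.EXAMPLE.COUNTSUBSTR", lambda text, sub: text.count(sub))
```

If a function like this became popular, a later revision of the standard could adopt it under an unprefixed name, while documents that use the prefixed form would keep working.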

--- David A. Wheeler 

