Subject: Re: [office-comment] Re: Very Weak String Support in ODF
Leonard,

Leonard Mada wrote:
> David A. Wheeler wrote:
>
>> I'm not fundamentally opposed to adding a few string functions
>> (primarily to the "Large" group), but generally we've only been
>> including functions which are ALREADY implemented in at least ONE
>> spreadsheet application (Excel, Gnumeric, Corel Word Perfect Quattro
>> Pro, KSpread, Lotus 1-2-3, etc.). There's no end to the number of
>> functions that COULD exist, and the lack of presence in ANY
>> implementation is a sign that perhaps this isn't as widespread a need
>> as you'd think. Spreadsheets tend to not be used for a lot of text
>> processing (though which is the cause and which is the effect could
>> be argued, I guess).
>
> I like to correct this.
>
> Spreadsheets are *THE MOST* used input medium that researches use in
> biomedical and life-sciences. This is both true of my country and of
> ALL western countries I am aware of. Actually, spreadsheet use will
> likely increase in the future, as more data gets digital. My position
> permits me to fairly accurately predict this.

Ok, then there should be some listing, or a means to derive such a listing, of the functions that are in actual use. Yes?

In other words, simply consulting a string function textbook isn't going to be much help in determining what should be in or out of a formula standard. For example, if there were a listing, by frequency of use, of the string functions used by genome researchers, then an attempt could be made to add some portion of those to a standard.

But realize that standardizing a string function that does not have generally accepted semantics would probably be a bad idea. That is to say, we should not standardize a function that is going to give some people the result they expect but mislead others as to the actual result of the function. It is always possible that someone will get an unexpected result, but that should be the exception rather than the rule.
But I think we should avoid taking sides where there is no clear consensus on the semantics of a particular function. You are probably aware of all the variations in regex syntaxes, including the choices made in XML Schema that are inconsistent with most other regex languages. That sort of variance doesn't help anyone.

So, I think yes, some string functions might attract enough support to be included, but:

1. There needs to be some showing of usage so we can judge between the essential or popular and "one person uses this sometimes" (there is effort involved in adding this sort of thing to a standard), and

2. There needs to be as much information as possible on how to define the function, including references to where more information about the suggested function can be found.

Since I was trained as a text critic, I am not unsympathetic to the need for string functions, but I also think that the more detailed and helpful a request is in that regard, the more likely it is to attract the interest of the committee.

> Of course, custom solutions could replace spreadsheets, BUT then only
> because spreadsheets did NOT correct the primary design flaws.
>
> So, the current stance prohibits the effective use of spreadsheets in
> a big segment of users. The life sciences are NOT the only one
> affected, actually, a much broader segment of researchers are struck
> with these shortcomings, and even commercial businesses. I am working
> in a governmental office and some 30-40% of ALL spreadsheets, done by
> ALL employees, contain significant portions of text. (I mean with text
> that is analysable, NOT just labels or descriptive text!)
>
>> Also, there's always a risk that "committee invention" will have
>> implementation or usage problems. Standards are generally better
>> when they stick with what's already in use, and make sure that
>> they're extensible for future experimentation.
>> Yes, such functions certainly exist in other languages (so the risk
>> is lower), but there's still a risk that a definition would have a
>> hidden problem unless at least ONE supplier has implemented it with
>> the other functions that ARE defined. #3 in particular presupposes
>> support for arrays of variable size, which certainly NOT all
>> implementations support (nor do they need to for many use cases).
>> #1 and #2 are easy enough, though I'd like to know what to name #1.
>
> STANDARDS should NOT depend on the implementation. That would be a
> poor standard. It usually should be the other way round. ;-)

Well, actually there are two quite legitimate positions in that regard.

One position, obviously the one you prefer, is that standards should be out in front of practice. Examples of that include the processor standards that are fixed years in advance of the actual design and production of microprocessors.

The other, at least as well represented as the first, is that standards codify existing practice so that everyone does some activity the same way. That usually happens after a variety of methods have been tried with varying success: one emerges as a de facto standard, with some variation, and a formal standard is then made to fix all of the details and enhance interoperability.

OpenDocument is something of a mix of the two. There are innovations forthcoming in metadata support, for example, but as I understand the formula work (I am not actually a participant in that SC), it is a question of imposing some minimal order on the chaos that is the realm of formulas. Note that I said *minimal* order. There was no attempt to ferret out every possible formula or function.

It might be helpful to remember that, with few exceptions, most of the people working on OpenDocument have day jobs that are not primarily related to standards work. So the effort doesn't go around hunting for things to work on.
;-) On the other hand, suggestions, particularly those with enough detail to both justify and assist in adding something to the standard, are very likely to attract favorable attention.

Hope you are having a great weekend!

Patrick

--
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!