Subject: Re: [office-comment] DISTINCT Values
Leonard,

Another interesting comment!

Hmmm, but isn't it true that what OpenDocument should define are the operators that work with DISTINCT (and other functions) such that you can write your own functions?

What concerns me is that no matter how large the eventual list, we are going to miss that special function that someone has to have. If we give them the ability to use standard operators with which to write their specialized functions, that lessens the load on us and gives users greater freedom.

Hope you are having a great day!

Patrick

Leonard Mada wrote:

> I strongly miss a function to return the number of DISTINCT values
> existent in a given cell range. To my knowledge, this functionality is
> missing in every spreadsheet application, although most of the
> research involves such analysis.
>
> I would be further interested to perform some operations using these
> distinct values.
>
> Functions:
>
> DISTINCT( 'cell_range', AS.TEXT = TRUE, IGNORE.CASE = TRUE)
>    returns number of distinct strings in the cell range
>
> DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = 0)
>    returns number of different values
>
> DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = 0.5, ORDER = "ASCENDING")
>    returns number of different values; values within TOLERANCE are
>    considered EQUAL
>    - values are ranked first using the specified ORDER
>    - IF( x[i] is within TOLERANCE of x[i-1]), the 2 values are
>      considered equal
>
> DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = '5%', ORDER = "ASCENDING")
>    returns number of different values; values within TOLERANCE are
>    considered EQUAL
>    - values are ranked first using the specified ORDER
>    - IF( x[i] is within TOLERANCE of x[i-1]), the 2 values are
>      considered equal
>    - the absolute value for TOLERANCE is computed as x[i-1] * TOLERANCE
>
> DISTINCT( 'cell_range', AS.TEXT = FALSE, 'cell_range2')
>    returns number of different values within 'cell_range'
>    - 'cell_range2' describes the bounds/intervals used for splitting the
>      initial data (i.e. for splitting 'cell_range')
>    - the values within 'cell_range2' are ranked
>    - IF any value from 'cell_range' < MIN('cell_range2') => this is
>      first group
>    - any value from 'cell_range' is within
>      RANKED(cell_range2) > RANKED(cell_range2) => next group
>    - ...
>    - any value from 'cell_range' > MAX('cell_range2') => this is the
>      last group
>
> Of course, DISTINCT() is just one aspect of the analysis. Actually, I
> am more interested in doing specific calculations based on these
> distinct groups (like SUM(), COUNT(), ...). I will describe such
> calculations in a later post.
>
> Sincerely,
>
> Leonard Mada
>
> This publicly archived list offers a means to provide input to the
> OASIS Open Document Format for Office Applications (OpenDocument) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: firstname.lastname@example.org
> Unsubscribe: email@example.com
> List help: firstname.lastname@example.org
> List archive: http://lists.oasis-open.org/archives/office-comment/
> Feedback License: http://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
> Committee:
> http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office

--
Patrick Durusau
Patrick@Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005

Topic Maps: Human, not artificial, intelligence at work!
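
For readers trying to pin down the semantics proposed in the quoted message, the three DISTINCT variants might be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a definition from OpenDocument or from the original post: the function names (`distinct_text`, `distinct_numeric`, `distinct_binned`) and their parameters are hypothetical, the tolerance test compares each ranked value with its immediate predecessor as the post describes, and a value that exactly equals a bound is assumed to fall into the higher group, a point the post leaves open.

```python
from bisect import bisect_right


def distinct_text(values, ignore_case=True):
    """Count distinct strings (AS.TEXT = TRUE), optionally ignoring case."""
    keys = {str(v).casefold() if ignore_case else str(v) for v in values}
    return len(keys)


def distinct_numeric(values, tolerance=0.0, relative=False, descending=False):
    """Count distinct numbers (AS.TEXT = FALSE) with an optional TOLERANCE.

    Values are ranked first; x[i] is treated as equal to x[i-1] when the
    two differ by no more than the tolerance. With relative=True the
    absolute tolerance is computed as |x[i-1]| * tolerance (e.g. 0.05 for '5%').
    """
    ranked = sorted(values, reverse=descending)
    if not ranked:
        return 0
    groups = 1
    for prev, cur in zip(ranked, ranked[1:]):
        limit = abs(prev) * tolerance if relative else tolerance
        if abs(cur - prev) > limit:
            groups += 1  # cur starts a new distinct group
    return groups


def distinct_binned(values, bounds):
    """Count how many intervals defined by the ranked bounds are occupied.

    Group 0 holds values below MIN(bounds); the last group holds values
    at or above MAX(bounds). Assumption: a value equal to a bound is
    placed in the higher group.
    """
    ranked = sorted(bounds)
    return len({bisect_right(ranked, v) for v in values})


print(distinct_text(["a", "A", "b"]))                      # case-insensitive count
print(distinct_numeric([1, 1.4, 3, 3.2, 10], tolerance=0.5))
print(distinct_binned([1, 5, 15, 25, 100], [10, 20]))
```

Because each value is compared only with its ranked neighbour, a long chain of closely spaced values collapses into a single group even when its endpoints are far apart. Whether that is the intended behaviour is something the proposal would need to state explicitly.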