Subject: Re: [office] word-count
Hi Robert,

On Monday, 2008-11-24 17:56:13 -0500, Robert Weir wrote:

> Also, I'm not sure that the concept of "word" is clearly stated. It is
> the kind of thing linguists go crazy about. We probably want to state
> explicitly that the word-count refers to "orthographic words", which are
> the groups of letters delimited by whitespace. This works fine for modern
> languages. The ancient Greeks wrote their texts without spaces between
> words. Nothing we can do about that. Even human experts have arguments
> over where the word breaks go in those texts. So we can't expect a word
> processor to figure it out.

There are also "modern" languages that don't use whitespace at all, such
as Khmer. Writers _may_ insert the ZWSP U+200B character to help a word
processing application. The situation for CTL languages may be completely
different from what most are used to or would call "normal".

But also in Western languages there may be differences in whether
constructs such as here-it-is count as one word or more. This may vary
between languages.

I think we should not define how word count is to be computed, and not
state anything that may be correct for some languages but wrong for
others.

  Eike

--
OpenOffice.org / StarOffice Calc core developer and i18n transpositionizer.
SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
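
To make the ambiguity described in the message concrete, here is a minimal Python sketch of three plausible counting rules. The function names and sample strings are hypothetical and chosen only for illustration; none of them comes from the ODF specification or from any word processor, and the Latin placeholder text merely stands in for a script that uses ZWSP as an optional break hint.

```python
import re

ZWSP = "\u200b"  # ZERO WIDTH SPACE, the optional break hint mentioned in the message


def count_whitespace_words(text: str) -> int:
    """Count "orthographic words": runs of characters delimited by whitespace.

    Python's str.split() does not treat U+200B as whitespace.
    """
    return len(text.split())


def count_with_zwsp(text: str) -> int:
    """Like the whitespace rule, but ZWSP also ends a word (an assumed rule)."""
    return len([w for w in re.split(r"[\s\u200b]+", text) if w])


def count_splitting_hyphens(text: str) -> int:
    """Additionally treat '-' as a delimiter (another assumed rule)."""
    return len([w for w in re.split(r"[\s\u200b\-]+", text) if w])


if __name__ == "__main__":
    hyphenated = "here-it-is works"
    hinted = "alpha" + ZWSP + "beta" + ZWSP + "gamma"  # placeholder text
    for sample in (hyphenated, hinted):
        print(
            count_whitespace_words(sample),
            count_with_zwsp(sample),
            count_splitting_hyphens(sample),
        )
```

The three rules disagree on both samples, which is the practical consequence of the point made in the message: any single definition of "word" would be right for some texts and wrong for others.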