Subject: Re: [office] word-count (was Re: [office] Data Grid Size element proposal)
Rob,

robert_weir@us.ibm.com wrote:
> "Andreas J. Guelzow" <aguelzow@math.concordia.ab.ca> wrote on 11/24/2008
> 05:59:28 PM:
>
>> But why is this information saved in the file?
>
> The reason is probably lost in the mists of time. This information has
> always been stored in documents going back to precursor binary formats
> since the late 1980's. It may come from common librarial practice, where
> records of hard-copy resources would include the size of the book (in
> inches and in pages) as well as title, author, subject, etc. My guess is
> they added word-count to electronic documents as an analogue to that
> practice. Remember in those days as well, the document format itself
> might be proprietary and undocumented, but in Windows at least it was
> common to store the metadata as OLE Properties, which could be quickly
> retrieved without understanding the underlying document format. So that
> would be useful for search engines, document servers, etc., and any other
> programs that operated on the document metadata.
>
> But that is all in the past. The same constraints don't necessarily exist
> today. In particular, with a standard document format, the entire
> contents of the document is open for reading/scanning, not just the
> metadata.
>
> On the other hand, I don't see any reason to remove these features from
> ODF, since there may be applications that use them.

I would not advocate their removal, but I have posted notes in the latest drafts about the need to specify definitions to accompany these counts.

Simply saying word and character count is insufficient once you move beyond English. Saying that they are locale specific would be better, although that does seem to push the question of interoperability off into locale.

Which may be the best we can do. We could expend a lot of resources trying to duplicate what is already standardized for some locales and probably not do as good a job for locales where such standards don't exist.
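To illustrate why a bare "word count" or "character count" is underspecified, here is a minimal Python sketch. It is not taken from any ODF draft or implementation; the function names and sample strings are hypothetical, chosen only to show that equally plausible definitions give different numbers for the same text.

```python
import re
import unicodedata


def words_by_whitespace(text):
    # Definition A (assumed): a word is any run of non-whitespace characters.
    return len(text.split())


def words_by_word_chars(text):
    # Definition B (assumed): a word is any run of letters, digits or "_".
    return len(re.findall(r"\w+", text))


def chars_as_code_points(text):
    # Definition C (assumed): a character is one Unicode code point.
    return len(text)


def chars_after_nfc(text):
    # Definition D (assumed): count code points after NFC normalization,
    # so that "e" + combining acute accent counts as a single character.
    return len(unicodedata.normalize("NFC", text))


english = "Don't over-think it"
chinese = "我喜欢读书"          # written without spaces between words
accented = "cafe\u0301"        # "café" spelled with a combining accent

print(words_by_whitespace(english), words_by_word_chars(english))  # 3 5
print(words_by_whitespace(chinese), words_by_word_chars(chinese))  # 1 1
print(chars_as_code_points(accented), chars_after_nfc(accented))   # 5 4
```

Two applications that each choose one of these definitions would store different counts for the same document, and neither whitespace nor word-character runs yields a meaningful word count for text written without spaces between words.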
I do think we need to specify that these and other features (sort ascending/descending, for example) are explicitly locale specific. Personally I find that unsatisfactory, as different applications will have different levels of locale support and that has a negative impact on interoperability. But if we have to draw a boundary around our concerns as a format, I think using locale is at least defensible if not really satisfactory.

Hope you are having a great day!

Patrick

PS: I will be off line (traveling) most of the rest of today. Back online tomorrow.

--
Patrick Durusau
patrick@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)