Subject: Re: [docbook-apps] Apostrophe in docbook document
Vincent

Thanks for pointing me to the Unicode Standard 5.2 - I'll convert &apos; to &rsquo;

Ron

Hi Dave

Not sure why I got into this, but I'll push it along a bit.

XML was designed to allow the storage of formatted text in a human- and machine-readable state. When a human does the reading (of the XML text) he can see the &apos; or &rsquo; character in context and guess pretty accurately whether it is an indication of a missing character, a genitive marker or a closing quote. So far I am with you all the way - it doesn't matter in English.

Now look at machine reading. Imagine a linguist wanting to search some text to count:

1. The use of contractions (e.g. isn't versus is not). He wants to find, list and count all contractions. His text editor or little Perl script (he doesn't know regex) looks for &rsquo; and finds what he wants corrupted by lots of extraneous closing quotes and genitive markers. The three logically different functions are represented by the same code.
2. Ditto, except that this time he wants to find quoted strings.
3. Ditto, but this time his interest is in the grammar and he is searching for genitives.
4. Why he might want to distinguish between singular and plural genitives is beyond me. But he might.

I guess I just don't like one symbol with three meanings. Imagine this in your code: you don't need =, == and EQ, one symbol will handle all.

The problem of course is not a DocBook problem, it is in the Unicode tables (and the linguist would probably be using TEI anyway, but it's not a TEI problem either).

In my case all my quotes in XML tags are done on the keyboard (#x27), all my text quotes are <quote>, and all my apostrophe marks and genitives are &apos;, so a simple global edit puts all to rights for me - now that I know to use &rsquo;

Ron

Dave Pawson wrote:
> On 26/01/10 00:53, Ron Catterall wrote:
>
> Beg to differ Ron, English appears not to require more than one?
> Is it simply for your search needs?
>
> The only different one in your previous list is the prime symbol, U+2032.
> The remainder should be the same.
>
> regards

-- 
Ron Catterall Ph.D. D.Sc.
ron@catterall.net
http://catterall.net
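
The search problem described in the message can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sample sentence, the function names and the crude classification rules are all hypothetical and do not come from the thread. It shows that a plain search for U+2019 (the character &rsquo; stands for) returns contractions, genitives and closing quotes mixed together, and that separating them afterwards needs guesswork based on the neighbouring characters.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the character that &rsquo; stands for.
RSQUO = "\u2019"

# Hypothetical sample text: one contraction, one genitive, one closing quote.
SAMPLE = (
    "It isn\u2019t Ron\u2019s book. "
    "He called it \u2018odd\u2019 and left."
)


def count_rsquo(text):
    """Naive search: every U+2019 is counted, whatever its function."""
    return text.count(RSQUO)


def classify(text):
    """Guess the role of each U+2019 from its neighbours.

    This is a heuristic, not a grammar: "it's" and "Ron's" both land in
    "s_form", and a plural genitive such as "dogs'" cannot be told apart
    from a closing quote.
    """
    counts = {"contraction": 0, "s_form": 0, "quote_or_other": 0}
    for i, ch in enumerate(text):
        if ch != RSQUO:
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        nxt2 = text[i + 2] if i + 2 < len(text) else ""
        if prev.isalpha() and nxt.lower() == "s" and not nxt2.isalpha():
            counts["s_form"] += 1          # genitive, or "is"/"has" contraction
        elif prev.isalpha() and nxt.isalpha():
            counts["contraction"] += 1     # e.g. isn't, don't, we'll
        else:
            counts["quote_or_other"] += 1  # closing quote or plural genitive
    return counts


if __name__ == "__main__":
    print(count_rsquo(SAMPLE))
    print(classify(SAMPLE))
```

The naive count treats all three uses alike, and even the classifier leaves two buckets ambiguous, which is the point being made about one code carrying three meanings.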