Subject: Re: [docbook-apps] Apostrophe in docbook document
Hi Ron

On 26/01/10 20:42, Ron Catterall wrote:
> Hi Dave
>
> Not sure why I got into this, but I'll push it along a bit.
>
> XML was designed to allow the storage of formatted text in a human and
> machine readable state.
>
> When a human does the reading (of the XML text) he can see the &apos or
> &rsquo character in context and guess pretty accurately whether it is an
> indication of a missing character, a genitive marker or a closing quote.
> So far I am with you all the way - it doesn't matter in English.

And when the formatted output is presented to the human which Unicode
code point is used is rarely material.

>
> Now look at machine reading:
> Imagine a linguist wanting to search some text to count
> 1. The use of contractions (e.g. isn't versus is not ). He wants to find
> list and count all contractions. His text editor or little Perl script
> (he doesn't know regex) looks for &rsquo and finds what he wants
> corrupted by lots of extraneous closing strings and genitive markers.
> The three logically different functions are represented by the same code.
> 2. ditto except that this time he wants to find quoted strings
> 3. ditto but this time his interest is in the grammar and he is
> searching for genitives
> 4. why he might want to distinguish between singular and plural
> genitives is beyond me. But he might.

My initial reaction is who the heck is going to mark this up -
accurately and with the knowledge of English and Unicode to do
a good job of it. Someone in Edinburgh perhaps?
http://www.ling.ed.ac.uk/

>
> I guess I just don't like one symbol with three meanings. Imagine this
> in your code, you don't need = == and EQ, one symbol will handle all.

Yep. I'm doing that to please the compiler writer I guess.
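To make the machine-reading point concrete, here is a minimal Python sketch. The sample sentence and the name naive_hits are made up for illustration and are not from Ron's post. It searches for U+2019, the character that &rsquo; stands for, and shows a contraction, two genitives and a closing quote coming back in one undifferentiated list.

```python
import re

# U+2019 RIGHT SINGLE QUOTATION MARK, the character &rsquo; stands for.
RSQUO = "\u2019"


def naive_hits(text: str) -> list[str]:
    """Return every token that touches U+2019, with no attempt to classify it."""
    return re.findall(rf"\w*{RSQUO}\w*", text)


# Hypothetical sample: one contraction, a singular genitive,
# a plural genitive and a closing quote.
sample = (
    "It isn\u2019t Ron\u2019s fault, and the authors\u2019 view was "
    "\u2018leave it alone\u2019."
)

hits = naive_hits(sample)
print(len(hits), hits)
```

The search cannot tell the four uses apart; any classification would have to come from markup (such as <quote>) or from linguistic analysis layered on top.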
>
> The problem of course is not a Docbook problem, it is in the UTF tables
> (and the linguist would probably be using TEI anyway, but it's not a TEI
> problem either)

Your proposal is a solution, not a problem Ron :-)

>
> In my case all my quotes in XML tags are done on the keyboard #x27, all
> my text quotes are <quote>, all my apostrophe marks and genitives are
> &apos so a simple global edit puts all to rights for me - now that I
> know to use &rsquo

Suggestion: if you're using Linux, look into keyboard mappings
and use... perhaps your numeric keypad to generate this 'suite' for you
using a single keypress?

Just a thought.

regards

-- 
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
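For what it's worth, the "simple global edit" Ron mentions could be sketched in Python roughly as below. The function name apos_to_rsquo and the sample markup are hypothetical, and the sketch assumes the source follows Ron's convention: keyboard ' only as an attribute delimiter, <quote> for quotations, and the &apos; entity reference for apostrophes and genitives. It also assumes the DTD in use declares &rsquo;; if not, the numeric reference &#x2019; names the same character.

```python
def apos_to_rsquo(xml_source: str, replacement: str = "&rsquo;") -> str:
    """Replace every &apos; entity reference in the raw XML text.

    Works on the unparsed source, so literal ' characters used to
    delimit attribute values are left untouched.
    """
    return xml_source.replace("&apos;", replacement)


# Hypothetical fragment written in the convention described above.
before = "<para role='note'>Ron&apos;s text isn&apos;t <quote>quoted</quote>.</para>"
after = apos_to_rsquo(before)
print(after)
```

One caveat: an &apos; that appears inside an attribute value would be rewritten too, so it is worth checking for those before running an edit like this across a whole document.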