Subject: RE: DOCBOOK: Attributes for text direction and language
> Thank you very much for your reply. You are right, but the problem
> is that numbers are neither LTR nor RTL intrinsically, and are
> interpreted either LTR or RTL depending on the direction of the
> text preceding them.
>

I just took a look at the Unicode bidirectional algorithm as described in:
http://www.unicode.org/unicode/reports/tr9/

It is very complex and I don't claim to understand it in its full glory,
but there are codes that provide explicit directional information, and
the handling of numbers is covered in some detail in the algorithm. I
can't say for sure if it will handle what you're doing, but since Hebrew
was explicitly considered in the development of the algorithm, I think
there's a good chance it will do what you need (if browsers and
stylesheets use the algorithm).

>
> I asked what I asked because I'm planning to convert my documents,
> mostly in the field of (Hebrew) linguistics, from Word format to
> DocBook XML. Though DocBook may not have been originally meant for
> linguistic documentation, I thought (and still think) it
> might be usable for this purpose without customizing its DTD.
>

I think you're right and that you should be able to do this without
customizing the DTD. You may want to check out the Unicode organization's
web site. They have a lot of information and at least one archived
mailing list that might give you some information if you decide to
pursue a Unicode approach.

Good luck,
Dick

>
> -----Original Message-----
> From: Tsuguya Sasaki [mailto:ts@ts-cyberia.net]
> Sent: Monday, June 17, 2002 2:45 PM
> To: DOCBOOK
> Subject: Re: DOCBOOK: Attributes for text direction and language
>
> > It's been a while since I looked at this, so take
> > it with a grain of salt, but I think that if you
> > use Unicode (or UTF-8), the codeset itself provides
> > the information you need to render right to left
> > and left to right text, including some characters
> > that act as cues for cases where there might be
> > ambiguity. There is a technical report from the
> > Unicode consortium that discusses this:
> >
> > http://www.unicode.org/unicode/reports/tr9/
> >
> > While that seems like the cleanest way to handle
> > RTL and LTR text, I don't know whether browsers
> > and/or stylesheets use this information.
>
> Thank you very much for your reply. You are right, but the problem
> is that numbers are neither LTR nor RTL intrinsically, and are
> interpreted either LTR or RTL depending on the direction of the
> text preceding them.
>
> So things can become quite complex and messed up, for example,
> when you try to start an LTR paragraph with RTL text chunks followed
> by numbers; it's interpreted as an RTL paragraph, and the numbers come
> to the right of the RTL text chunks, which is the correct visual order
> in RTL paragraphs.
>
> I asked what I asked because I'm planning to convert my documents,
> mostly in the field of (Hebrew) linguistics, from Word format to
> DocBook XML. Though DocBook may not have been originally meant for
> linguistic documentation, I thought (and still think) it
> might be usable for this purpose without customizing its DTD.
>
> Tsuguya Sasaki
> http://www.ts-cyberia.net/
>
> PS: Please reply only to the list. If you send a reply to me and to the
> list, I receive two copies of the same message.
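
As a rough illustration of the point discussed in this thread (numbers have no strong direction of their own, while certain Unicode characters carry explicit directional information), here is a minimal sketch that was not part of either message. It uses Python's standard unicodedata module to look up the bidirectional class the Unicode character database assigns to a few characters. The names SAMPLES and bidi_classes are made up for this example. It only reports character classes. It does not run the full algorithm from the technical report, and it says nothing about what a particular browser or stylesheet actually does.

```python
import unicodedata

# Hypothetical sample table for illustration only: a label for each
# character whose bidirectional class we want to inspect.
SAMPLES = {
    "Latin letter a": "a",
    "Hebrew letter alef": "\u05d0",
    "digit 1": "1",
    "LEFT-TO-RIGHT MARK": "\u200e",
    "RIGHT-TO-LEFT MARK": "\u200f",
}


def bidi_classes(samples):
    """Return a dict mapping each label to its character's bidi class."""
    return {
        label: unicodedata.bidirectional(ch)
        for label, ch in samples.items()
    }


if __name__ == "__main__":
    # "L" and "R" are strong types; "EN" (European Number) is a weak type,
    # which is why digits take their placement from the surrounding text.
    for label, cls in bidi_classes(SAMPLES).items():
        print(f"{label}: {cls}")
```

The two marks, U+200E and U+200F, are invisible characters with a strong direction, so they are one kind of explicit cue that can be placed next to a number. Whether that gives the wanted visual order in a given DocBook toolchain or browser would still have to be tested.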