[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Proposal - dir attribute
Hi all, Please review my updated proposal. I've added best practice sections for users and vendors/implementers. Kevin will be adding a section on output/rendering expectations and best practice. We would like to finalize and approve our dir attribute proposal for submission to the DITA TC during Monday's SC meeting. Please post your feedback to the list. Thanks, Gershon --- Gershon L Joseph Member, OASIS DITA and DocBook Technical Committees Director of Technology and Single Sourcing Tech-Tav Documentation Ltd. office: +972-8-974-1569 mobile: +972-57-314-1170 http://www.tech-tav.comTitle: Dir Attribute Proposal
While most languages are written in text where characters flow from left to right, Hebrew and many Arabic languages are written from right to left. In some languages, including Hebrew and Arabic, numbers and other content is written left to right. Also, a multilingual document containing, for example, English and Hebrew, contains some text that flows left to right and other text that flows right to left.
Text directionality is controlled by the following:
In most cases, authors need to use dir="rtl|ltr" to ensure punctuation surrounding a RTL phrase inside a LTR element is rendered correctly. In order to override the direction of strongly typed Unicode characters (most characters that apply to a language except for punctuation, spaces and digits), the author would need to use dir="lro|rlo". The use of the dir attribute and the Unicode algorithm is clearly explained in the article [REF 1]. The referenced article has several examples on the use of dir="rtl|ltr". There is no example on the use of dir="lro|rlo", though it can be inferred from the example using the bdo element (the old W3C way of overriding the entire Unicode bidirectional algorithm; the now favor using the override values on the dir attribute).
Text direction cannot be sufficiently specified by the xml:lang attribute alone, because numeric and punctuation characters are input, and rendered, according to the Unicode bidirectional algorithm, which often cannot correctly determine the correct direction of the characters.
From the HTML 4.0 spec:
Add a new attribute called "dir", as follows:
This attribute, when set to "ltr" or "rtl", overrides the default Unicode bidirectional algorithm on neutral characters (such as spaces and punctuation). These values are usually used to ensure punctuation is applied correctly in a phrase.
This attribute, when set to "lro" or "rlo", overrides the default Unicode bidirectional algorithm on all characters. These values are usually used to force a direction on all characters contained in a phrase.
This attribute is usually used in conjunction with the xml:lang attribute, to override the default Unicode bidirectional algorithm that applies to the specified language.
This attribute is available on all elements within DITA.
Additional rules to be documented:
<p dir="ltr"> The Hebrew word for "Hebrew" is <ph xml:lang="he-il">עברית</ph>, but since Hebrew letters have intrinsic right-to-left directionality, I had to type the word starting from the letter "ע", i.e. <ph xml:lang="he-il" dir="lro">תירבע</ph>. </p>
Many good examples are provided in [REF 1].
While many of the issues can be resolved using the so-called Unicode control characters (hidden characters with strong directionality of either LTR or RTL), the W3C discourages use of the control characters (see [REF 1]). Our documentation of the dir attribute should probably include something like "When directionality issues can be resolved by either use of the dir attribute or use of Unicode control characters (LRM, RLM) , use of the dir attribute is strongly recommended."
The Unicode Bidirectional algorithm provides for various levels of bidirectionality, as follows:
For most authoring needs, the "ltr" and "rtl" values are sufficient. Only when the desired effect cannot be achieved using these values, should the override values be used.
While the Unicode standard includes hidden markers for directionality without the need for markup, these markers should not be used. It is strongly recommended to mark up the document using the dir attribute to set directionality. Using markup instead of the Unicode markers has the following advantages:
Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, should fully support the Unicode algorithm to correctly implement the script and directionality for each language used in the document. The recommended practice is to write all directionality markers via XML markup and not to use the Unicode Bidirectional markers. When reading XML markup that embeds the Unicode Bidirectional markers, these markers should be replaced with markup when the document is saved.