Subject: RE: [dita] [dita-translation] TC/DITA/Translation Subcommittee Proposals


Here are some additional notes on DIR 

-----Original Message-----
From: JoAnn Hackos [mailto:joann.hackos@comtech-serv.com] 
Sent: Tuesday, March 14, 2006 8:04 AM
To: Grosso, Paul; dita@lists.oasis-open.org
Subject: RE: [dita] [dita-translation] TC/DITA/Translation Subcommittee
Proposals

Paul et al,
The SC had a considerable debate about the LRO and RLO values. I've
enclosed the last series of emails for you and others to review. I've
also asked some of the experts in this to explain the decision.  

Unfortunately, Gershon Joseph cannot be on the call today. He has been
leading the discussions. He has also been testing XMetaL's new BIDI
release (or pre-release).

JoAnn

-----Original Message-----
From: Grosso, Paul [mailto:pgrosso@ptc.com] 
Sent: Tuesday, March 14, 2006 7:49 AM
To: JoAnn Hackos; dita@lists.oasis-open.org
Subject: RE: [dita] [dita-translation] TC/DITA/Translation Subcommittee
Proposals

 
> -----Original Message-----
> From: JoAnn Hackos [mailto:joann.hackos@comtech-serv.com] 
> Sent: Tuesday, 2006 March 14 8:26
> To: dita@lists.oasis-open.org
> Subject: [dita] [dita-translation] TC/DITA/Translation 
> Subcommittee Proposals
>  
> 
> From: JoAnn Hackos, chair DITA/Translation Subcommittee 
>  
> The DITA/Translation Subcommittee approved the following proposals to
> the DITA TC on March 13, 2006.
>  
> DIR Attribute
> Proposal: That the DITA 1.1 specification include the DIR attribute as a
> universal attribute with the values of LTR, RTL, LRO, and RLO. No
> default value is to be specified for the DITA DTD.
>
>  
> Discussion: The DIR attribute is used by authors of languages such as
> Hebrew and Arabic to ensure correct directionality on the output,
> especially when the standard directionality has to be modified to
> accommodate some special use of the language. The reason to include it
> is to ensure that tools for authoring and for transforms generate the
> correct directionality. There was discussion that the results of this
> would often be unpredictable and produce different effects for different
> browsers. The SC will now work on a statement of best practices for
> authors and tools vendors to develop a way to handle the dir attribute
> properly.
>  

I agree that some more explanation will be necessary
before we can agree to put this in the DITA spec.  

In particular, while most people may know about LTR and 
RTL (since it is part of HTML), many may not know what 
the processing expectations are for LRO and RLO (unless 
they get into the details of bidi-override in either the 
CSS or XSL-FO specifications).  The subject of language 
direction and bidi-override is complex enough without 
forcing people to read the entire CSS or XSL-FO spec plus 
the Unicode spec just to be able to use the DITA dir attribute.

paul




--- Begin Message ---
Hi all,

Here is my revised proposal. Please review it and raise any issues via
email this week so we can resolve them before the next SC meeting.

Best Regards,
Gershon

---
Gershon L Joseph
Member, OASIS DITA and DocBook Technical Committees
Director of Technology and Single Sourcing
Tech-Tav Documentation Ltd.
office: +972-8-974-1569
mobile: +972-57-314-1170
http://www.tech-tav.com

Dir Attribute Proposal


1. Background

While most languages are written with characters that flow from left to right, Hebrew and many languages written in the Arabic script are written from right to left. In some of these languages, including Hebrew and Arabic, numbers and certain other content are written left to right. Also, a multilingual document containing, for example, English and Hebrew, contains some text that flows left to right and other text that flows right to left.

Text directionality is controlled by the following:

  1. xml:lang attribute on the document element or, if not specified, default language assumed by the processor. Directionality is determined by the Unicode bidirectional algorithm for this language.

  2. xml:lang attribute on any element that overrides the inherited language. Again, directionality is determined by the Unicode bidirectional algorithm for the specified language.

  3. dir="ltr|rtl" attribute on an element that overrides the inherited direction (as determined by dir on a parent element or either specified or inferred xml:lang on a parent element). The specified direction overrides the Unicode bidirectional algorithm only on neutral Unicode characters (e.g. spaces and punctuation) in the element's content.

  4. dir="lro|rlo" attribute on an element. The specified direction overrides the Unicode bidirectional algorithm on all Unicode characters in the element's content.
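
The difference between strong and neutral characters in items 3 and 4 can be inspected directly. The following is a minimal sketch, not part of the proposal: it uses Python's standard unicodedata module to report the bidirectional class that the Unicode Character Database assigns to each character. The helper name bidi_classes is hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Strong characters carry their own direction:
#   'L'  = left-to-right letter
#   'R'  = right-to-left letter (e.g. Hebrew)
#   'AL' = right-to-left Arabic letter
# Neutral and weak characters take their direction from context, which is
# where dir="ltr|rtl" has an effect:
#   'WS' = whitespace, 'ON' = other neutral, 'EN' = European number
sample = "a\u05e2\u0639 !1"  # Latin a, Hebrew ayin, Arabic ain, space, '!', '1'
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

Only the override values (lro, rlo) are meant to change how the characters classed as L, R, or AL are ordered.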

In most cases, authors need to use dir="rtl|ltr" to ensure that punctuation surrounding an RTL phrase inside an LTR element is rendered correctly. To override the direction of strongly typed Unicode characters (most characters that belong to a language, as opposed to punctuation, spaces and digits), the author would need to use dir="lro|rlo". The use of the dir attribute and the Unicode algorithm is clearly explained in the article [REF 1]. The referenced article has several examples of the use of dir="rtl|ltr". There is no example of the use of dir="lro|rlo", though it can be inferred from the example using the bdo element (the old W3C way of overriding the entire Unicode bidirectional algorithm; the W3C now favors using the override values on the dir attribute).

Text direction cannot be sufficiently specified by the xml:lang attribute alone, because numeric and punctuation characters are input, and rendered, according to the Unicode bidirectional algorithm, which often cannot determine the correct direction of these characters.

From the HTML 4.0 spec:

The dir attribute specifies the directionality of text: left-to-right (dir="ltr", the default) or right-to-left (dir="rtl"). Characters in Unicode are assigned a directionality, left-to-right or right-to-left, to allow the text to be rendered properly. For example, while English characters are presented left-to-right, Hebrew characters are presented right-to-left. Unicode defines a bidirectional algorithm that must be applied whenever a document contains right-to-left characters. While this algorithm usually gives the proper presentation, some situations leave directionally neutral text and require the dir attribute to specify the base directionality. Text is often directionally neutral when there are multiple embeddings of content with a different directionality. For example, an English sentence that contains a Hebrew phrase that contains an English quotation would require the dir attribute to define the directionality of the Hebrew phrase. The Hebrew phrase, including the English quotation, would be contained within a ph element with dir="rtl".

2. Specification changes

Add a new attribute called "dir", as follows:

dir="ltr|rtl|lro|rlo"

This attribute, when set to "ltr" or "rtl", overrides the default Unicode bidirectional algorithm on neutral characters (such as spaces and punctuation). These values are usually used to ensure punctuation is applied correctly in a phrase.

This attribute, when set to "lro" or "rlo", overrides the default Unicode bidirectional algorithm on all characters. These values are usually used to force a direction on all characters contained in a phrase.

This attribute is usually used in conjunction with the xml:lang attribute, to override the default Unicode bidirectional algorithm that applies to the specified language.

This attribute is available on all elements within DITA.

Additional rules to be documented:

  • When the dir attribute is set on an element, it remains in effect for the duration of the element and all child elements. Setting the dir attribute on a nested element overrides the inherited value.

  • If the document element does not specify the dir attribute, then if the document element specifies the xml:lang attribute, the Unicode Bidirectional Algorithm must be applied to the specified language. If neither xml:lang nor dir attributes are set on the document element, the processor must assume a language and the direction must be inferred from the Unicode Bidirectional Algorithm applied to the default language.

  • The dir attribute can also be used to specify the direction of non-textual content, such as tables and lists. In the case of <table dir="rtl">, the columns flow from right to left. In the case of <ul dir="rtl"> or <ol dir="rtl">, the list decoration (bullets or numbers) appears on the right of the screen/page and the <li> content flows from right to left.
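
The inheritance rule in the first bullet can be sketched as a short tree walk. This is an illustration only, assuming a processor that resolves dir purely by nearest-ancestor lookup; the function name effective_dirs and the sample markup are hypothetical, and the fallback to xml:lang is represented simply as None.

```python
import xml.etree.ElementTree as ET

def effective_dirs(element, inherited=None, out=None):
    """List (tag, dir) pairs in document order.

    An element's own dir attribute wins; otherwise the value comes from
    the nearest ancestor that sets one. None means no dir applies, so
    direction would fall back to xml:lang or the processor default.
    """
    if out is None:
        out = []
    current = element.get("dir", inherited)
    out.append((element.tag, current))
    for child in element:
        effective_dirs(child, current, out)
    return out

doc = ET.fromstring(
    "<body>"
    '<p dir="rtl"><ph>a</ph><ph dir="ltr"><b>b</b></ph></p>'
    "<p>c</p>"
    "</body>"
)
for tag, direction in effective_dirs(doc):
    print(tag, direction)
```

In the sample, the first ph inherits rtl from its parent p, the second ph overrides it with ltr (which its child b then inherits), and the second p has no dir in effect.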

Example:

<p dir="ltr">
The Hebrew word for "Hebrew" is <ph xml:lang="he-il">עברית</ph>,
but since Hebrew letters have intrinsic right-to-left directionality,
I had to type the word starting from the letter "ע",
i.e. <ph xml:lang="he-il" dir="lro">תירבע</ph>.
</p>

Many good examples are provided in [REF 1].

While many of the issues can be resolved using the so-called Unicode control characters (hidden characters with strong directionality of either LTR or RTL), the W3C discourages use of the control characters (see [REF 1]). Our documentation of the dir attribute should probably include something like "When directionality issues can be resolved by either use of the dir attribute or use of Unicode control characters (LRM, RLM), use of the dir attribute is strongly recommended."
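
As a rough sketch of how a tool might support that recommendation, the snippet below locates Unicode directional control characters in a string so that an author can be prompted to use the dir attribute instead. It is an assumption about one possible check, not something the proposal specifies; the names BIDI_CONTROLS and find_bidi_controls are hypothetical, and the set includes the embedding and override characters in addition to the LRM and RLM marks named above.

```python
import unicodedata

# Directional control characters: the two marks named in the text, plus
# the embedding/override characters and the character that ends them.
BIDI_CONTROLS = {
    "\u200e",  # LEFT-TO-RIGHT MARK (LRM)
    "\u200f",  # RIGHT-TO-LEFT MARK (RLM)
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}

def find_bidi_controls(text):
    """Return (index, Unicode name) for each control character in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]

# An RLM hidden between a word and the punctuation that follows it.
print(find_bidi_controls("abc\u200f!"))
```

A check like this only reports where the hidden characters occur; deciding which element should carry dir in their place remains an authoring decision.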

--- End Message ---

