Subject: MEETING MINUTES -- 3 April 2006 -- DITA TRANSLATION SUBCOMMITTEE
Hi all,

Here are the minutes from today's meeting, as well as the final dir and xml:lang proposals, updated to reflect today's discussion. I'll submit our proposals shortly to the DITA TC for inclusion in tomorrow's TC meeting.

Best Regards,
Gershon

---
Gershon L Joseph
Member, OASIS DITA and DocBook Technical Committees
Director of Technology and Single Sourcing
Tech-Tav Documentation Ltd.
office: +972-8-974-1569
mobile: +972-57-314-1170
http://www.tech-tav.com
MEETING MINUTES -- 3 April 2006 -- DITA TRANSLATION SUBCOMMITTEE
(Minutes taken by Gershon Joseph <gershon@tech-tav.com>)

Date: Monday, 3 April 2006
Time: 08:00 - 09:00 PST

DITA Translation Subcommittee resources:
- SC Web site: http://www.oasis-open.org/apps/org/workgroup/dita-translation/index.php
- Mailing list: dita-translation@lists.oasis-open.org
- Non-OASIS members please email Gershon or Don and we'll post on your behalf

Roll Call
- Present: Robert, Sandi, Don, Kevin, Nancy, Richard Ishida, Gershon, Felix, Yves
- Regrets: JoAnn

Review/approve minutes from previous meeting (27 March 2006)
- http://lists.oasis-open.org/archives/dita-translation/200603/msg00048.html
- Robert moves to accept the minutes as read, Don accepts, no objections, approved by acclamation.

Action Items:
- Discussion of the dir attribute proposal (Gershon Joseph, Kevin Farwell, Richard Ishida)
- Use cases for the xml:lang attribute (submitted by JoAnn Hackos)

New Business:
The goal is to close on these proposals based on the discussion this week and submit the detailed explanations to the DITA TC for the 1.1 specification.

- xml:lang attribute proposal:
  - Gershon read changed sections
  - Richard -- locale should be optional throughout the proposal.
  - Robert -- "...the element containing text in the alternate language..." should be "...the element containing the text or structure in the alternate language..."
  --ACTION-- Gershon to make the above two changes to the proposal and submit the proposal to DITA TC for discussion and approval.

- dir attribute proposal:
  - Gershon read changed sections
  - Don -- There is no <ditabase> element. Proposal should say <dita> element in the ditabase DTD.
  - Richard -- Default value should be "ltr" rather than leaving it to application vendors.
    - Robert -- Should be similar to xml:lang, where it's left open
    --DISCUSSION...
    - Consensus to leave as-is.
  - Richard -- in the example markup, remove "-il" from xml:lang value and add ! to the Hebrew phrase to make the example need the dir attribute (without the exclamation mark the default BIDI algorithm does not need any dir attribute setting to help it along).
  --ACTION-- Gershon to make the above two changes to the proposal and submit the proposal to DITA TC for discussion and approval.

--Meeting adjourned--

Title: Proposal for dir Attribute
While most languages are written in text where characters flow from left to right, Hebrew and many Arabic languages are written from right to left. In some languages, including Hebrew and Arabic, numbers and other content are written left to right. Also, a multilingual document containing, for example, English and Hebrew, contains some text that flows left to right and other text that flows right to left. Text directionality is controlled by the following:
In most cases, authors need to use dir="rtl|ltr" to ensure punctuation surrounding an RTL phrase inside an LTR element is rendered correctly. To override the direction of strongly typed Unicode characters (most characters that apply to a language except for punctuation, spaces and digits), the author would need to use dir="lro|rlo". The use of the dir attribute and the Unicode algorithm is clearly explained in the article [REF 1]. The referenced article has several examples of the use of dir="rtl|ltr". There is no example of the use of dir="lro|rlo", though it can be inferred from the example using the bdo element (the old W3C way of overriding the entire Unicode bidirectional algorithm; the W3C now favors using the override values on the dir attribute). From the HTML 4.0 spec:
Add a new attribute called "dir", as follows: dir="ltr|rtl|lro|rlo" This attribute, when set to "ltr" or "rtl", overrides the default Unicode bidirectional algorithm on neutral characters (such as spaces and punctuation). These values are usually used to ensure punctuation is applied correctly in a phrase. This attribute, when set to "lro" or "rlo", overrides the default Unicode bidirectional algorithm on all characters. These values are usually used to force a direction on all characters contained in a phrase. This attribute is often used in conjunction with the xml:lang attribute, which specifies the script to be used for the specified language. This attribute is available on all elements within DITA, except for <dita>. Additional rules to be documented:
Example:

<p dir="ltr">
  The Hebrew word for "Hebrew" is <ph xml:lang="he" dir="rtl">!עברית</ph>, but since Hebrew letters have intrinsic right-to-left directionality, I had to type the word starting from the letter "ע", i.e. <ph xml:lang="he" dir="lro">תירבע!</ph>.
</p>

Many good examples are provided in [REF 1].

When directionality issues can be resolved by either use of the dir attribute or use of Unicode control characters (LRM, RLM), use of the dir attribute is strongly recommended.

The Unicode Bidirectional algorithm provides for various levels of bidirectionality, as follows:
For most authoring needs, the "ltr" and "rtl" values are sufficient. The override values should be used only when the desired effect cannot be achieved with these values. While the Unicode standard includes hidden markers for directionality without the need for markup, these markers should not be used. It is strongly recommended to mark up the document using the dir attribute to set directionality. Using markup instead of the Unicode markers has the following advantages:
Users should be aware that descriptive markup isn’t necessarily the end of their work. Each possible output rendition or display tool may have different requirements for managing bidirectional text. Just as different HTML browsers offer different levels of support for CSS, different output tools implement the bidirectional algorithm, and its accompanying directional controls, differently. For example, HTML displayed in Internet Explorer may have different requirements than HTML displayed in Firefox. Similarly, a control that works in one part of an HTML file, such as the body of the page, might not work in another, such as the title or the index in compiled HTML Help.

The same uncertainty can be found in almost any output. PostScript or PDF rendering tools treat bidirectional text differently. Microsoft Word and OpenOffice Writer don’t handle bidirectional RTF in the same way. Flash has little concern for directional markup of any kind, but does format strings according to the Unicode algorithm.

Because input is unpredictably dependent on eventual output, it is not sufficient to apply the “dir” attribute in such a way as to make the XML appear as it should in an editor. Additional care must be taken to make sure that markup is correctly transformed (or added to the source XML, if needed), with respect both to the target output format and the target output tool. To use the case of HTML, this could mean creating output tailored to the capabilities of the most common likely browser or creating output tailored to the least capable browser and ensuring the markup functions for the most likely and capable one. For example, bidirectional HTML that displays perfectly in Internet Explorer might not display correctly in Safari. However, if the HTML displays perfectly in Safari, chances are very good it will display correctly in Internet Explorer as well. This isn’t a certainty, however. Each case should be tested and confirmed by qualified individuals.
Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, should fully support the Unicode algorithm to correctly implement the script and directionality for each language used in the document. The recommended practice is to write all directionality markers via XML markup and not to use the Unicode Bidirectional markers. When reading XML markup that embeds the Unicode Bidirectional markers, these markers should be replaced with markup when the document is saved. Applications should ensure every highest level topic element and the root map element explicitly assign the dir attribute.
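As a rough illustration of the first step an application might take before replacing embedded markers with markup, the following Python sketch locates Unicode bidirectional control characters in a text node. The function name find_bidi_controls and the table BIDI_CONTROLS are hypothetical and not part of the proposal; the code only reports where the markers are and does not attempt the replacement itself.

```python
import re

# Unicode bidirectional control characters and their standard abbreviations.
# Assumption: only the marks, embeddings and overrides are covered here;
# the directional isolates added in later Unicode versions are left out.
BIDI_CONTROLS = {
    "\u200e": "LRM",  # left-to-right mark
    "\u200f": "RLM",  # right-to-left mark
    "\u202a": "LRE",  # left-to-right embedding
    "\u202b": "RLE",  # right-to-left embedding
    "\u202c": "PDF",  # pop directional formatting
    "\u202d": "LRO",  # left-to-right override
    "\u202e": "RLO",  # right-to-left override
}

# Character class matching any one of the control characters above.
_BIDI_RE = re.compile("[" + "".join(BIDI_CONTROLS) + "]")


def find_bidi_controls(text):
    """Return a list of (offset, abbreviation) pairs, one per control character."""
    return [(m.start(), BIDI_CONTROLS[m.group()]) for m in _BIDI_RE.finditer(text)]


# Example: an RLM hidden between two words is reported at offset 3.
hits = find_bidi_controls("abc\u200fdef")
```

An authoring tool could use a report like this to flag the affected elements, leaving the decision about which dir value to apply to the author or to a later processing step.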
Specifies the language (and optionally the locale) of the element content. The intent declared with xml:lang is considered to apply to all attributes and content of the element where it is specified, unless overridden with an instance of xml:lang on another element within that content. When no xml:lang value is supplied, the processor should assume a default value. This attribute must be set to a language identifier, as defined by IETF RFC 3066 (http://www.ietf.org/rfc/rfc3066.txt) or successor.

For a DITA document that contains a single language, the highest level element containing content should always set the xml:lang attribute to the language (and optionally the locale) that applies to the document. Since the dita element (in the ditabase DTD) does not support the xml:lang attribute, the highest level element that should set the xml:lang attribute is the topic element (or derivatives at the same level).

For a DITA document that contains more than one language, the highest level element should always set the xml:lang attribute to the primary language (and optionally the locale) that applies to the document. Wherever an alternate language occurs in the document, the element containing the text or structure in the alternate language should set the xml:lang attribute appropriately. This way of overriding the default document language applies to both block and inline elements that use the alternate language.

While the Unicode standard provides for all languages to be encoded without the need for markup, using markup is strongly recommended to make the document as portable as possible. By using markup, the document can be processed by applications that do not fully implement the Unicode standard. In addition, the marked-up document can be read and understood by humans. Finally, when updating the document, the boundaries of each language are clear, which makes it much easier for the author to update the document.

The xml:lang attribute can be specified on the map element.
The expected language inheritance behavior on the map is similar to that on the topic. That is, the primary language for the map should be set on the map element (or assumed by the application if not explicitly set), and should remain in effect for all children unless a child specifies a different value for xml:lang. In the case of a contradiction between the xml:lang value set on the map and the xml:lang value set on the topic, the setting on the topic overrides.

Technical manuals frequently contain entire topics that are in languages different from the primary source languages of most of the topics. A manual in English, for example, may contain warnings that are in multiple languages, or have multiple topics of warnings each in individual languages. A manual may also contain regulatory notices as individual topics in different languages. Therefore, a map might reference topics that are written in more than one language. In this case, each topic (or section within the topic) would use the xml:lang attribute to specify the language of the topic or section. Processors identify the language of each topic or section by the xml:lang attribute set in the topic file. However, it may be useful to specify the xml:lang attribute at the map level (on topicref elements) to help identify the language of each topic the map refers to.

Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, should fully support the Unicode algorithm to correctly implement the script for each language used in the document. The recommended practice is to identify every change in language via XML markup. When reading XML markup that embeds the Unicode script information (that is, a change in language), the embedded languages should be indicated via markup when the document is saved. Applications should ensure every highest level topic element and the root map element explicitly assign the xml:lang attribute.
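The inheritance rule described above (a value set on an element stays in effect for its children until a child sets its own) can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the function name effective_langs and the sample topic are made up for the example, and the sketch handles a single document, not the case where a map and a topic disagree.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(root, default=None):
    """Map every element under root to the xml:lang value in effect for it."""
    result = {}

    def walk(elem, inherited):
        # An element's own xml:lang wins; otherwise it inherits from its parent.
        lang = elem.get(XML_LANG, inherited)
        result[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, default)
    return result


# Hypothetical English topic with one Hebrew phrase and one French paragraph.
doc = ET.fromstring(
    '<topic id="t1" xml:lang="en"><title>Warnings</title>'
    '<body><p>Caution: <ph xml:lang="he">עברית</ph></p>'
    '<p xml:lang="fr">Attention</p></body></topic>'
)
langs = effective_langs(doc)
summary = [(e.tag, langs[e]) for e in doc.iter()]
# summary lists the elements in document order:
# topic, title, body and the first p are "en"; ph is "he"; the second p is "fr".
```

The default parameter stands in for the value a processor assumes when no xml:lang is supplied anywhere on the path to the root.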