dita-translation message

Subject: Input on xml:lang and dir attribute for best practice recommendation

Hi all,

Here are the main points from a thread I've been having with Ray Lam, the
BIDI expert at Blast Radius. Let's discuss further at today's SC meeting.
I've arranged the thread so it runs from top to bottom for easier reading.

Best Regards,

-----From: Paul Prescod-----
I'd be glad to offer XMetaL's bidi expert for consultation on your
bidirectionality discussions if this is of value to you. We went to
considerable effort to research the best practices and state of the art
before implementing our bidirectionality support.

-----From: JoAnn Hackos-----
Could you put him in touch with Gershon so that he sees the debate going on
about the dir attribute? I've attached the last of the email threads.

-----From: Paul Prescod-----
... Ray knows what requirements we came up with based on our conversations
with customers, partners and experts and may have some thoughts to
contribute on the discussions.

-----From: Ray Lam-----
Gershon and I did discuss the use of the xml:lang attribute for inferring
directionality, and it seems to me that xml:lang could be useful for basic
support of the bidi algorithm, but it may break down if it is used to
handle the more complex cases of bidirectional text.
For instance, there might be cases where the desired default embedding level
is not the one implied by the language. In a primarily Arabic
document there might be a single paragraph that is predominantly English
(starts and ends in English, but with some Arabic content nested within),
where the author of the article wants the default embedding level to be
right-to-left so that it flows consistently with the rest of the article.
An xml:lang attribute of "English" on that paragraph would then contradict
the wishes of the author, whereas a dir attribute of rtl would meet the
author's needs.
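
To make the paragraph-level case concrete, here is a minimal Python sketch. It is an illustration only, not DITA-specified behaviour: the element names, the small RTL_LANGS table and the precedence rule (an explicit dir wins, otherwise direction is inferred from xml:lang, otherwise it is inherited) are assumptions made for the example, and the language values are written as language tags ("en", "ar") rather than language names.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: a tiny, non-exhaustive table of right-to-left languages.
RTL_LANGS = {"ar", "he", "fa", "ur"}

# Hypothetical fragment: an Arabic document with one English paragraph
# that the author wants laid out right-to-left anyway.
SAMPLE = """<body xml:lang="ar" dir="rtl">
  <p>Arabic paragraph</p>
  <p xml:lang="en" dir="rtl">English text with nested Arabic</p>
  <p xml:lang="en">Plain English paragraph</p>
</body>"""


def resolve_dirs(elem, inherited="ltr"):
    """Return (tag, xml:lang, effective direction) for elem and descendants."""
    explicit = elem.get("dir")
    lang = elem.get(XML_LANG)
    if explicit in ("ltr", "rtl"):
        direction = explicit              # explicit dir always wins
    elif lang is not None:
        primary = lang.split("-")[0].lower()
        direction = "rtl" if primary in RTL_LANGS else "ltr"
    else:
        direction = inherited             # nothing stated: inherit from parent
    result = [(elem.tag, lang, direction)]
    for child in elem:
        result.extend(resolve_dirs(child, direction))
    return result


resolved = resolve_dirs(ET.fromstring(SAMPLE))
for tag, lang, direction in resolved:
    print(tag, lang, direction)
```

Under these assumptions the second paragraph stays right-to-left even though it is tagged as English, which inference from xml:lang alone could not express.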
Also, the use of the xml:lang by itself may not be sufficient to determine
directionality. Numeric sequences for instance are still rendered
left-to-right in languages like Hebrew and Arabic, and an xml:lang setting
of "Hebrew" would still impose a burden on the processor to understand that
numeric runs would be rendered in a direction counter to the natural
direction of the language. Tagging these numeric sequences with an xml:lang
of "English" or another left-to-right language just to get left-to-right
rendering would seem counter-intuitive to me.
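
The point about numeric runs can be checked against the Unicode character database. The short Python sketch below uses the standard unicodedata module to list the bidi class of each character in a Hebrew phrase followed by a number. The sample string is an assumption chosen for illustration.

```python
import unicodedata


def bidi_classes(text):
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical sample: three Hebrew letters, a space, then a number.
hebrew_with_number = "\u05e9\u05e0\u05ea 2006"

classes = bidi_classes(hebrew_with_number)
for ch, cls in classes:
    print(f"U+{ord(ch):04X} {cls}")
```

The Hebrew letters report class R (right-to-left) while the digits report EN (European number), so the distinction is carried by the characters themselves and not by any language tag on the surrounding element.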
One other thing: xml:lang has no notion of directional overrides. How
useful this feature actually is may be arguable, but it is nonetheless a
feature of the bidi algorithm that would be difficult to reproduce using
just the xml:lang attribute.
So, in my opinion, the use of the xml:lang attribute will probably be useful
for translation purposes and possibly for basic support of directionality,
but I believe that using the dir attribute will be a cleaner, more powerful
and more explicit means for specifying directionality.

-----From: Gershon L Joseph-----
The DITA SC is not proposing use of xml:lang exclusively to determine
directionality. xml:lang and dir are a set of attributes required for
multilingual XML documents that work together -- using one and not the other
does not work well in practice (and using one or the other was never the
intention of their design). While XMetaL today does not use xml:lang, I
think we're all in agreement that it should support the attribute as defined
in the specs. It also makes the XML document accessible to non-XMetaL
An issue I expect to thrash out at today's SC meeting is the use of markup
in preference to Unicode's invisible markers. The W3C prefers using
markup, which makes the document understandable to all XML processors
regardless of the processor's level of Unicode support, as well as
accessible to humans. Using the Unicode markers makes the
document suitable for computers only, and limited to those processors that
fully and correctly implement the Unicode bidi algorithm (which I think
boils down to only XMetaL and perhaps Antenna House at this time?). The
result of using the Unicode language markers or the xml:lang attribute to
determine the active language (script) and directionality is the same;
however the former is not portable and prone to error (due to lack of full
support in processors) while the latter is close to bullet-proof. I expect
the standards committees will prefer the latter.
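
As a rough illustration of why invisible markers are hard for humans to work with, the Python sketch below scans a string for Unicode bidi formatting characters and reports where they occur. It assumes that the "invisible markers" discussed above include the embedding, override and mark characters listed in BIDI_CONTROLS. The function name and the sample string are hypothetical.

```python
import unicodedata

# Assumption: the invisible markers in question include these
# Unicode bidi formatting characters.
BIDI_CONTROLS = {
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def find_bidi_controls(text):
    """Return (index, code point, character name) for each bidi control."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


# Hypothetical sample: a run wrapped in an invisible RTL embedding.
sample = "abc \u202bXYZ\u202c def"
found = find_bidi_controls(sample)
for index, code_point, name in found:
    print(index, code_point, name)
```

Nothing in the printed or displayed form of the sample string reveals the two control characters, whereas an equivalent dir attribute on an element would be visible to anyone reading the source.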