
xliff-inline message


Subject: RE: [xliff-inline] BiDi draft for discussion

Hi Yves,

Thanks for the comments and feedback. Some answers and discussion below.

> -----Original Message-----
> From: xliff-inline@lists.oasis-open.org [mailto:xliff-inline@lists.oasis-
> open.org] On Behalf Of Yves Savourel
> Sent: den 24 april 2012 17:29
> To: xliff-inline@lists.oasis-open.org
> Subject: RE: [xliff-inline] BiDi draft for discussion
> Hi Fredrik,
> Many thanks for the thorough work you've done with this.
> A few questions/notes:
> -- The source/target-dir attributes on <file>, <unit>, <segment>, <ignorable>
> sound ok. This forces the processor to keep track on inheritance. But I
> suppose this is fine.
> A thought: What if we had no attributes until <unit> and the default there
> would be based on the languages? So you wouldn't even have to set
> anything except if the base direction was different from the default ones?
The direction is not really a property of the language but of the script used.
In most cases a language uses only one script, but that script has often changed
over time, and some languages are written in several different scripts. I also
considered that I'm not aware of any canonical standard mapping languages to
scripts and directions that we could reference, and it did not seem right that
we should maintain one as part of XLIFF.
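
To illustrate why the language code alone underdetermines direction, here is a
minimal sketch. It assumes the language is given as a BCP 47 style tag that may
carry a four-letter script subtag. The table and the function name
direction_from_tag are hypothetical and deliberately tiny; they are not a
proposed mapping for XLIFF.

```python
# Hypothetical, deliberately tiny table: illustration only, not a canonical mapping.
SCRIPT_DIRECTION = {"Latn": "ltr", "Cyrl": "ltr", "Arab": "rtl", "Hebr": "rtl"}


def direction_from_tag(tag):
    """Return 'ltr' or 'rtl' if the tag carries a known script subtag, else None."""
    # Skip the primary language subtag; a script subtag is four letters.
    for subtag in tag.split("-")[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            return SCRIPT_DIRECTION.get(subtag.title())
    return None  # the language alone tells us nothing about direction


print(direction_from_tag("az-Latn"))
print(direction_from_tag("az-Arab"))
print(direction_from_tag("az"))
```

The same language (Azerbaijani here) resolves to different directions depending
on the script, and a bare language code resolves to nothing at all.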

> -- We'll have to define some processing expectation for the result of a join of
> segments/ignorables: For example, if one segment is LTR and the next RTL
> how do we carry that in the joined content?
> -- No 'auto' value? This seems to have been added recently to dir. You think
> we don't need it?
To me 'auto' seems to have been added to cover cases where the HTML markup does
not know what text it will contain in the end, like an input box or dynamically
added content. For XLIFF extraction / tagging we do not have dynamic content,
and it felt more appropriate that the extractor applies the algorithm used by 'auto'
and sets the dir to either ltr or rtl as appropriate. When editing, the editor needs to
understand bidi anyway and can thus manage the direction itself too. If we
are translating from one direction to another, the editors / processors need to
understand the basics of directionality anyway. So adding auto just seemed to put
an unnecessary complication in the standard.
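
A minimal sketch of what "the extractor applies the algorithm used by 'auto'"
could look like, assuming the first-strong-character heuristic: scan the text
and take the direction of the first character with a strong bidi class. The
function name resolve_auto_dir and the 'ltr' fallback are assumptions made for
this example, not part of any draft.

```python
import unicodedata


def resolve_auto_dir(text, default="ltr"):
    """Resolve 'auto' to 'ltr' or 'rtl' from the first strong character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic, ...)
            return "rtl"
    # No strong character found (digits, punctuation, whitespace only).
    return default


print(resolve_auto_dir("Teacher"))
print(resolve_auto_dir("ما اسمك؟"))
```

With something like this in the extractor, the XLIFF only ever carries 'ltr' or
'rtl', and downstream tools never need to evaluate 'auto' themselves.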
> Related to this: <bdo> or Unicode controls are equivalent, but what about
> the new <bdi> in HTML5? I don't think it has a Unicode control equivalent. I'm
> not sure but it looks like <bdi> would be equivalent to <bdo dir='auto'>
> (which you can't have because <bdo> requires dir to be set to either rtl or
> ltr).
The <bdi> element has quite distinctive rendering implications. It starts a fresh
paragraph and applies the Unicode BiDi algorithm from scratch. Then the result
is embedded like an image/object in the source run. As intended in the HTML
spec, it is mainly for dynamic text of unknown directionality. Adding this would
mean that it is no longer possible to use just a simple, BiDi-aware Unicode text
control to render a sequence of Unicode characters. To implement <bdi>-style
semantics the application would need to do higher-level rendering manipulations.
> So if you have an original code like this:
> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> ما اسمك؟</p>
> I assume we could represent it like this:
> <unit id='1' source-dir='auto'>
>  <segment>
>   <source><pc id='1'><pc id='2' dir='auto'>Teacher</pc>:</pc> ما
> اسمك؟</source>
>  </segment>
> </unit>
> No?
I think the HTML to XLIFF extractor should apply the 'auto' algorithm
and choose 'ltr' or 'rtl' for the spans. On back conversion the example
would keep 'auto' in the HTML. When translating, the extractor and
back converters would need to handle the 'dir' attribute of HTML and
change it according to the requirements of the translation anyway.
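
As a rough sketch of how an extractor might resolve the example above, assume
the paragraph is modelled as a list of (text, isolated) runs, where isolated
marks <bdi> content. As I understand the HTML rules for dir=auto, text inside a
<bdi> is skipped when resolving the enclosing element, and the <bdi> is resolved
on its own. The names first_strong and resolve_dirs and the run model are
assumptions for illustration only.

```python
import unicodedata


def first_strong(text):
    """Return 'ltr'/'rtl' for the first strong character, or None if there is none."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


def resolve_dirs(runs, default="ltr"):
    """runs: list of (text, isolated) pairs in document order.

    Returns (direction of the enclosing paragraph,
             list of directions for each isolated run).
    """
    # Isolated (<bdi>) runs do not contribute to the enclosing direction.
    outer_text = "".join(text for text, isolated in runs if not isolated)
    outer = first_strong(outer_text) or default
    inner = [first_strong(text) or default for text, isolated in runs if isolated]
    return outer, inner


# <p dir=auto><b><bdi>Teacher</bdi>:</b> ما اسمك؟</p>
runs = [("Teacher", True), (": ", False), ("ما اسمك؟", False)]
print(resolve_dirs(runs))
```

Under these assumptions the paragraph resolves to 'rtl' and the <bdi> span to
'ltr', which are the values the extractor would write instead of 'auto'.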

> --- I'm still fuzzy on disp-dir. Is that just a way to specify the directionality for
> the original data (regardless of where they are stored)?
Yes, I added it so that we would be able to support displaying RTL markup
regardless of text flow direction. Although I have not seen any yet, there will
be files whose markup vocabulary is in an RTL language, like:

<?xml version="1.1" encoding="UTF-8"?>
<جذر>Sample <كبير>test</كبير> doc</جذر>

So adding support to define the base direction for display of the native content
seemed like a useful feature.
> Thanks
> -yves

Fredrik Estreen
