Subject: RE: [xliff-inline] BiDi draft for discussion
Hi Yves,

Thanks for the comments and feedback. Some answers and discussion below.

> -----Original Message-----
> From: xliff-inline@lists.oasis-open.org [mailto:xliff-inline@lists.oasis-open.org] On Behalf Of Yves Savourel
> Sent: den 24 april 2012 17:29
> To: xliff-inline@lists.oasis-open.org
> Subject: RE: [xliff-inline] BiDi draft for discussion
>
> Hi Fredrik,
>
> Many thanks for the thorough work you've done with this.
>
> A few questions/notes:
>
> -- The source/target-dir attributes on <file>, <unit>, <segment>, <ignorable> sound ok. This forces the processor to keep track of inheritance. But I suppose this is fine.
>
> A thought: What if we had no attributes until <unit> and the default there would be based on the languages? So you wouldn't even have to set anything except if the base direction was different than the default one?

The direction is not really a property of the language but rather of the script used. In most cases a language uses only one script, but the script has often changed over time, and some languages are even written in several different scripts. Another consideration was that I am not aware of any canonical standard mapping languages to scripts and directions that we could reference, and it did not seem right that we should maintain one as part of XLIFF.

> -- We'll have to define some processing expectation for the result of a join of segments/ignorables: For example, if one segment is LTR and the next RTL how do we carry that in the joined content?
>
> -- No 'auto' value? This seems to have been added recently to dir. You think we don't need it?

To me, 'auto' seems to have been added to cover cases where the HTML markup does not know what text it will contain in the end, like an input box or dynamically added content. For XLIFF extraction / tagging we do not have dynamic content, and it felt more appropriate that the extractor apply the algorithm used by 'auto' and set the dir to either ltr or rtl as appropriate.
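As a rough illustration of an extractor resolving 'auto' itself, here is a minimal Python sketch of a first-strong-character heuristic. The function name `resolve_auto_dir` and the `default` parameter are hypothetical and not part of any XLIFF draft. It is also a simplification of what HTML does for dir=auto, which additionally skips the content of certain elements such as <bdi>.

```python
import unicodedata


def resolve_auto_dir(text: str, default: str = "ltr") -> str:
    """Return 'ltr' or 'rtl' from the first strong directional character.

    Hypothetical helper: a simplified first-strong heuristic, not the
    full HTML dir=auto algorithm.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    # No strong character found (digits, punctuation, empty string):
    # fall back to the caller-supplied default.
    return default


if __name__ == "__main__":
    print(resolve_auto_dir("Teacher"))     # ltr
    print(resolve_auto_dir("ما اسمك؟"))    # rtl
```

With something along these lines the extractor would write an explicit ltr or rtl value, so downstream tools never have to interpret 'auto'.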
When editing, the editor needs to understand bidi anyway and can thus manage the direction itself too. If we are translating from one direction to another, the editors / processors need to understand the basics of directionality anyway. So adding auto just seemed to put an unnecessary complication in the standard.

> Related to this: <bdo> or Unicode controls are equivalent, but what about the new <bdi> in HTML5? I don't think it has a Unicode control equivalent. I'm not sure but it looks like <bdi> would be equivalent to <bdo dir='auto'> (which you can't have because <bdo> requires dir to be set to either rtl or ltr).

The <bdi> element has quite distinctive rendering implications. It starts a fresh paragraph and applies the Unicode BiDi algorithm from scratch. The result is then embedded like an image/object in the source run. As intended in the HTML spec, it is mainly for dynamic text with unknown directionality. Adding this would mean that it is no longer possible to use just a simple Unicode, BiDi-aware text control to render a sequence of Unicode characters. To implement <bdi>-style semantics, the application would need to do higher-level rendering manipulations itself.

> So if you have an original code like this:
>
> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> ما اسمك؟</p>
>
> I assume we could represent it like this:
>
> <unit id='1' source-dir='auto'>
> <segment>
> <source><pc id='1'><pc id='2' dir='auto'>Teacher</pc>:</pc> ما اسمك؟</source>
> </segment>
> </unit>
>
> No?

I think the HTML to XLIFF extractor should apply the 'auto' algorithm and choose 'ltr' or 'rtl' for the spans. On back conversion the example would keep 'auto' in the HTML. When translating, the extractor and back converters would need to handle the 'dir' attribute of HTML and change it according to the requirements of the translation anyway.

> --- I'm still fuzzy on disp-dir. Is that just a way to specify the directionality for the original data (regardless of where they are stored)?

Yes, I added it so that we would be able to support displaying RTL markup regardless of text flow direction. Although I have not seen any yet, there will be files with an RTL-language markup vocabulary like:

<?xml version="1.1" encoding="UTF-8"?>
<جذر>Sample <كبير>test</كبير> doc</جذر>

So adding support for defining the base direction for display of the native content seemed like a useful feature.

> Thanks
> -yves

Regards,
Fredrik Estreen