Subject: Re: [office-comment] marking directionality of text inside a paragraph

On Sun, 2011-10-23 at 12:35 +0200, Thorsten Behrens wrote:
> robert_weir@us.ibm.com wrote:
> > The ODF 1.2 (this was in 1.1 as well) spec, Appendix E1 of Part 1 covers 
> > Bidi text.
> > 
> Amir E. Aharoni wrote:
> > Nothing that involves Unicode control characters can be described
> > as "easily".
> > 
> I tend to agree with Amir - albeit sufficient in expressiveness,
> that's still an area in ODF that's a bit inconsistent, and involves
> extra effort for 'simple' applications.
> Beyond that, getting lower impedance towards/from HTML is desirable
> - if it doesn't break anything.

At the risk of confusing things, focusing on directionality alone might
be a bit of a red herring. IIRC there are three script categories in
ODF: Latin, Asian and Complex. All text is in turn divided into four
categories: Latin, Asian, Complex and Weak.

Problems arise when encountering Weak characters, e.g. spaces,
punctuation and mathematical symbols. They generally get assigned to one
of the other three categories depending on context of surrounding text.
I don't think there is a way to override the script category they get
assigned to, or is there?

So, one example scenario is a document consisting of a paragraph that
contains only weak characters, something like .:?". There isn't a way
to state that these weak characters should be biased towards one script
category or another. If you open that in a version of
LibreOffice/OpenOffice.org then the final fallback is to bias towards
the locale the user is in, i.e. a Japanese user gets .:?" shown in their
CJK font, and a European user gets .:?" shown in their Western font, so
the same document isn't rendered the same for different locales.
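
To make the behaviour above concrete, here is a rough Python sketch. It
is not the actual LibreOffice/OpenOffice.org code: the code-point ranges
are a deliberately crude assumption, the names script_category and
resolve_weak are made up for illustration, and the "previous strong
character first, then the next one, then the locale" order is my guess
at the context rule rather than anything the spec states.

```python
LATIN, ASIAN, COMPLEX, WEAK = "Latin", "Asian", "Complex", "Weak"


def script_category(ch):
    """Very rough classification; real implementations use full script data."""
    if not ch.isalpha():
        return WEAK  # spaces, punctuation, symbols (digits too, in this sketch)
    cp = ord(ch)
    if 0x0590 <= cp <= 0x08FF:  # Hebrew, Arabic and neighbouring blocks
        return COMPLEX
    if (0x3040 <= cp <= 0x30FF      # Hiragana, Katakana
            or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
            or 0xAC00 <= cp <= 0xD7A3):  # Hangul syllables
        return ASIAN
    return LATIN


def resolve_weak(text, locale_category):
    """Assign every weak character to Latin, Asian or Complex.

    Assumed order: previous strong character, else the next strong
    character, else the category implied by the user's locale.
    """
    cats = [script_category(ch) for ch in text]
    resolved = []
    last_strong = None
    for i, cat in enumerate(cats):
        if cat != WEAK:
            last_strong = cat
        elif last_strong is not None:
            cat = last_strong
        else:
            cat = next((c for c in cats[i + 1:] if c != WEAK), locale_category)
        resolved.append(cat)
    return resolved


# A trailing "!" follows its neighbours, whatever the locale is.
latin_run = resolve_weak("abc!", ASIAN)
hebrew_run = resolve_weak("\u05e9\u05dc\u05d5\u05dd!", LATIN)

# A paragraph of only weak characters has no neighbours to follow, so the
# same document resolves differently for different users.
for_japanese_user = resolve_weak('.:?"', ASIAN)
for_european_user = resolve_weak('.:?"', LATIN)
```

With this sketch the all-weak paragraph comes out as four Asian
characters for the Japanese user and four Latin characters for the
European one, which is exactly the inconsistency described above.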

For example, if you select your problematic text and change e.g. the
Western font size to 50 and the CTL size to 25, what size is the
misplaced ! drawn in? Has the ! been categorized as a 25pt CTL character
or a 50pt Latin character?

We don't have a "script bias" or "idcthint" feature to force weak
characters into one script or another. I believe that OpenXML has
something of that nature.

In other words, given a weak character such as a bare ! or a space, how
do we specify that it should be rendered using the Asian font, the
Complex font or the Western font, if we want to override the script of
the neighbouring strong characters?
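
As a purely hypothetical illustration of what such an override could
mean, here is a small Python sketch. Nothing like the hint parameter
exists in ODF today as far as I know; assign_categories, hint and the
SIZES table are invented names, and the Asian size is an arbitrary
assumption. The input is a list of per-character categories, so no
real script detection is involved.

```python
WEAK = "Weak"


def assign_categories(categories, hint=None, fallback="Latin"):
    """Resolve weak entries in a list of per-character categories.

    hint is the hypothetical override: when given, every weak character
    takes that category regardless of its strong neighbours.
    """
    result = []
    last_strong = None
    for i, cat in enumerate(categories):
        if cat != WEAK:
            last_strong = cat
        elif hint is not None:
            cat = hint                      # explicit override wins
        elif last_strong is not None:
            cat = last_strong               # follow the preceding strong text
        else:
            # no strong text before: look ahead, then fall back (e.g. locale)
            cat = next((c for c in categories[i + 1:] if c != WEAK), fallback)
        result.append(cat)
    return result


# Font sizes per category: Western 50 and CTL 25, with Asian assumed to be 50.
SIZES = {"Latin": 50, "Asian": 50, "Complex": 25}

# Two CTL letters followed by a bare "!".
cats = ["Complex", "Complex", "Weak"]

default_sizes = [SIZES[c] for c in assign_categories(cats)]
hinted_sizes = [SIZES[c] for c in assign_categories(cats, hint="Latin")]
```

Without the hint the ! is drawn at the CTL size of 25; with the hint set
to Latin it is drawn at 50, which is the kind of control I am asking
about.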


