Subject: Proposal for a simplified *-asian and *-complex attribute processing
Hi,

as discussed in the last con call, the OpenOffice.org specification contains -asian and -complex versions of some attributes. An example is fo:font-family, for which style:font-family-asian and style:font-family-complex additionally exist. These attributes behave as follows:

- the fo:font-family attribute is applied to all Latin characters
- the style:font-family-asian attribute is applied to CJK characters
- the style:font-family-complex attribute is applied to CTL characters

On the one hand, this means that an application that reads files has to know which class or script type (Latin, CJK or CTL) a character belongs to in order to apply the correct attribute to it. On the other hand, it means that an application that saves a file but supports only a single font family setting has to create all three attributes.

This applies not only to fo:font-family, but to:

- style:font-name, fo:font-family, style:font-family-generic, style:font-style-name, style:font-pitch, style:font-charset
- fo:font-size, style:font-size-rel
- fo:language, fo:country
- fo:font-style, fo:font-weight

In the following, these attributes are called script-dependent, and an application that supports these script-dependent attributes is called an application that supports script types.

The reason for having script-dependent attributes is that it is in fact common to use different fonts and font sizes for the different script types, and that creating documents that use multiple script types becomes much easier if fonts and font sizes are selected automatically based on the script type of the character that has been typed in.

An issue, however, is that the Unicode character set also contains "weak" characters that do not specify a script type. For these characters, applications have to guess a type from the surrounding characters, the locale in use, or the user interface's input method.
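To make the weak-character problem concrete, here is a minimal Python sketch of how an application might assign a script type to each character. Everything in it is an assumption for illustration: the function names, the very small code point ranges, the choice to treat all other letters as Latin, and the rule that a weak character inherits the type of the preceding strong character. It is not the algorithm of any particular application.

```python
import unicodedata

# Hypothetical, heavily simplified code point ranges (assumption);
# a real application would need a far more complete table.
ASIAN_RANGES = [
    (0x3040, 0x30FF),  # Hiragana, Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
]
COMPLEX_RANGES = [
    (0x0590, 0x05FF),  # Hebrew
    (0x0600, 0x06FF),  # Arabic
    (0x0900, 0x097F),  # Devanagari
    (0x0E00, 0x0E7F),  # Thai
]


def strong_script_type(ch):
    """Return 'latin', 'asian' or 'complex', or None for a weak character."""
    cp = ord(ch)
    if any(lo <= cp <= hi for lo, hi in ASIAN_RANGES):
        return "asian"
    if any(lo <= cp <= hi for lo, hi in COMPLEX_RANGES):
        return "complex"
    # Assumption: every other letter counts as latin; spaces, digits and
    # punctuation outside the ranges above are treated as weak.
    if unicodedata.category(ch).startswith("L"):
        return "latin"
    return None


def assign_script_types(text, default="latin"):
    """Assign a script type to every character of text."""
    types = [strong_script_type(ch) for ch in text]
    # Forward pass: a weak character inherits the preceding resolved type.
    previous = None
    for i, t in enumerate(types):
        if t is None:
            types[i] = previous
        else:
            previous = t
    # Backward pass: leading weak characters inherit the following type,
    # or the default if the text contains no strong character at all.
    following = default
    for i in range(len(types) - 1, -1, -1):
        if types[i] is None:
            types[i] = following
        else:
            following = types[i]
    return types


print(assign_script_types("ab \u4e2d\u6587!"))
```

In this sketch the space between the Latin and the CJK run is treated as Latin only because the preceding character wins. An application that instead preferred the following character, or consulted the locale or input method, would assign different types to the same weak characters.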
This means that applications may in fact choose different script types for weak characters, which in turn means that documents may look different even in applications that both support script types.

To simplify transformations from and to CSS/XSL-FO and other formats that don't have script-dependent attributes, and also to solve the issue that applications may choose different algorithms to assign script types to weak Unicode characters, I would like to propose adding a

    style:script-type=(latin|asian|complex|ignore)

formatting property. This property can be used like any other formatting property in styles and specifies which script-dependent attributes should be applied to some text. The attribute has to be evaluated by applications that do not support script types. Applications that support script types may (or should) also evaluate the attribute and override the script type they would determine for a certain character, but they don't have to.

The attribute value "ignore" can be used only within a <style:default-style>. If it is set, all script-dependent attributes are applied to all script types. This would mean, for example, that a fo:font-family would be applied to all script types, as would a style:font-family-asian or style:font-family-complex. This simplifies saving documents with applications that do not support script types, because these applications would otherwise have to export all three script-dependent attributes for a single property.

Example with script-type support:

    <office:document text:st="asian">
      ...
      <style:style style:name="Text Body">
        <style:properties fo:font-family="Times"
                          style:font-family-asian="Tahoma"
                          style:script-type="asian"/>
      </style:style>
      <style:style style:name="T1">
        <style:properties style:script-type="latin"/>
      </style:style>
      ...
      <text:p text:style-name="Text Body">
        [asian characters]
        <text:span text:style-name="T1">[latin characters]</text:span>
        [asian characters]
      </text:p>
      <text:p text:style-name="Text Body">
        [asian characters]
      </text:p>
      <text:p text:style-name="Text Body">
        [asian characters]
        <text:span text:style-name="T1">[latin characters]</text:span>
        [asian characters]
      </text:p>

The same example without script-type support:

    <office:document>
      ...
      <style:default-style>
        <style:properties style:script-type="ignore"/>
      </style:default-style>
      <style:style style:name="Text Body">
        <style:properties fo:font-family="Tahoma"/>
      </style:style>
      <style:style style:name="T1">
        <style:properties fo:font-family="Times"/>
      </style:style>
      ...
      <text:p text:style-name="Text Body">
        [asian characters]
        <text:span text:style-name="T1">[latin characters]</text:span>
        [asian characters]
      </text:p>
      <text:p text:style-name="Text Body">
        [asian characters]
      </text:p>
      <text:p text:style-name="Text Body">
        [asian characters]
        <text:span text:style-name="T1">[latin characters]</text:span>
        [asian characters]
      </text:p>

An alternative would be to add formatting properties that specify the current values regardless of the script type, for instance by renaming the Latin attributes to *-latin and by using the attributes without a suffix for the current value. This solves the transformation issue and might in fact make transformations to formats that don't have script-dependent attributes even easier, but unfortunately it would not solve the problem of weak Unicode characters.

Best regards

Michael