OASIS Mailing List Archives

xliff message



Subject: Size and Length restrictions module


Hi All,

 

First, my apologies for submitting this so close to the call. Here is my current draft of a specification for this module. It is rather lengthy, and I’d be happy to provide it in PDF or DOCX format if that is preferred for review. There are some open issues, such as which namespaces to use, which processing requirements will go in core, and which profiles we should make mandatory. There are probably many more points that we need to discuss or modify beyond those.

 

Best regards,

Fredrik Estreen

 

XLIFF 2.0 size and length checking module

 

Summary
This module is designed to facilitate the implementation of length and size restrictions in XLIFF documents. Since the field is large and there are many use cases that cannot be covered by a fully interoperable open standard, this module tries to define attributes, elements and processing requirements that can fit a multitude of unknown use cases. It also defines a set of rules to cover the most common cases. Also included are non-normative suggestions for how other schemes could be implemented within the framework. To simplify implementation in tools while still relying on third-party extensibility, the processing requirements place additional restrictions on these extensions compared to the general extensibility found in XLIFF 2.0.

 

Design
The restriction framework supports two distinct types of restrictions: storage size restrictions and general size restrictions. The reason is that it is common to have separate restrictions for storage and for the display or physical representation of data. Since it would be impossible to define all restrictions here, the concept of a restriction profile is introduced. The profiles for storage size and general size are independent. The information related to restriction profiles is stored in the processing-invariant parts of the XLIFF file, such as the <file>, <group> and <unit> elements, and is contained within elements defined in this module. The information regarding the specific restrictions is stored on the processing-invariant parts and on the inline elements, either as attributes or as attributes referencing data in the elements defined in this module. To avoid issues with segmentation, no information regarding size restrictions is present on <segment>, <source> or <target> elements. The module defines a namespace for all the elements and attributes it introduces; in the rest of the specification this is denoted by the prefix ”slr” (short for Size and Length Restrictions). For clarity, the prefix ”xliff” is used for core XLIFF elements and attributes.

Profile names use the same namespace-like naming conventions as parts of the XLIFF Core specification. The name should be composed of two components separated by a colon: <authority>:<name>. The authority ”xliff” is reserved for profiles defined by the OASIS XLIFF Technical Committee.
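
The naming convention can be illustrated with a minimal Python sketch. The function names are hypothetical, and both rejecting malformed names and splitting on the first colon only are assumptions, since the draft merely says the name "should" have two components.

```python
from typing import Tuple


def parse_profile_name(name: str) -> Tuple[str, str]:
    """Split a profile name of the form <authority>:<name>.

    Assumption: the split happens at the first colon, so any further
    colons end up in the <name> component.
    """
    authority, separator, local_name = name.partition(":")
    if not separator or not authority or not local_name:
        raise ValueError(f"expected <authority>:<name>, got {name!r}")
    return authority, local_name


def is_tc_profile(name: str) -> bool:
    """True if the profile uses the authority reserved for the XLIFF TC."""
    return parse_profile_name(name)[0] == "xliff"


print(parse_profile_name("xliff:codepoints"))
print(is_tc_profile("acme:pixels"))  # "acme" is a made-up third-party authority
```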

 

Element <slr:profiles>
Occurs: 0 or 1 times in <xliff:file>
Attributes: 1 slr:storage-profile (default empty) and 1 slr:general-profile (default empty)
Children: 0 or 1 <slr:normalization> elements, 0 or more third-party extension elements

This element selects the restriction profiles to use in the document. If no storage or general profile is specified, the default (empty) values of those attributes disable restriction checking in the file. Any overall configuration or settings related to the selected profile MUST be placed in child elements of this element.
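
As a rough illustration, a tool could read the selected profiles as sketched below. The namespace URI is a placeholder (the namespace is still an open issue), the core XLIFF namespace is omitted from the sample for brevity, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the draft has not fixed the real one.
SLR_NS = "urn:example:slr"


def selected_profiles(file_element):
    """Return (storage_profile, general_profile) for a <file> element.

    An empty string means that kind of restriction checking is disabled.
    """
    profiles = file_element.find(f"{{{SLR_NS}}}profiles")
    if profiles is None:
        return "", ""
    storage = profiles.get(f"{{{SLR_NS}}}storage-profile", "")
    general = profiles.get(f"{{{SLR_NS}}}general-profile", "")
    return storage, general


SAMPLE = """
<file xmlns:slr="urn:example:slr" id="f1">
  <slr:profiles slr:general-profile="xliff:codepoints"/>
</file>
"""

storage, general = selected_profiles(ET.fromstring(SAMPLE))
print("storage checking enabled:", bool(storage))
print("general profile:", general)
```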

 

Element <slr:data>
Occurs: 0 or more times in <xliff:file>, <xliff:group> and <xliff:unit>
Attributes: 1 slr:profile (mandatory, non-empty), 0 or more third-party attributes
Children: 0 or more third-party extension elements

This element acts as a container for data needed by the specified profile to check the part of the XLIFF document that is a sibling, or a descendant of a sibling, of this element. It is not used by the standard profiles. Third-party profiles MUST place all such data in this element instead of using other extension points if the data serves no other purpose in the processing of the document.

 

Attribute slr:storage-profile
Occurs: <slr:profiles>
Value: name of storage profile to apply as storage size restriction. Default empty meaning no restrictions should be applied.

 

Attribute slr:general-profile
Occurs: <slr:profiles>
Value: name of the general profile to apply as general size restriction. Default empty, meaning no restrictions should be applied.

 

Attribute slr:profile
Occurs: <slr:data>
Value: name of the restriction profile to which the data in the element and its children apply.

 

Attribute slr:storage-restriction
Occurs: optional on <xliff:file>, <xliff:group>, <xliff:unit>, <xliff:mrk> and <xliff:sm>
Value: Interpretation of the value is dependent on the selected storage-profile. It MUST represent the restriction to apply to the indicated sub part of the document.

This attribute specifies the restriction to apply to the collection of descendants of the element it is defined on.

 

Attribute slr:size-restriction
Occurs: optional on <xliff:file>, <xliff:group>, <xliff:unit>, <xliff:mrk> and <xliff:sm>
Value: Interpretation of the value is dependent on the selected general-profile. It MUST represent the restriction to apply to the indicated sub part of the document.

This attribute specifies the restriction to apply to the collection of descendants of the element it is defined on.

 

Attribute slr:equiv-storage
Occurs: optional on <xliff:cp>, <xliff:ph>, <xliff:pc>, <xliff:sc> and <xliff:ec>
Value: Interpretation of the value is dependent on the selected storage-profile. It MUST represent the equivalent storage size represented by the inline element.

 

Attribute slr:size-info
Occurs: optional on <xliff:file>, <xliff:group>, <xliff:unit>, <xliff:cp>, <xliff:ph>, <xliff:pc>, <xliff:sc> and <xliff:ec>
Value: Interpretation of the value is dependent on selected general-profile. It MUST represent information related to how the element it is attached to contributes to the size of the text or entity in which it occurs or represents. It can be used on both inline elements and structural elements to provide information on things like GUI dialog or control sizes, expected padding or margins to consider for size and so on.
Restriction: At most one of this attribute and slr:size-info-ref can occur. Both cannot be specified at the same time.

 

Attribute slr:size-info-ref
Occurs: optional on <xliff:file>, <xliff:group>, <xliff:unit>, <xliff:cp>, <xliff:ph>, <xliff:pc>, <xliff:sc> and <xliff:ec>
Value: A reference to data that provide the same information that could be put in a slr:size-info attribute. The reference MUST point to an element in an <slr:data> element that is a sibling to the element this attribute is attached to or a sibling to one of its ancestors.
Restriction: At most one of this attribute and slr:size-info can occur. Both cannot be specified at the same time.
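
The mutual exclusion between slr:size-info and slr:size-info-ref is straightforward to check mechanically. The sketch below assumes a placeholder namespace URI and a hypothetical function name, and it omits the core XLIFF namespace from the sample for brevity. How a reference value identifies its target inside <slr:data> is not defined in this draft, so the value "d1" is purely illustrative.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the draft has not fixed the real one.
SLR_NS = "urn:example:slr"
SIZE_INFO = f"{{{SLR_NS}}}size-info"
SIZE_INFO_REF = f"{{{SLR_NS}}}size-info-ref"


def conflicting_size_info(root):
    """Return every element that carries both size-info and size-info-ref."""
    return [
        element
        for element in root.iter()
        if SIZE_INFO in element.attrib and SIZE_INFO_REF in element.attrib
    ]


SAMPLE = """
<unit xmlns:slr="urn:example:slr" id="u1" slr:size-info="2">
  <ph id="p1" slr:size-info="1" slr:size-info-ref="d1"/>
  <ph id="p2" slr:size-info-ref="d1"/>
</unit>
"""

for element in conflicting_size_info(ET.fromstring(SAMPLE)):
    print("both attributes present on element with id", element.get("id"))
```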

 

Processing requirements
This module relies on the processing requirements in the Core XLIFF specification that tools MUST preserve module information. In addition to those requirements, this module allows some modification and removal of information. The module also relies on the processing requirements for inline elements when populating target elements from source and when creating additional elements from existing elements for use in target.

·         The tool adding the size restriction information to the document is free to also remove or modify it at a later time. This also applies to all other tools that are part of the initial tool suite and can be expected to have deep knowledge of the native format or the initial tool's requirements.

·         If and only if the file is not going to be returned to the initial tool or back-converted to the native format, any tool MAY remove ALL elements (and their descendants) and attributes defined by this module. If ANY element or attribute is removed, ALL elements and attributes MUST be removed. If there is doubt about the further use of the file, the information MUST be preserved as stated by the Core processing requirements.

The reason for the second rule is that it might be useful to trim possibly large size restriction information before the file is passed to processes that have no use for it. Such processes include, without limitation, linguistic quality analysis and training of machine translation engines.
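
A minimal sketch of the all-or-nothing removal described by the second rule follows. The namespace URI is a placeholder, the function name is hypothetical, and the sample omits the core XLIFF namespace for brevity. Deciding whether removal is permitted at all (the file will not return to the initial tool) is left to the caller.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the draft has not fixed the real one.
SLR_NS = "urn:example:slr"


def strip_slr(root):
    """Remove every element and attribute in the module namespace."""
    prefix = "{" + SLR_NS + "}"
    # Snapshot the tree first so removals do not disturb the traversal.
    for element in list(root.iter()):
        for name in [n for n in element.attrib if n.startswith(prefix)]:
            del element.attrib[name]
        module_children = [
            child
            for child in element
            if isinstance(child.tag, str) and child.tag.startswith(prefix)
        ]
        for child in module_children:
            element.remove(child)  # also drops the child's descendants
    return root


SAMPLE = """
<file xmlns:slr="urn:example:slr" id="f1" slr:size-restriction="100">
  <slr:profiles slr:general-profile="xliff:codepoints">
    <slr:normalization slr:general="nfc"/>
  </slr:profiles>
  <unit id="u1" slr:size-restriction="10">
    <segment><source>Hello <ph id="p1" slr:size-info="3"/></source></segment>
  </unit>
</file>
"""

stripped = strip_slr(ET.fromstring(SAMPLE))
print(ET.tostring(stripped, encoding="unicode"))
```

Removing everything in one pass keeps the file from ending up with a partial set of restrictions that no longer reflects what the initial tool wrote.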

Standard profiles
This module defines a set of standard profiles for restrictions on text. Since all text in XLIFF documents is in Unicode, the standard profiles only deal with Unicode text, which reduces the complexity of implementing the mandatory aspects of this module. Unicode defines a number of normalization forms for text, and to support that the standard profiles make use of an element with attributes to indicate what normalization, if any, is expected.

 

Normalization
Any normalization is done in accordance with Unicode Standard Annex #15 (http://unicode.org/reports/tr15/). The standard profiles only support three modes: no normalization, Normalization Form C and Normalization Form D. Normalization should be applied to each maximal consecutive run of PCDATA and/or CDATA. Normalization should not be performed across elements, including <xliff:cp>.

 

Element <slr:normalization>
Occurs: 0 or 1 times in <slr:profiles>
Attributes: slr:storage (default ”none”) and slr:general (default ”none”)
Children: No children

This element is used to hold the attributes specifying normalization form to apply to storage and size restrictions defined in the standard profiles. Other profiles may use this element in its specified form but may not add new extensions to it. If this element is not present no normalization should be performed.

 

Attributes slr:storage and slr:general
Occurs on: <slr:normalization>
Values: ”none”, ”nfc” or ”nfd”. Default value ”none”

”none” means that no normalization should be performed.
”nfc” means that the text should be normalized to Normalization Form C before applying restrictions.
”nfd” means that the text should be normalized to Normalization Form D before applying restrictions.

slr:storage specifies the normalization to apply for storage size restrictions.
slr:general specifies the normalization to apply for general size restrictions.
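
The three modes map directly onto Python's standard unicodedata module. The sketch below (function name hypothetical) normalizes one text run at a time, which is one way to respect the rule that normalization is not performed across elements.

```python
import unicodedata

# Attribute value -> form name understood by unicodedata.normalize.
FORMS = {"none": None, "nfc": "NFC", "nfd": "NFD"}


def normalize_run(text: str, mode: str = "none") -> str:
    """Normalize a single consecutive text run according to the mode."""
    form = FORMS[mode]  # raises KeyError for unsupported modes
    return text if form is None else unicodedata.normalize(form, text)


decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(len(normalize_run(decomposed, "none")))
print(len(normalize_run(decomposed, "nfc")))

# Two runs separated by an inline element are normalized independently,
# so the accent in the second run never composes with the "e" in the first.
runs = ["e", "\u0301"]
print([normalize_run(run, "nfc") for run in runs] == runs)
```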

 

General restriction profile ”xliff:codepoints”
This profile implements a simple string length restriction based on the number of Unicode code points. It is possible to specify if normalization should be applied using the <slr:normalization> element and the slr:general attribute.

This profile makes use of the following attributes from this module:

 

Attribute slr:size-restriction
The value of this attribute holds the ”maximum” or ”minimum and maximum” size of the string. Either size must be an integer. The maximum size can also be ’*’ to denote that there is no maximum restriction. If only a maximum is specified, it is implied that the minimum is 0 (empty string). The format of the value is the optional minimum size and a comma followed by a maximum size (”[minsize,]maxsize”). The default value is ’*’, which evaluates to a string with unbounded size.
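
The value format ”[minsize,]maxsize” can be parsed as in the following sketch. The function names are hypothetical, and rejecting negative sizes or a maximum below the minimum is an assumption that the draft does not state.

```python
from typing import Optional, Tuple


def parse_restriction(value: str = "*") -> Tuple[int, Optional[int]]:
    """Parse "[minsize,]maxsize" into (minimum, maximum).

    A maximum of None means unbounded ("*").
    """
    minimum_text, separator, maximum_text = value.rpartition(",")
    minimum = int(minimum_text) if separator else 0
    maximum = None if maximum_text.strip() == "*" else int(maximum_text)
    # Assumption: sizes are non-negative and the range is not inverted.
    if minimum < 0 or (maximum is not None and maximum < minimum):
        raise ValueError(f"invalid restriction {value!r}")
    return minimum, maximum


def within(size: int, value: str = "*") -> bool:
    """True if size satisfies the restriction expressed by value."""
    minimum, maximum = parse_restriction(value)
    return size >= minimum and (maximum is None or size <= maximum)


print(parse_restriction("5,10"))
print(parse_restriction("10"))
print(within(12, "5,10"))
```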

 

Attribute slr:size-info
The value of this attribute is an integer representing how many code points the element it is set on should be considered to contribute to the total size. If empty the default for all elements except <xliff:cp> is 0. For <xliff:cp> the default value is 1.
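
A possible way to compute the size under ”xliff:codepoints” is sketched below. It assumes a placeholder namespace URI, omits the core XLIFF namespace from the sample, and assumes that text nested inside inline elements counts toward the total; the draft does not spell out that last point. The function name is hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the draft has not fixed the real one.
SLR_NS = "urn:example:slr"
SIZE_INFO = f"{{{SLR_NS}}}size-info"
FORMS = {"none": None, "nfc": "NFC", "nfd": "NFD"}


def codepoint_size(element, mode="none"):
    """Count code points in the content, adding size-info for inline elements."""
    form = FORMS[mode]

    def run_length(text):
        if not text:
            return 0
        if form is not None:
            text = unicodedata.normalize(form, text)
        return len(text)  # len() of a Python str counts code points

    total = run_length(element.text)
    for child in element:
        default = 1 if child.tag == "cp" else 0
        total += int(child.get(SIZE_INFO) or default)
        total += codepoint_size(child, mode)  # assumption: nested text counts
        total += run_length(child.tail)
    return total


SAMPLE = (
    '<source xmlns:slr="urn:example:slr">Save '
    '<ph id="1" slr:size-info="8"/> as<cp hex="0001"/></source>'
)
# 5 ("Save ") + 8 (ph) + 3 (" as") + 1 (cp default)
print(codepoint_size(ET.fromstring(SAMPLE)))  # → 17
```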

 

Storage restriction profiles ”xliff:utf8”, ”xliff:utf16” and ”xliff:utf32”
These three profiles define the standard storage size restriction profiles for the common Unicode character encoding schemes. It is possible to specify if normalization should be applied using the <slr:normalization> element and the slr:storage attribute. All sizes are expressed in 8-bit bytes.

The size of text for these profiles is the size of the text converted to the selected encoding without any byte order marks attached. The encodings are specified by the Unicode Consortium in section 2.5 of the Unicode specification (http://www.unicode.org/versions/Unicode6.2.0/ch02.pdf).

”xliff:utf8” – The number of 8-bit bytes needed to represent the string encoded as UTF-8 as specified by the Unicode Consortium.

”xliff:utf16” – The number of 8-bit bytes needed to represent the string encoded as UTF-16 as specified by the Unicode Consortium.

”xliff:utf32” – The number of 8-bit bytes needed to represent the string encoded as UTF-32 as specified by the Unicode Consortium.
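
For illustration, the three sizes can be computed in Python as sketched below. The little-endian codecs are chosen only because Python's "utf-16" and "utf-32" codecs prepend a byte order mark, which must not be counted; byte order does not affect the size. The function name is hypothetical.

```python
# Profile name -> Python codec that encodes without a byte order mark.
ENCODINGS = {
    "xliff:utf8": "utf-8",
    "xliff:utf16": "utf-16-le",
    "xliff:utf32": "utf-32-le",
}


def storage_size(text: str, profile: str) -> int:
    """Number of 8-bit bytes needed to store text under the given profile."""
    return len(text.encode(ENCODINGS[profile]))


sample = "h\u00e9llo\U0001F600"  # five BMP characters plus one emoji
for profile in ENCODINGS:
    print(profile, storage_size(sample, profile))
```

For the sample string this gives 10, 14 and 24 bytes respectively: the accented letter takes two bytes in UTF-8, and the emoji takes four bytes in every encoding (a surrogate pair in UTF-16).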

These profiles make use of the following attributes from this module:

 

Attribute slr:storage-restriction
The value of this attribute holds the ”maximum” or ”minimum and maximum” size of the string. Either size must be an integer. The maximum size can also be ’*’ to denote that there is no maximum restriction. If only a maximum is specified, it is implied that the minimum is 0 (empty string). The format of the value is the optional minimum size and a comma followed by a maximum size (”[minsize,]maxsize”). The default value is ’*’, which evaluates to a string with unbounded size.

 

Attribute slr:equiv-storage
The value of this attribute is an integer representing how many bytes the element it is set on should be considered to contribute to the total size. If empty, the default is 0 for all elements except <xliff:cp>. The code point encoded by <xliff:cp> is always converted to its representation in the profile's encoding, and the size of that representation is used as the size contributed by the <xliff:cp>. If this attribute is specified on a <xliff:cp> element, its value MUST match the calculated value.
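
The consistency rule for <xliff:cp> can be checked as in the sketch below. It assumes that the code point is carried as a hexadecimal string (as in a hex attribute on <xliff:cp>), and the function names are hypothetical.

```python
from typing import Optional

# Profile name -> Python codec that encodes without a byte order mark.
ENCODINGS = {
    "xliff:utf8": "utf-8",
    "xliff:utf16": "utf-16-le",
    "xliff:utf32": "utf-32-le",
}


def cp_storage_size(hex_value: str, profile: str) -> int:
    """Bytes needed for the code point of a <cp> in the profile's encoding."""
    return len(chr(int(hex_value, 16)).encode(ENCODINGS[profile]))


def cp_equiv_storage_ok(hex_value: str, declared: Optional[str], profile: str) -> bool:
    """True if a declared equiv-storage on <cp> is absent or matches."""
    if not declared:
        return True  # nothing declared, the calculated size is used
    return int(declared) == cp_storage_size(hex_value, profile)


print(cp_storage_size("001b", "xliff:utf16"))
print(cp_equiv_storage_ok("001b", "1", "xliff:utf16"))
```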

 

Third party profiles
The general structure of this module, together with the extensibility mechanisms provided, should lend itself to most size restriction schemes. For example, to represent two-dimensional data a profile might adopt a coordinate style for the values of the general restriction attributes, such as ”{x,y}” to represent width and height or ”{{x1,y1},{x2,y2}}” to represent a bounding box. It should also be possible to embed the information necessary to drive a display simulator and attach that data to text to do device-specific checking, or to provide font information and do glyph-based general size checking.
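
As a purely hypothetical illustration of the coordinate idea, a third-party profile could parse a ”{x,y}” value as follows. Nothing here is defined by this module: the grammar (non-negative integers, optional whitespace) and the function names are assumptions.

```python
import re
from typing import Tuple

# Assumed grammar: "{" integer "," integer "}" with optional whitespace.
_EXTENT = re.compile(r"\{\s*(\d+)\s*,\s*(\d+)\s*\}")


def parse_extent(value: str) -> Tuple[int, int]:
    """Parse "{x,y}" into a (width, height) pair."""
    match = _EXTENT.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"expected {{x,y}}, got {value!r}")
    return int(match.group(1)), int(match.group(2))


def fits(measured: Tuple[int, int], restriction: str) -> bool:
    """True if the measured extent fits inside the restricting extent."""
    width, height = parse_extent(restriction)
    return measured[0] <= width and measured[1] <= height


print(parse_extent("{120,40}"))
print(fits((100, 30), "{120,40}"))
```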

 

Conformance
To claim conformance to the XLIFF size and length restriction module a tool must meet the following criteria.

·         MUST follow the schema of the XLIFF Core specification and the extensions provided in this module

·         MUST follow all processing requirements set forth in this module specification regarding the general use of elements and attributes

·         MUST support all standard profiles with normalization set to ”none”

·         SHOULD support all standard profiles with all modes of normalization

·         MAY support additional third-party profiles for storage or general restrictions

·         MUST provide at least one of the following

o   Add size and length restriction information to a document

o   If it supports the specified profile(s) in the document it MUST provide a way to check if the size and length restrictions in the document are met according to the profile(s)

 


