xliff - RE: [xliff] Re: ITS module section(s) in the specification
  • From: Yves Savourel <ysavourel@enlaso.com>
  • To: "'XLIFF Main List'" <xliff@lists.oasis-open.org>
  • Date: Wed, 26 Nov 2014 05:56:43 -0700
Hi David, all,

>> If the module does not seem to have enough normative information,
>> I am open to hear what it should contain in your and TC opinion.
>
> I'll try to provide a mock-up, maybe it'll be clearer.

Please see the attached file. It tries to show what I meant by having a normative description for each data category in the ITS module, with an informative appendix describing the extraction guidelines.

I've used text from Felix's draft in the wiki (http://www.w3.org/International/its/wiki/XLIFF_2.0_Mapping#General_considerations_for_ITS_2.0_and_XLIFF) and added some.

The appendix would be very similar to what we have now, except it could be simplified a lot in some cases. For example, when describing how to extract Terminology we can just point to the ITS Terminology annotation specified in the ITS Module section.

I hope this helps.

Cheers,
-yves

Title: ITS Module Mock-up

5.9 ITS Module

5.9.1 Introduction

This section defines how the data categories of the Internationalization Tag Set 2.0 [ITS] are represented in XLIFF.

For guidelines on how to extract original data annotated with ITS (e.g. an HTML5 or XML file), please see the appendix [B. Guidelines for Extraction with ITS].

ITS 2.0 is composed of 19 data categories. They are represented in XLIFF in different ways:

  • Using markup already defined in the Core or in other Modules.
  • Using markup defined in the ITS Module's namespace.
  • Using a mix of existing markup and markup of the ITS Module.
  • Not at all: some data categories are not represented in XLIFF at this time.

The namespace for the ITS module is urn:oasis:names:tc:xliff:itsm:2.1 and the recommended prefix is itsm.

The fragment identification prefix for the ITS module is: itsm.
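
As an illustration, a minimal document declaring the ITS module namespace with the recommended prefix could look like the sketch below. The core namespace, the version value, the ids and the sample text are assumptions made for this example; only the itsm namespace URI and prefix come from the text above.

```xml
<!-- Sketch only: core namespace and version are assumed, not defined by this mock-up -->
<xliff xmlns='urn:oasis:names:tc:xliff:document:2.0'
       xmlns:itsm='urn:oasis:names:tc:xliff:itsm:2.1'
       version='2.1' srcLang='en'>
 <file id='f1'>
  <unit id='1'>
   <segment>
    <source>Sample text.</source>
   </segment>
  </unit>
 </file>
</xliff>
```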

The elements and attributes defined in the ITS module are equivalent to their counterparts in the W3C ITS namespace when these counterparts exist: they use the same names and values. They also have the same semantics, with one addition: the ITS module attributes can apply to spans that are not well-formed, delimited by the empty boundary markers <sm/> and <em/>.
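
To make the case of a span that is not well-formed concrete, here is a sketch in which an annotated span starts inside a <pc> element and ends outside of it, so it cannot be written as a <mrk> element. The tool IRI, the ids, the text and the confidence value are hypothetical.

```xml
<!-- Sketch only: the annotated span is "control</pc> panel", which crosses the end of <pc> -->
<unit id='1' itsm:annotatorsRef='terminology|http://example.com/term-tool'>
 <segment>
  <source>Open the <pc id='1'>main <sm id='m1' type='term'
   itsm:termConfidence='0.7'/>control</pc> panel<em startRef='m1'/> first.</source>
 </segment>
</unit>
```

Here the itsm:termConfidence attribute sits on the <sm/> marker, and the matching <em/> points back to it through startRef.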

5.9.2 Annotators Reference

ITS 2.0 provides a [tools annotation mechanism]. It identifies the processor that generates ITS information. This information is mandatory for the [MT Confidence] data category, as well as for [Terminology] and [Text Analysis] if they provide confidence information. It is optional for other data categories.

In XLIFF the tools annotation is represented using the itsm:annotatorsRef attribute. The attribute is allowed on the <xliff>, <file>, <group>, <unit>, <mrk> and <sm/> elements. Its values and semantics are the same as those of its:annotatorsRef, with the addition that it can also be set on <sm/>.
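
The following sketch shows one way the attribute might be used at two levels, assuming the inheritance and per-data-category override behavior of its:annotatorsRef carries over unchanged. Both tool IRIs, the ids and the text are hypothetical.

```xml
<!-- Sketch only: the value set on <file> applies to unit 1 and is overridden in unit 2 -->
<file id='f1' itsm:annotatorsRef='terminology|http://example.com/term-tool'>
 <unit id='1'>
  <segment>
   <source>A <mrk id='m1' type='term' itsm:termConfidence='0.8'>term</mrk>.</source>
  </segment>
 </unit>
 <unit id='2' itsm:annotatorsRef='terminology|http://example.com/other-tool'>
  <segment>
   <source>Another <mrk id='m1' type='term' itsm:termConfidence='0.6'>term</mrk>.</source>
  </segment>
 </unit>
</file>
```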

5.9.3 Data Category Representation

5.9.3.1 Translate

The [Translate data category] indicates whether content is translatable or not.

It is represented with the [translate] attribute of the Core.
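
For instance, a span that should stay untranslated could be marked as in this sketch, which uses only the Core translate attribute on a <mrk> element. The ids and the text are made up for the example.

```xml
<!-- Sketch only: "Ctrl" is protected from translation, the rest of the unit is translatable -->
<unit id='1'>
 <segment>
  <source>Press the <mrk id='m1' translate='no'>Ctrl</mrk> key.</source>
 </segment>
</unit>
```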

5.9.3.2 Localization Note

// Defines how localization note is represented in XLIFF

5.9.3.3 Terminology

The [Terminology data category] is used to denote terms and optionally associate them with information, such as definitions.

It is represented with the ITS Terminology annotation:

Usage:

  • The id attribute is REQUIRED.
  • The type attribute is REQUIRED and is set:
    • either to itsm:term-no, which maps to its:term='no',
    • or to term, which maps to its:term='yes'.
  • The value attribute is OPTIONAL and contains a short definition of the term.
  • The ref attribute is OPTIONAL and contains the same values as its:termInfoRef.
  • The translate attribute is OPTIONAL.
  • The itsm:termConfidence attribute is OPTIONAL and contains the same values as its:termConfidence.
  • The itsm:annotatorsRef attribute is OPTIONAL and contains the same values as its:annotatorsRef.

Constraints:

If the annotation has an itsm:termConfidence attribute, it MUST be within the scope of an itsm:annotatorsRef attribute with the terminology annotator set.

Example:

<unit id='1' itsm:annotatorsRef='terminology|http://www.cngl.ie/termchecker'>
<segment>
<source>Text with a <pc id='1'><mrk id='m1' type='term'
ref='http://en.wikipedia.org/wiki/Terminology'
itsm:termConfidence='0.9'>term</mrk></pc>.</source>
</segment>
</unit>
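
A second sketch, covering the parts of the usage list not shown above: a term carrying a short definition in value, and a span explicitly marked as not being a term. Neither annotation has an itsm:termConfidence attribute, so no itsm:annotatorsRef is needed here. The ids, the text and the definition are hypothetical.

```xml
<!-- Sketch only: m1 maps to its:term='yes' with a definition, m2 maps to its:term='no' -->
<unit id='2'>
<segment>
<source>A <mrk id='m1' type='term'
value='Short definition of the term'>term</mrk> next to a <mrk id='m2'
type='itsm:term-no'>non-term</mrk>.</source>
</segment>
</unit>
```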

5.9.3.N Etc...

// Etc...

5.9.4 Processing XLIFF with ITS processors

// Describes how the content should be transformed, and provides the rules file.

// Felix has drafted text for this in the wiki

// ...

Appendix B. Extracting Data with ITS (Informative)

B.1 Translate

B.2 Localization Note

B.3 Terminology

Etc...


