Hi David, all,
>> If the module does not seem to have enough normative information,
>> I am open to hear what it should contain in your and TC opinion.
>
> I'll try to provide a mock-up, maybe it'll clearer.
Please see the attached file. It tries to show what I meant by having normative descriptions for each data category in the ITS module and an informative appendix describing the extraction guidelines.
I've used text from Felix's draft in the wiki (http://www.w3.org/International/its/wiki/XLIFF_2.0_Mapping#General_considerations_for_ITS_2.0_and_XLIFF) and added some.
The appendix would be very similar to what we have now, except it could be simplified a lot in some cases. For example, when describing how to extract Terminology we can just point to the ITS Terminology annotation specified in the ITS Module section.
I hope this helps.
Cheers,
-yves
Title: ITS Module Mock-up
5.9. ITS Module
5.9.1. Introduction
This section defines how the data categories of the Internationalization
Tag set 2.0 [ITS] are represented in XLIFF.
For guidelines on how to extract original data annotated with ITS (e.g.
an HTML5 or an XML file), please see the appendix [B. Guidelines for
Extraction with ITS].
ITS 2.0 is composed of 19 data categories. They are represented in different
ways:
- Using markup already defined in the Core or other Modules.
- Using markup defined in the ITS Module's namespace.
- Using a mix of both existing markup and markup of the ITS module.
- Some data categories are not represented at all in XLIFF at this time.
The namespace for the ITS module is: urn:oasis:names:tc:xliff:itsm:2.1
and the recommended prefix is itsm.
The fragment identification prefix for the ITS module is: itsm.
The semantics of the attributes are analogical to their counterparts in
the W3C ITS namespace in case those counterparts exist. The main semantic
difference between its and itsm attributes is that itsm attributes can
apply on non-wellformed spans that are delimited by empty boundary markers
<sm/>/<em/>.
The elements and attributes defined in the ITS module are equivalent to
their counterparts in the W3C ITS namespace when these counterparts exist.
They use the same names and values. They also have the same semantics,
with the addition that the ITS module attributes can apply on
non-wellformed spans delimited by the empty boundary markers <sm/>
and <em/>.
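As an illustration of the non-wellformed case, here is a sketch of an annotation that starts inside a <pc> element and ends outside it, so it cannot be written as a single <mrk>. The ids, the text and the annotator IRI are invented for the example, and the itsm prefix is assumed to be declared on the enclosing <xliff> element.

```xml
<unit id='u1' itsm:annotatorsRef='terminology|http://example.com/termchecker'>
 <segment>
  <!-- The annotated span "machine translation" crosses the end of <pc> -->
  <source>A <pc id='1'>bold <sm id='m1' type='term' itsm:termConfidence='0.8'/>machine</pc> translation<em startRef='m1'/> system.</source>
 </segment>
</unit>
```

The ITS module attributes sit on the <sm/> marker, and the matching <em/> only closes the span through its startRef.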
5.9.2 Annotators Reference
ITS 2.0 provides a [tools annotation mechanism]. It identifies the
processor that generates ITS information. This information is mandatory
for the [MT Confidence] data category, as well as for [Terminology] and
[Text Analysis] if they provide confidence information. It is optional for
other data categories.
In XLIFF the tool annotation is represented using the itsm:annotatorsRef
attribute. The attribute is allowed on the <xliff>, <file>,
<group>, <unit>, <mrk> and <sm/> elements. Its
values and semantics are the same as its:annotatorsRef (with
the <sm/> addition).
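A minimal sketch of how this could look when the annotators are declared once at the <file> level. The IRIs, ids and text are invented, and the override behavior shown in the second unit is my reading of the ITS 2.0 rule that, for each data category, the innermost declaration wins.

```xml
<file id='f1' itsm:annotatorsRef='terminology|http://example.com/termchecker text-analysis|http://example.com/analyzer'>
 <unit id='u1'>
  <!-- Inherits both annotators from <file> -->
  <segment>
   <source>First text.</source>
  </segment>
 </unit>
 <unit id='u2' itsm:annotatorsRef='terminology|http://example.com/other-checker'>
  <!-- Replaces the terminology annotator only; text-analysis is still inherited -->
  <segment>
   <source>Second text.</source>
  </segment>
 </unit>
</file>
```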
5.9.3 Data Category Representation
5.9.3.1 Translate
The [Translate data category] indicates whether content is translatable
or not.
It is represented with the [translate] attribute of the
Core.
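For example, non-translatable content could be marked as follows. This is only a sketch: the ids and text are invented, the first unit is assumed to come from a block-level element flagged as not translatable in the original, and the second from an inline span.

```xml
<unit id='u1' translate='no'>
 <segment>
  <source>ACME-42-XZ</source>
 </segment>
</unit>
<unit id='u2'>
 <segment>
  <source>Click <mrk id='m1' translate='no'>OK</mrk> to continue.</source>
 </segment>
</unit>
```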
5.9.3.2 Localization Note
// Defines how localization note is represented in XLIFF
5.9.3.3 Terminology
The [Terminology data category] is used to denote terms and optionally
associates them with information, such as definitions.
It is represented with the ITS Terminology annotation:
Usage:
- The id attribute is REQUIRED.
- The type attribute is REQUIRED and set:
  - either to itsm:term-no, which maps to its:term='no'.
  - or to term, which maps to its:term='yes'.
- The value attribute is OPTIONAL and contains a short
  definition of the term.
- The ref attribute is OPTIONAL and contains the same
  values as its:termInfoRef.
- The translate attribute is OPTIONAL.
- The itsm:termConfidence is OPTIONAL and contains the
  same values as its:termConfidence.
- The itsm:annotatorsRef is OPTIONAL and contains the same
  value as its:annotatorsRef.
Constraints:
If the annotation has an itsm:termConfidence attribute, it
MUST be within the scope of an itsm:annotatorsRef with the terminology
annotator set.
Example:
<unit id='1' itsm:annotatorsRef='terminology|http://www.cngl.ie/termchecker'>
 <segment>
  <source>Text with a <pc id='1'><mrk id='m1' type='term' ref='http://en.wikipedia.org/wiki/Terminology' itsm:termConfidence='0.9'>term</mrk></pc>.</source>
 </segment>
</unit>
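A second sketch, covering the other values listed under Usage: a term carrying a short definition in value, and a span explicitly marked as not a term with itsm:term-no. The ids, text and definition are invented. No itsm:termConfidence is used, so no itsm:annotatorsRef is needed here.

```xml
<unit id='2'>
 <segment>
  <source>The <mrk id='m1' type='term' value='A program that converts source code into machine code'>compiler</mrk> is on the <mrk id='m2' type='itsm:term-no'>desktop</mrk>.</source>
 </segment>
</unit>
```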
5.9.3.N Etc...
// Etc...
5.9.4 Processing XLIFF with ITS processors
// Describes how the content should be transformed, and provides
the rules file.
// Felix has drafted text for this in the wiki
// ...
Appendix B. Extracting Data with ITS (Informative)
B.1 Translate
B.2 Localization Note
B.3 Terminology
Etc...