OASIS Mailing List Archives

xliff message

Subject: RE: [xliff] Match type and subType

Hi Ryan, Shirley, all,

Here are a few comments:

=== I think having a subType for match is fine, and it can work, as you noted, in a similar way to the type/subType for the inline codes.

=== For the values of type.
Currently the specification lists:

- am Assembled Match
- ebm Example-based Machine Translation
- idm ID-based Match
- ice In-Context Exact Match
- mt Machine Translation
- tm Translation Memory Match

a) In my opinion we should not have attributes that provide the same information several times. So 'in-context exact' is not right because it states that the match is exact, and that information is already carried by similarity.

b) 'Example-based Machine Translation' is also wrong in this list because the list does not otherwise specify what kind of MT a match comes from. Not only is that information probably not very useful, but we would then also need to define other types of MT (rule-based, statistical, hybrid, etc.).

The definition ("Indicates the type of a <match> element.") is not very specific. So I would propose the following for the type attribute:

type - Value providing additional information about how the match was generated, or further qualifying the relevance of the match. The list of pre-defined values is general; user-specific information can be added using the subType attribute.

Possible values:

- am - assembled match: Match generated by assembling different translation parts together.
- mt - machine translation candidate: Match generated by a machine translation system.
- icm - in context match: Match for which the context is the same as the context of the source content. For example, the source text for both contents is preceded by an identical source segment.
- idm - identifier-based match: Match that has an identifier identical to that of the source content. For example, the previous translation of a given UI component with the same ID.
- tb - term base match: Match obtained from a terminological database.
- tm - simple translation memory: Match obtained from a translation memory.
- other - other type of match: Type of match not covered by any of the other top-level types.

One can further specify the type of a match using the subType attribute.
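
To make the proposed list concrete, here is a minimal sketch of how a tool might hold these values and check a type attribute against them. The names MATCH_TYPES and is_valid_match_type are hypothetical and only for illustration; the values themselves are the ones proposed above, not the ones in the current specification.

```python
# Proposed top-level values for the type attribute of <match>,
# copied from the list in this message (a proposal, not the current spec).
MATCH_TYPES = {
    "am": "assembled match",
    "mt": "machine translation candidate",
    "icm": "in context match",
    "idm": "identifier-based match",
    "tb": "term base match",
    "tm": "simple translation memory",
    "other": "other type of match",
}


def is_valid_match_type(value: str) -> bool:
    """Return True if value is one of the proposed top-level types."""
    return value in MATCH_TYPES
```

With this list, 'ice' and 'ebm' would no longer be accepted as type values; anything finer-grained would go into subType.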

(And the definitions can be improved, I'm sure.)

=== As for subtype.

I'm not sure we should define any default values (but we should certainly reserve the 'xlf' prefix for possible future values).

I think defining an 'ice' match for subType is not useful: it's redundant with similarity='100' + type='icm', and it also opens the door to tools setting one attribute and not the others. So it would basically mean adding extra processing requirements just to ensure that the duplicated information stays consistent. The same goes for 'exact', 'fuzzy', etc.

If an authority wants to define sub-types such as 'fuzzy', 'near', 'exact' and 'ice', that's fine (they may correspond to different types of payment, for example), but in my opinion that is user-defined information.

So we would have something like this:

subType - Indicates the secondary level type for a match.

Value description:

The value is composed of a prefix and a sub-value separated by a character : (U+003A).
The prefix is a string uniquely identifying a collection of values for a specific authority. The sub-value
is any string value defined by an authority.

The prefix xlf is reserved for this specification, but no sub-values are defined for it at this time.

Other prefixes and sub-values may be defined by the users.
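
As an illustration of the value description above, a subType value could be parsed along these lines. This is only a sketch: the names SubType and parse_sub_type are hypothetical, and three choices here are my assumptions rather than part of the proposal: splitting on the first colon only, rejecting an empty prefix or sub-value, and rejecting any 'xlf:' value as long as no sub-values are defined for that prefix.

```python
from typing import NamedTuple


class SubType(NamedTuple):
    prefix: str
    sub_value: str


# Prefix reserved for the specification; no sub-values are defined for it yet.
RESERVED_PREFIX = "xlf"


def parse_sub_type(value: str) -> SubType:
    """Split a subType value into (prefix, sub_value) at the first ':' (U+003A)."""
    # Assumption: only the first colon separates prefix and sub-value.
    prefix, sep, sub_value = value.partition(":")
    if not sep or not prefix or not sub_value:
        raise ValueError(f"subType must look like 'prefix:sub-value': {value!r}")
    # Assumption: with no xlf sub-values defined, every xlf: value is invalid.
    if prefix == RESERVED_PREFIX:
        raise ValueError(f"no sub-values are defined for the reserved prefix: {value!r}")
    return SubType(prefix, sub_value)
```

For example, a user-defined value such as 'acme:fuzzy' (a made-up authority prefix) would parse into the prefix 'acme' and the sub-value 'fuzzy'.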

Default value: Undefined

Used in: <match>

Processing Requirements

• If the attribute subType is used, the attribute type MUST be specified as well.
• If the attribute type is modified, the attribute subType MUST be updated or deleted.
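
The two processing requirements could be enforced roughly as follows. This is a sketch using Python's standard xml.etree.ElementTree; the function names check_match and set_match_type are hypothetical, and namespace handling is left out for brevity.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def check_match(match: ET.Element) -> List[str]:
    """Report violations of: if subType is used, type MUST be specified."""
    problems = []
    if "subType" in match.attrib and "type" not in match.attrib:
        problems.append("subType is present but type is missing")
    return problems


def set_match_type(match: ET.Element, new_type: str,
                   new_sub_type: Optional[str] = None) -> None:
    """Change type, and update or delete subType in the same step."""
    match.set("type", new_type)
    if new_sub_type is None:
        # No replacement given: delete the now possibly stale subType.
        match.attrib.pop("subType", None)
    else:
        match.set("subType", new_sub_type)
```

Routing every change of type through a single function like set_match_type is one way to make sure a stale subType never survives a type change.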

