
xliff message


Subject: [OASIS Issue Tracker] (XLIFF-9) Use (also) the normal ITS namespace in the ITSM module

    [ https://issues.oasis-open.org/browse/XLIFF-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=64284#comment-64284 ] 

Yves Savourel commented on XLIFF-9:

> ... we would be changing the definition that core only agents use to 
> identify protected XLIFF-defined namespaces. 
> This is arguably core impact that we promised not to do in dot releases.

I don't believe it would change the core definition: The ITS namespace is not XLIFF-defined. It's just that part of the ITS support would be done through a non-XLIFF namespace. User agents still should preserve the data.

It wouldn't be a first: We use xml:lang attributes for example in the Resource Data module. Those are not XLIFF-defined and therefore not subject to the PR for XLIFF-defined namespaces.

> @Yves, I think that you can't be serious saying that people
> can delete module data anyways.

I'm not sure what you mean by "anyways". I was simply pointing out that user agents can remove XLIFF-defined attributes and elements if they know what they are doing.

The PR says: "An Agent processing a valid XLIFF Document that contains XLIFF-defined elements and attributes that it cannot handle MUST preserve those elements and attributes."

This PR does not prevent a user agent from removing a module's constructs if it knows how to do so properly. The same goes for the PR protecting extensions.

What I was trying to say is that, in my opinion, it would be OK to have the ITS data protected only with a 'should preserve', because the difference between 'should preserve if you cannot handle it' and 'must preserve if you cannot handle it' is moot if you do know how to handle it.

> B) ensure that the categories apply correctly to pseudospans
> formed by atomic markers

The <sm> element was the main reason why we decided to use ITSM.

But this was before we came to the conclusion that <sm> cannot really be processed properly by ITS processors without dedicated modifications. It was also (at least for me) before realizing that several data categories would not be mapped by ITS rules because they do not have pointers.

Let's look at it from an implementation viewpoint and see how this works for both an XLIFF processor and an ITS processor with the two potential namespaces:

1) XLIFF processor + ITSM namespace + <mrk> ==> OK

2) XLIFF processor + ITSM namespace + <sm/> ==> OK

3) XLIFF processor + ITS namespace + <mrk> ==> OK

4) XLIFF processor + ITS namespace + <sm/> ==> Not OK
   (Different semantics)

5) ITS processor + ITSM namespace + <mrk> ==> Not OK
   (No pointers for ITS rules in several cases)

6) ITS processor + ITSM namespace + <sm/> ==> OK

7) ITS processor + ITS namespace + <mrk> ==> OK

8) ITS processor + ITS namespace + <sm/> ==> Not OK
   (No support for pseudo-spans in ITS)

So we have three problem cases:

4) XLIFF processor + ITS namespace + <sm/> ==> Not OK
   (Different semantics)

5) ITS processor + ITSM namespace + <mrk> ==> Not OK
   (No pointers for ITS rules in several cases)

8) ITS processor + ITS namespace + <sm/> ==> Not OK
   (No support for pseudo-spans in ITS)
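
For readers who prefer to see the matrix as data, here is a minimal Python sketch that transcribes the eight cases above and filters out the problematic ones. The names `CASES` and `problem_cases` are made up for this illustration; the outcomes and reasons are copied from the list, not derived from any tool.

```python
# Outcomes transcribed from the eight numbered cases above.
# Key: (processor, namespace, marker); value: (ok, reason if not ok).
# Dict insertion order matches the case numbers 1 to 8.
CASES = {
    ("XLIFF", "ITSM", "mrk"): (True, ""),
    ("XLIFF", "ITSM", "sm"): (True, ""),
    ("XLIFF", "ITS", "mrk"): (True, ""),
    ("XLIFF", "ITS", "sm"): (False, "Different semantics"),
    ("ITS", "ITSM", "mrk"): (False, "No pointers for ITS rules in several cases"),
    ("ITS", "ITSM", "sm"): (True, ""),
    ("ITS", "ITS", "mrk"): (True, ""),
    ("ITS", "ITS", "sm"): (False, "No support for pseudo-spans in ITS"),
}


def problem_cases(cases):
    """Return (case number, key, reason) for every case marked Not OK."""
    return [
        (number, key, reason)
        for number, (key, (ok, reason)) in enumerate(cases.items(), start=1)
        if not ok
    ]


for number, (processor, namespace, marker), reason in problem_cases(CASES):
    print(f"{number}) {processor} processor + {namespace} namespace "
          f"+ <{marker}> ==> Not OK ({reason})")
```
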

How can we resolve them?

- We tried and came to the conclusion that #8 was not really solvable for ITS processors without dedicated solutions. So that option would be a lost cause.

- We cannot really resolve #5: Creating ITS extensions for adding pointers to several data categories would be a lot of work and would require ITS processors to be modified. A lost cause too.

- For #4: the original semantics technically do not work there. But we could include wording in the specification saying that ITS attributes in <sm> are to be treated like other XLIFF attributes in <sm>. XLIFF processors already have the mechanism in place for handling this. They just need to avoid doing things differently for ITS attributes.
Does this make the same ITS attributes behave differently depending on whether they are processed by an XLIFF or an ITS processor? Technically, yes. But remember that ITS processors don't work with <sm> anyway (#8).
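
A minimal sketch of what "treated like other XLIFF attributes in <sm>" could look like inside an XLIFF processor. It assumes the XLIFF 2.0 core namespace and the W3C ITS namespace; the sample <source> content, the helper names `events` and `its_annotations`, and the choice of its:locQualityIssueType are hypothetical and only there for illustration. The point is that ITS attributes found on <sm> are applied to everything up to the matching <em>, exactly as ITS attributes on <mrk> are applied to its content.

```python
import xml.etree.ElementTree as ET

XLF = "urn:oasis:names:tc:xliff:document:2.0"
ITS = "http://www.w3.org/2005/11/its"

# Hypothetical sample: m2 starts inside <pc> and ends outside it,
# which is the situation <sm>/<em> exist for.
SAMPLE = (
    '<source xmlns="urn:oasis:names:tc:xliff:document:2.0"'
    ' xmlns:its="http://www.w3.org/2005/11/its">'
    'Text <mrk id="m1" its:locQualityIssueType="typographical">in mrk</mrk>'
    ' and <pc id="1">bold <sm id="m2" its:locQualityIssueType="grammar"/>'
    'starts inside</pc> and ends outside<em startRef="m2"/> end.</source>'
)


def events(elem):
    """Flatten inline content into open/text/close events in document order."""
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        if child.tag == f"{{{XLF}}}mrk":
            yield ("open", child.get("id"), dict(child.attrib))
            yield from events(child)
            yield ("close", child.get("id"))
        elif child.tag == f"{{{XLF}}}sm":
            # Start of a pseudo-span: same event as the start of <mrk>.
            yield ("open", child.get("id"), dict(child.attrib))
        elif child.tag == f"{{{XLF}}}em":
            yield ("close", child.get("startRef"))
        else:
            # Other inline codes (for example <pc>): just descend.
            yield from events(child)
        if child.tail:
            yield ("text", child.tail)


def its_annotations(source):
    """Map each marker id to (ITS attributes, text the annotation covers)."""
    open_spans = {}
    result = {}
    prefix = f"{{{ITS}}}"
    for event in events(source):
        if event[0] == "open":
            _, span_id, attrs = event
            its_attrs = {k[len(prefix):]: v for k, v in attrs.items()
                         if k.startswith(prefix)}
            open_spans[span_id] = (its_attrs, [])
        elif event[0] == "text":
            for _, pieces in open_spans.values():
                pieces.append(event[1])
        else:
            its_attrs, pieces = open_spans.pop(event[1])
            result[event[1]] = (its_attrs, "".join(pieces))
    return result


for span_id, (attrs, text) in its_annotations(ET.fromstring(SAMPLE)).items():
    print(span_id, attrs, repr(text))
```

A pure ITS processor reading the same markup would attach the attribute to the empty <sm/> element itself, which is the "different semantics" of case #4.
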

Why don't we do this with xml:lang (instead of using itsm:lang)? Well, xml:lang is more than ITS: it's implemented by low-level XML parsers, so we cannot do that for xml:lang. It seems more acceptable for ITS-only attributes because it applies only to XLIFF processors, something for which we can define the behavior.

So, to go back to David's comments: I think issue A is not really an issue, and issue B can be addressed.

The alternative is that we cannot really specify several ITS data categories in XLIFF.

> Use (also) the normal ITS namespace in the ITSM module
> ------------------------------------------------------
>                 Key: XLIFF-9
>                 URL: https://issues.oasis-open.org/browse/XLIFF-9
>             Project: OASIS XML Localisation Interchange File Format (XLIFF) TC
>          Issue Type: New Feature
>          Components: ITS Module
>    Affects Versions: 2.1_csprd01
>         Environment: https://lists.oasis-open.org/archives/xliff-comment/201610/msg00020.html
>            Reporter: Yves Savourel
>            Assignee: Felix Sasaki
>              Labels: request_tc_discussion
>             Fix For: 2.1_csprd02
> Looking at the rules files for ITSM, we can see there are several data categories that cannot be mapped by rules because they do not
> have pointer attributes available.
> - Localization Quality Issues
> - Localization Quality Rating
> - Provenance
> - MT Confidence
> This means a pure ITS processor cannot process an XLIFF document and get any data for those data categories.
> This would be resolved if the namespace was ITS' rather than the ITSM (ITS Module) namespace.
> I believe we selected early on to go with ITSM even for the data categories defined from scratch because of the <sm/> case where the
> semantics need to be adjusted. Since then, we have established (I think) that the <sm/> case is such that ITS processors cannot really
> resolve it anyway.
> In other words, the <sm/> case is hopeless if you are not an XLIFF processor, whether you use ITS or ITSM, and XLIFF processors do
> treat <sm/> in a special way in the case of ITSM. They could do the same for ITS.
> Hence, it seems all the data categories that ITSM implements 'from scratch' could be in the normal ITS namespace, and work for both
> XLIFF and ITS processors.

