That was my interpretation of the uniqueness of IDs for the inline codes.
(Although Okapi doesn’t seem to be catching those duplication cases)
The problem is that this interpretation raises a rather big issue: technically, we can’t represent exact matches.
If the inline codes in an mtc:match cannot have the same IDs as those in the source, the two can never have identical content. Furthermore, there is no reliable way to map the content when applying a match: since the IDs must differ under this interpretation, you can’t know which code corresponds to which when leveraging.
The same problem will exist for the Change Tracking module.
It seems to be something that needs a resolution soon.
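To make the leveraging problem concrete, here is a minimal Python sketch. It assumes a hypothetical helper, map_codes_by_id, that pairs the inline codes of a match's source with those of the segment's source purely by id. The sample strings are invented for illustration, and nothing here is taken from Okapi or any other tool.

```python
import xml.etree.ElementTree as ET

INLINE = {"pc", "ph", "sc", "ec"}  # core inline codes considered in this sketch

def inline_ids(source_xml):
    """Return the ids of the inline codes in a <source> fragment, in document order."""
    root = ET.fromstring(source_xml)
    return [el.get("id") for el in root.iter() if el.tag in INLINE and el.get("id")]

def map_codes_by_id(segment_source, match_source):
    """Hypothetical helper: pair each match code with the segment code of the same id."""
    segment_ids = set(inline_ids(segment_source))
    return {i: (i if i in segment_ids else None) for i in inline_ids(match_source)}

# Invented sample content, not taken from the test file discussed in the thread.
segment = '<source>Click <pc id="1">here</pc>.</source>'
same_ids = '<source>Click <pc id="1">here</pc>.</source>'
renamed = '<source>Click <pc id="m1-1">here</pc>.</source>'

print(map_codes_by_id(segment, same_ids))  # the code finds its counterpart
print(map_codes_by_id(segment, renamed))   # no counterpart can be found by id
```

With forced-unique ids, a tool would have to fall back on position or content heuristics to pair the codes, which is exactly the ambiguity described above.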
From: David Filip [mailto:email@example.com]
Sent: Monday, December 21, 2015 7:44 AM
To: Soroush.Saadatfar <Soroush.Saadatfar@ul.ie>
Cc: Yves Savourel <firstname.lastname@example.org>; email@example.com
Subject: Re: [xliff] XLIFF and ITS mapping-ID value
On the duplication of span ids in the candidates module, I agree with the NVDL interpretation.
Although nested within mtc, the core span ids are within the same unit, so they should not be allowed to duplicate the ids of non-nested core spans.
Dr. David Filip
OASIS XLIFF OMOS TC Chair
OASIS XLIFF TC Secretary, Editor, Liaison Officer
KDEG, Trinity College Dublin
On Mon, Dec 21, 2015 at 12:41 PM, Soroush.Saadatfar <Soroush.Saadatfar@ul.ie> wrote:
The validator raises three errors:
1- Missing element with id="m1" to which the <mtc:match ... ref="#m1"> element is pointing (not in scope of this conversation);
2 & 3- ID duplication for both <pc> in <mtc:match>.
It seems like a tricky case, but I think it would make sense to treat <source>/<target> pairs in Translation Candidates the same way as those in <segment> or <ignorable> within the same <unit>. As <source> and <target> are both core elements even inside a module, according to the Spec the inline elements inside them must have unique ids. Therefore, if the TC decides otherwise, the text of the Spec should be modified accordingly.
Using the NVDL's logic of breaking XML document by namespaces, <source> and <target>, children of <mtc:match>, will remain in the context of <unit> after omission of "mtc:" namespace.
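The reading above can be sketched in Python as follows. This is only an illustration of the interpretation, not the logic of any actual validator: the function name duplicate_source_ids and the sample unit are assumptions, and the check is simplified to inline codes (pc, ph, sc, ec) inside core <source> elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CORE = "urn:oasis:names:tc:xliff:document:2.0"
MTC = "urn:oasis:names:tc:xliff:matches:2.0"
INLINE = {"pc", "ph", "sc", "ec"}  # simplification: inline codes only

def duplicate_source_ids(unit):
    """Return ids used more than once by inline codes in any core <source> of the unit.

    Core <source> elements nested in <mtc:match> are counted together with those
    in <segment>, mirroring the 'same unit context' interpretation.
    """
    counts = Counter()
    for source in unit.iter(f"{{{CORE}}}source"):
        for el in source.iter():
            local = el.tag.split("}")[-1]
            if el.tag.startswith(f"{{{CORE}}}") and local in INLINE and el.get("id"):
                counts[el.get("id")] += 1
    return sorted(i for i, n in counts.items() if n > 1)

# Hypothetical unit: the match reuses id="1" from the segment's source.
SAMPLE = f"""
<unit xmlns="{CORE}" xmlns:mtc="{MTC}" id="u1">
  <mtc:matches>
    <mtc:match ref="#m1">
      <source>Click <pc id="1">here</pc>.</source>
      <target>Cliquez <pc id="1">ici</pc>.</target>
    </mtc:match>
  </mtc:matches>
  <segment>
    <source>Click <mrk id="m1" type="mtc:match"><pc id="1">here</pc></mrk>.</source>
  </segment>
</unit>
"""

print(duplicate_source_ids(ET.fromstring(SAMPLE)))
```

The <target> inside the match is not counted, since target codes are expected to reuse the ids of their corresponding source codes.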
P.S. I assume my validator might need some editing for this case anyway, thanks!
Hi Soroush, all,
On a related topic of IDs in XLIFF.
Could you give me your opinion on the test file I posted here:
The question is: Is it valid to have the same id="1" ID in the two <pc> elements present in that file?
What are your validator's results for that file?
Dear Felix, Yves,
Thanks for your help, your notes will be applied.
Hi Soroush and all,
The proposed spec text for the ITS ID Value data category is as follows:
"This data category provides a mechanism to build customized unique identifiers for different parts of the document. It is generally recommended to use native xml:id or id attributes for XML and HTML documents respectively.
As XLIFF 2.1 identifiers are not unique in the scope of the entire document, this data category should be avoided where possible and replaced by native XLIFF identifiers."
I also have a question regarding the example. The current text on the wiki page mentions "exceptions for very specific context...". Would you please provide a particular case? I don't think the current example on the wiki is such a case.
There is no example because, as the wiki says: "Note that the identifiers in XLIFF are not unique per document, so using the Id Value data category to specify IDs in an XLIFF document is largely useless". So I would leave this as is, without an example about such contexts.
Sent: 19 December 2015 12:38
Subject: RE: [xliff] XLIFF and ITS mapping- Elements within text
Several of the examples have "curly quotes" instead of normal ASCII double quotes to enclose attribute values.
A side effect of editing in an email probably, but they need to be correct in the spec.
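A quick way to catch these before they reach the spec could look like the sketch below. It is only a rough check under simple assumptions: the pattern and the function name lines_with_curly_quotes are made up for this illustration, and it only looks for a curly quote directly after an equals sign.

```python
import re

# A curly single or double quote directly after "=" in what looks like an attribute.
CURLY_ATTR = re.compile(r"=\s*[\u201C\u201D\u2018\u2019]")

def lines_with_curly_quotes(text):
    """Return 1-based line numbers where an attribute value opens with a curly quote."""
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if CURLY_ATTR.search(line)]

# Invented sample: line 1 uses ASCII quotes, line 2 uses curly quotes.
example = '<ph id="ph1" dataRef="d1"/>\n<ph id=\u201Cph2\u201D dataRef=\u201Cd2\u201D/>'
print(lines_with_curly_quotes(example))
```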
<source>This sentence has a breakpoint<ph id="ph1" dataRef="d1"
The example above has a dangling "-->" after </originalData>.
<source>A paragraph where <sc id="sc1" dataRef="d1" type="fmt"
subType="xlf:u"/>the formatted text takes more than one
<source> The second sentence here.<ec dataRef="d2"
In the example above, the <ec/> element seems to be missing the attributes type="fmt" and subType="xlf:u".
Now, the Okapi validator gives an error on this, but I don't see in the spec any constraint that says type and subType must have the
same values in two corresponding <sc/> and <ec/>. That constraint exists only for canCopy, canDelete and canOverlap.
I don't see anywhere either that the processor should magically complement the undefined attributes in <ec/> by looking at the
corresponding <sc/>. At the same time, obviously, it would be illogical to have different values.
Are we missing yet another "implicit" constraint?
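For discussion, here is a sketch of what such an "implicit" check might look like if a tool chose to enforce it. The spec does not state this constraint for type and subType, so this is a hypothetical lint rather than a validation rule. The function name mismatched_pairs and the sample unit are assumptions. The sample is not a reconstruction of the example quoted above, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

def mismatched_pairs(unit_xml, attrs=("type", "subType")):
    """List (sc id, attribute, sc value, ec value) where <sc> and its <ec> disagree.

    An <ec> is paired with its <sc> through startRef; an attribute missing on
    one side only is reported as a mismatch with value None.
    """
    root = ET.fromstring(unit_xml)
    starts = {sc.get("id"): sc for sc in root.iter("sc") if sc.get("id")}
    problems = []
    for ec in root.iter("ec"):
        sc = starts.get(ec.get("startRef"))
        if sc is None:
            continue  # isolated or unresolved <ec>: out of scope for this sketch
        for name in attrs:
            if sc.get(name) != ec.get(name):
                problems.append((sc.get("id"), name, sc.get(name), ec.get(name)))
    return problems

# Hypothetical unit: the <ec> omits type and subType.
UNIT = """
<unit id="u1">
  <segment><source>A paragraph where <sc id="sc1" type="fmt" subType="xlf:u"/>the formatted text</source></segment>
  <segment><source>ends here.<ec startRef="sc1"/></source></segment>
</unit>
"""

for problem in mismatched_pairs(UNIT):
    print(problem)
```

Whether a missing attribute on <ec/> should count as a mismatch or be inherited from <sc/> is exactly the open question, so the sketch simply reports it.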