OASIS Mailing List Archives

 



xliff message



Subject: RE: [QUAR] RE: [xliff] RE: Inline attributes and canCopy


Hum… It’s true that we do not define what ‘corresponding’ means, and there is actually no explicit PR or constraint saying which attributes, if any, should be the same when an inline code in the source and one in the target have the same id value.

And because there is currently no PR or constraint, Lynx does not check any 'correspondence'. So it lets something like <pc id='1' canCopy='yes'> in the source and <pc id='1' canCopy='no'> in the target pass.

I think you are right Ryan: when a source and a target code have the same id a validator should check that their attributes are the same, but it's probably more complex. E.g. dataRef values could be different but the data they point to should be the same, etc. Also, a source <pc> may be mapped to a target <sc>/<ec> and therefore some attributes may be different. So it's not a trivial task to validate this correspondence. And without PR/constraints it'll be harder to get validators to do it the same way.
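
A minimal sketch of such a check, in Python with the standard xml.etree.ElementTree module. It is only an illustration under stated assumptions: the fragment carries no namespace, the list of attributes compared (COMPARED) is a hypothetical choice because the specification has no PR naming them, dataRef is deliberately left out, attribute defaults are not resolved, and a <pc> mapped to <sc>/<ec> is skipped rather than handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical choice: the specification does not say which attributes must match.
COMPARED = ("canCopy", "canDelete", "canReorder")
INLINE = {"pc", "sc", "ph"}

def inline_codes(parent):
    """Map id -> element for the inline codes found under <source> or <target>."""
    return {el.get("id"): el for el in parent.iter() if el.tag in INLINE}

def mismatches(segment_xml):
    """Return (id, attribute, source value, target value) for each disagreement."""
    segment = ET.fromstring(segment_xml)
    source = inline_codes(segment.find("source"))
    target = inline_codes(segment.find("target"))
    found = []
    for code_id in sorted(source.keys() & target.keys()):
        src, trg = source[code_id], target[code_id]
        if src.tag != trg.tag:
            continue  # e.g. <pc> mapped to <sc>/<ec>: not handled in this sketch
        for name in COMPARED:
            # Missing attributes compare as None; defaults are not resolved here.
            if src.get(name) != trg.get(name):
                found.append((code_id, name, src.get(name), trg.get(name)))
    return found

SEGMENT = """<segment>
  <source><pc id='1' canCopy='yes'>text</pc></source>
  <target><pc id='1' canCopy='no'>texte</pc></target>
</segment>"""

print(mismatches(SEGMENT))  # [('1', 'canCopy', 'yes', 'no')]
```

The example segment is the canCopy case that Lynx currently lets pass; the sketch reports it as one mismatch.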

-ys


From: Ryan King [mailto:ryanki@microsoft.com] 
Sent: Friday, June 19, 2015 5:43 AM
To: Ryan King; Yves Savourel; 'Estreen, Fredrik'; xliff@lists.oasis-open.org
Subject: [QUAR] RE: [xliff] RE: Inline attributes and canCopy

Sorry, did it again! Fat fingers, small phone! So... I think my confusion is around the definition of "corresponding". I was under the impression it meant if all attributes are the same between the tags, then the ids must also be the same. So validation would check all other attributes first, and if they were the same, enforce id to be the same. However, Yves' highlighted text below leads me to believe that "corresponding" means if the ids are the same, all other attributes must also be the same. So validation would check ids first, and if they were the same, enforce that all other attributes were the same. Is that correct?
________________________________________
From: Ryan King
Sent: 6/18/2015 8:39 PM
To: Yves Savourel; 'Estreen, Fredrik'; xliff@lists.oasis-open.org
Subject: RE: [xliff] RE: Inline attributes and canCopy
I think my confusion is around the definition of "corresponding".    I was under the impression it meant if all attributes are the same between the tags, then the ids must also be the same. So validation would check all other attributes first, and if they were the same, enforce id to be the same. However, Yves' highlighted text below leads me to believe that "corresponding" means if the ids are the same, all other attributes must also be the same. I'm
________________________________________
From: Yves Savourel
Sent: 6/18/2015 7:56 PM
To: Ryan King; 'Estreen, Fredrik'; xliff@lists.oasis-open.org
Subject: RE: [xliff] RE: Inline attributes and canCopy
Hi Ryan,
 
Here is my take. But others should check too.
 
One note: some attributes (e.g. dataRef*) can have different values between source and target.
Valid  Source       Target                  Comment
Yes    (none)       <pc Id=1 x>             OK (added target code)
Yes    <pc Id=1 x>  (none)                  OK (if canDelete='yes' for <pc id=1>)
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 x>  OK (added target code)
No     <pc Id=1 x>  <pc Id=1 y>             Correct: This is not valid
No     <pc Id=1 x>  <pc Id=1 y><pc Id=2 x>  Correct: This is not valid because some attributes in <pc id=1> must be the same in source and target
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 y>  OK
No     <pc Id=1 x>  <pc Id=2 x>             No is wrong: This is valid. If canDelete='yes' for <pc id=1> then in the target you can delete it and add a new code that has the same attributes (except for id). But it will be seen as an added code. It is probably not something one wants to do, but it is difficult to prevent it.
Yes    <pc Id=1 x>  <pc Id=2 y>             OK (if canDelete='yes' for <pc id=1>)
No     <pc Id=1 x>  <pc Id=1 x><pc Id=1 x>  Correct: This is not valid
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 x>  OK
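
The rows above, as Yves reads them, can be sketched as a small Python check. This is a simplification, not a validator: each code is reduced to a hypothetical (id, attrs) pair where attrs is the 'x' or 'y' of the table, the dataRef exception is ignored, and a single can_delete flag stands in for the canDelete value of every source code.

```python
def is_valid(source, target, can_delete=True):
    """source/target: lists of (id, attrs) pairs; attrs is the 'x' or 'y' of the table."""
    target_ids = [code_id for code_id, _ in target]
    if len(target_ids) != len(set(target_ids)):
        return False  # an id is used twice in the target
    target_attrs = dict(target)
    for code_id, attrs in source:
        if code_id not in target_attrs:
            if not can_delete:
                return False  # a source code was removed although canDelete='no'
        elif target_attrs[code_id] != attrs:
            return False  # same id, different attributes
    return True  # target-only ids are simply added codes

# (source, target, expected validity according to Yves' comments)
ROWS = [
    ([], [(1, "x")], True),
    ([(1, "x")], [], True),
    ([(1, "x")], [(1, "x"), (2, "x")], True),
    ([(1, "x")], [(1, "y")], False),
    ([(1, "x")], [(1, "y"), (2, "x")], False),
    ([(1, "x")], [(1, "x"), (2, "y")], True),
    ([(1, "x")], [(2, "x")], True),   # the row where Ryan's "No" is corrected
    ([(1, "x")], [(2, "y")], True),
    ([(1, "x")], [(1, "x"), (1, "x")], False),
    ([(1, "x")], [(1, "x"), (2, "x")], True),
]

for source, target, expected in ROWS:
    assert is_valid(source, target) == expected
```

With can_delete=False the rows that drop <pc id=1> from the target become invalid, which matches the "if canDelete='yes'" condition in the comments.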
 
Cheers,
-yves
 
 
From: Ryan King [mailto:ryanki@microsoft.com] 
Sent: Friday, June 19, 2015 2:09 AM
To: Yves Savourel; Estreen, Fredrik (Fredrik.Estreen@lionbridge.com); 'xliff@lists.oasis-open.org'
Subject: RE: [xliff] RE: Inline attributes and canCopy
 
Hi Yves, 
 
Please tell me if I have the correct understanding then.
 
Valid  Source       Target
Yes    (none)       <pc Id=1 x>
Yes    <pc Id=1 x>  (none)
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 x>
No     <pc Id=1 x>  <pc Id=1 y>
No     <pc Id=1 x>  <pc Id=1 y><pc Id=2 x>
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 y>
No     <pc Id=1 x>  <pc Id=2 x>
Yes    <pc Id=1 x>  <pc Id=2 y>
No     <pc Id=1 x>  <pc Id=1 x><pc Id=1 x>
Yes    <pc Id=1 x>  <pc Id=1 x><pc Id=2 x>
 
x and y stand for all attributes other than id, where x and y are different values.
 
Thanks,
Ryan
 
From: Yves Savourel [mailto:ysavourel@enlaso.com] 
Sent: Tuesday, June 16, 2015 8:26 PM
To: Ryan King; 'Estreen, Fredrik'; xliff@lists.oasis-open.org
Subject: RE: [xliff] RE: Inline attributes and canCopy
 
Hi Ryan,
 
In your example (and in all cases really), you know the relationship between source and target inline codes by their id values: A code with id=’1’ in the source corresponds to the code with id=’1’ in the target.
 
See the id definition in the specification:
• When used in <segment>, <ignorable>, <mrk>, <sm>, <pc>, <sc>, <ec>, or <ph> elements: 
o The inline elements enclosed by a <target> element MUST use the duplicate id values of their corresponding inline elements enclosed within the sibling <source> element if and only if those corresponding elements exist.
o Except for the above exception, the value MUST be unique among all of the above within the enclosing <unit> element.
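
One possible reading of these two constraints, sketched in Python with xml.etree.ElementTree. The function name id_problems and the message texts are hypothetical, the fragment is assumed to carry no namespace, and "corresponding" is taken to mean simply "an in-scope element with the same id inside the sibling <source>".

```python
import xml.etree.ElementTree as ET
from collections import Counter

ID_SCOPE = {"segment", "ignorable", "mrk", "sm", "pc", "sc", "ec", "ph"}

def scoped_ids(parent):
    """Ids of in-scope elements below parent (empty list if parent is missing)."""
    if parent is None:
        return []
    return [el.get("id") for el in parent.iter()
            if el.tag in ID_SCOPE and el.get("id")]

def id_problems(unit_xml):
    unit = ET.fromstring(unit_xml)
    containers = [c for c in unit if c.tag in ("segment", "ignorable")]
    base = Counter()  # every id in scope except those inside <target>
    for c in containers:
        if c.get("id"):
            base[c.get("id")] += 1
        base.update(scoped_ids(c.find("source")))
    problems = [f"duplicate id '{i}' outside <target>"
                for i, n in sorted(base.items()) if n > 1]
    in_targets = Counter()
    for c in containers:
        source_ids = set(scoped_ids(c.find("source")))
        for i in scoped_ids(c.find("target")):
            in_targets[i] += 1
            # A target id may repeat only an id from the sibling <source>.
            if i in base and i not in source_ids:
                problems.append(f"target id '{i}' reuses a non-corresponding id")
    problems += [f"duplicate id '{i}' inside <target>"
                 for i, n in sorted(in_targets.items()) if n > 1]
    return problems

UNIT = """<unit id="1">
<segment>
  <source>Äter <pc id="1">katter möss</pc>?</source>
  <target>Do <pc id="1">cats</pc> eat <pc id="2" copyOf="1">mice</pc>?</target>
</segment>
</unit>"""

print(id_problems(UNIT))  # []
```

The sketch only looks at id values; it does not try to decide whether two codes with the same id really represent the same thing.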
 
Also, I’m not sure I understand the following text in your old message:
 
“Bottom line seems to be that to satisfy both constraints, we just need to make sure arbitrarily that one of the ids in target match source each time we process and validate, since we won’t really know which was the original, because our tools can’t use copyOf, because we always carry <originalData>.”
 
The “arbitrarily” part seems wrong: You should always know which source code corresponds to which target code because they have identical ids. And copyOf is to be used when a new code is introduced in the target and has no associated originalData; in that case copyOf points to an existing code for which the merger knows how to get the original data.
 
Cheers,
-yves
 
 
 
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Wednesday, June 17, 2015 12:36 AM
To: Estreen, Fredrik (Fredrik.Estreen@lionbridge.com); 'xliff@lists.oasis-open.org'
Subject: [xliff] RE: Inline attributes and canCopy
 
Hi Frederik, all, 
 
I’m resurrecting this thread. It seems to me that the only way to tell if a target <pc> corresponds to source <pc> is to make sure their attributes, with the exception of id, are identical, then validate the constraint. However, as Frederik mentions below, this constraint is to help mergers when <originalData> is not present so that they know which <pc> tags correspond to which original codes (which may be stored outside of the xliff). BUT if I have:
 
xml
<p><b><i>text</i></b></p>
 
xlf 
<source><pc id="1"><pc id="2">text</pc></pc></source>
<target><pc id="?">text</pc></target>
 
With no other attributes in <pc> other than id, how do I know which id to match in source? Is the tag in target <b> or <i>? How can I apply the constraint without knowing the original data?
 
Thanks,
Ryan
 
 
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Thursday, January 29, 2015 9:02 PM
To: Estreen, Fredrik; 'xliff@lists.oasis-open.org'
Subject: [xliff] RE: Inline attributes and canCopy
 
Thanks for the detailed explanation, Frederik! Somehow I missed the copyOf processing requirement. With your rationale, as long as we have <originalData>, we don’t really need ids for merge, which is our case. Additionally, we decode inline tags back to native codes when we store data in our TMS, so we perform normalization for matching, etc. in a non-XLIFF dependent way.
 
Bottom line seems to be that to satisfy both constraints, we just need to make sure arbitrarily that one of the ids in target match source each time we process and validate, since we won’t really know which was the original, because our tools can’t use copyOf, because we always carry <originalData>.
 
Thanks for the help,
Ryan
 
From: Estreen, Fredrik [mailto:Fredrik.Estreen@lionbridge.com] 
Sent: Thursday, January 29, 2015 6:28 PM
To: Ryan King; 'xliff@lists.oasis-open.org'
Subject: RE: Inline attributes and canCopy
 
Hi Ryan,
 
I interpret the specification the same way as you do with respect to the IDs and agree with your added sentence clarifying it.
 
The rule that an inline element that represents the exact same element in both source and target uses the same ID in both locations is there to facilitate merge to native format for agents that do not put native data in the XLIFF document. Without it an agent would not be able to detect reordering or addition of codes. Safe substitution of tags in matches, in systems that allow that, also needs this. And it also enables the storage of inline elements in TMs without the actual native code. Storing the native code in the TM offers more options for validation and match transformation so it may be a good thing anyway.
 
“copyOf” is not really optional. It is required if the copied inline element does not have associated original data.
 
In “4.7.2.4.1 Duplicating an existing code”:
Processing Requirements
• Modifiers MUST NOT clone a code that has its canCopy attribute is set to no.
• The copyOf attribute MUST be used when, and only when, the base code has no associated original data.
This requirement makes sure that a merger can always know what an inline element in target means as long as it knows what the meaning of the inline codes in source is. The expectation is that a merger not storing original data would be able to learn the meaning of the source inline elements at merge time through some method external to XLIFF (database, original file, etc.).
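
A rough sketch of how a checker might test these two processing requirements, in Python with xml.etree.ElementTree. The assumptions are mine: a code "has associated original data" when it carries a dataRef, dataRefStart or dataRefEnd attribute, the fragment has no namespace, and copyof_problems is a hypothetical name. The "when" half of the second requirement (a copy made without copyOf) cannot be detected from the file alone and is not covered.

```python
import xml.etree.ElementTree as ET

# Assumption: these attributes signal that a code has associated original data.
DATA_REFS = ("dataRef", "dataRefStart", "dataRefEnd")

def copyof_problems(segment_xml):
    """List violations found on target codes that carry a copyOf attribute."""
    segment = ET.fromstring(segment_xml)
    by_id = {el.get("id"): el for el in segment.iter() if el.get("id")}
    problems = []
    for code in segment.find("target").iter():
        base_id = code.get("copyOf")
        if base_id is None:
            continue
        base = by_id.get(base_id)
        if base is None:
            problems.append(f"code '{code.get('id')}': base '{base_id}' not found")
        elif base.get("canCopy") == "no":
            problems.append(f"code '{code.get('id')}': base '{base_id}' has canCopy='no'")
        elif any(base.get(name) for name in DATA_REFS):
            problems.append(f"code '{code.get('id')}': base '{base_id}' has original data")
    return problems

GOOD = """<segment>
  <source>Äter <pc id="1">katter möss</pc>?</source>
  <target>Do <pc id="1">cats</pc> eat <pc id="2" copyOf="1">mice</pc>?</target>
</segment>"""

print(copyof_problems(GOOD))  # []
```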
 
If the inline elements have original data associated, a comparison of that data will allow re-associating the copies with the originals, at least to the degree needed by mergers.
 
Following the rules of inline IDs and  copyOf also allows more tag substitution to happen in matches. I personally believe that use of “copyOf” even for codes that have original data will allow a little bit more known safe substitutions than relying on comparison of original data. Unfortunately we don’t allow that behavior. The case where it makes a difference is if you have two identical inline codes in source and three in target. Which of the source ones is the third target one a copy of? But in most situations that will not be important to know, I have so far only found one TM related situation where it would help. Making it always required would also solve your use case.
 
The only XLIFF solution to the problem you present is the solution you include. Perform processing at modification time to make sure that the PRs and co-constraints are met. Or operate on a slightly modified model internally where you require the use of copyOf regardless of whether the code has native data or not. Then have an export / cleanup step that uses copyOf to make sure that one tag has the source ID and finally removes “copyOf” information that is in violation of the spec.
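
The export / cleanup step could look roughly like the following Python sketch (xml.etree.ElementTree). Everything here is an illustration under assumptions: cleanup_for_export is a hypothetical name, the internal model is assumed to put copyOf on every copy and to have already copied the dataRef* attributes onto the copies, and "has native data" is taken to mean that the source code carries a dataRef, dataRefStart or dataRefEnd attribute.

```python
import xml.etree.ElementTree as ET

DATA_REFS = ("dataRef", "dataRefStart", "dataRefEnd")

def cleanup_for_export(segment):
    """Make one target code carry each source id, then drop disallowed copyOf."""
    source_codes = {el.get("id"): el
                    for el in segment.find("source").iter() if el.get("id")}
    target = segment.find("target")
    target_ids = {el.get("id") for el in target.iter() if el.get("id")}
    for code in target.iter():
        base_id = code.get("copyOf")
        if base_id not in source_codes:
            continue  # not a copy of a source code (or no copyOf at all)
        base = source_codes[base_id]
        if base_id not in target_ids:
            # No target code carries the source id any more: promote this copy.
            target_ids.discard(code.get("id"))
            code.set("id", base_id)
            target_ids.add(base_id)
            del code.attrib["copyOf"]
        elif any(base.get(name) for name in DATA_REFS):
            # The base has original data, so copyOf must not be used.
            del code.attrib["copyOf"]
    return segment

SEGMENT = ET.fromstring("""<segment>
  <source><pc id="1" dataRefStart="d1" dataRefEnd="d2">a</pc></source>
  <target><pc id="2" copyOf="1" dataRefStart="d1" dataRefEnd="d2">b</pc>
    <pc id="3" copyOf="1" dataRefStart="d1" dataRefEnd="d2">c</pc></target>
</segment>""")

cleaned = cleanup_for_export(SEGMENT)
print([(c.get("id"), c.get("copyOf")) for c in cleaned.find("target")])
# [('1', None), ('3', None)]
```

Which copy gets promoted to the source id is arbitrary here (the first one in document order).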
 
Regards,
Fredrik Estreen
 
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: 30 January 2015 09:18
To: 'xliff@lists.oasis-open.org'
Subject: [xliff] Inline attributes and canCopy
 
Hi TC, we’ve run into a dilemma and require some expert guidance. The XLIFF 2.0 spec says this about Id:
•         When used in <segment>, <ignorable>, <mrk>, <sm>, <pc>, <sc>, <ec>, or <ph> elements:
o    The inline elements enclosed by a <target> element MUST use the duplicate id values of their corresponding inline elements enclosed within the sibling <source> element if and only if those corresponding elements exist.
o    Except for the above exception, the value MUST be unique among all of the above within the enclosing <unit> element.
So, when an inline element is copied, I might get this example from the spec:
 
<unit id="1">
<segment>
  <source>Äter <pc id="1">katter möss</pc>?</source>
  <target>Do <pc id="1">cats</pc> eat <pc id="2" copyOf="1">mice</pc>?
  </target>
</segment>
</unit>
 
In order for this to meet the above constraint, the intended meaning is probably something like:
 
•         inline elements enclosed by a <target> element MUST use the duplicate id values of their corresponding inline elements enclosed within the sibling <source> element if and only if those corresponding elements exist. *Copies of inline elements are not considered to be corresponding to the original elements enclosed within the sibling <source> element and do not need to have the same id.*
 
And if that is true, when we validate the constraint to make sure the *original* source and target inline element ids match, how do we know which one is the original one if copyOf is not required and they can be reordered? If I just rely on making sure at least one of the Ids matches regardless of position, what happens if I delete the original elements in <target> and add back in two new ones? Do I have to make sure at least one of the elements has an id that matches? It seems like a lot of processing just to satisfy the constraint.
 
What is the logical reason for this constraint?
 
Thanks,
Ryan 
 


