OASIS Mailing List Archives

xliff message



Subject: Constraints for start and end markers and codes was: Re: [xliff] RE: Resolution proposal/Call for Dissent Re: [xliff] mrk translate outside the content but in scope


Thanks Yves,

I agree that this has no bearing on the solution for the translation state set by markers, which is why I changed the subject.
During the discussions over the last few days, we figured that the constraints on sc/ec order and presence in the same unit are, and should be, to a certain extent analogous to those on sm and em, which is why I included them in this solution proposal.

Reviewing the constraints set on sc and ec, I figured that these constraints do ensure the usage of id on orphaned codes (this is what I suggested as part of the solution and the proposed solution is not otherwise affected by this). They however do NOT enforce the usage of startRef on end codes that are not isolated.
I feel that this is a deficiency that needs to be fixed.

I believe that while both id and startRef are OPTIONAL on ec (because of the schema expressivity limitations), the intent is that exactly one of them is REQUIRED to be set. Which one is governed by the value of isolated.

So I believe that the ec constraints should be changed like this:

Current Constraints:

  • The attribute isolated MUST be set to yes when the <sc> element corresponding to this end code is not in the same <unit> and set to no otherwise.

  • When the attribute isolated is set to yes, the attribute id MUST be used instead of the attribute startRef.



To be replaced with these proposed constraints:

  • The attribute isolated MUST be set to yes when the <sc> element corresponding to this end code is not in the same <unit>.

[The attribute does not need to be *set* to "no" otherwise, because "no" is the default value. The same is true for the sc constraint.]

  • Exactly one of the attributes id or startRef MUST be specified:
    • The startRef attribute is REQUIRED if and only if the value of the attribute isolated is "no".
    • The id attribute is REQUIRED if and only if the value of isolated is "yes".
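
The proposed constraints can be sketched as a small check. This is only an illustration under stated assumptions, not spec text or an existing tool: the function name check_ec and the idea of passing the attributes of one ec element as a plain dict are hypothetical.

```python
def check_ec(attrs):
    """Return a list of violations of the proposed <ec> constraints.

    `attrs` is a dict of the attributes present on one <ec> element
    (a hypothetical input shape chosen for this sketch).
    """
    # "no" is the default value, so a missing attribute counts as "no".
    isolated = attrs.get("isolated", "no")
    has_id = "id" in attrs
    has_start_ref = "startRef" in attrs

    errors = []
    # Exactly one of id or startRef MUST be specified.
    if has_id == has_start_ref:
        errors.append("exactly one of id or startRef must be specified")
    # id is REQUIRED if and only if isolated is "yes".
    if isolated == "yes" and not has_id:
        errors.append("id is required when isolated is 'yes'")
    # startRef is REQUIRED if and only if isolated is "no".
    if isolated == "no" and not has_start_ref:
        errors.append("startRef is required when isolated is 'no'")
    return errors
```

For example, check_ec({"startRef": "1"}) and check_ec({"id": "1", "isolated": "yes"}) return no violations, while check_ec({"id": "1"}) is flagged because isolated defaults to "no" and startRef is missing.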


Rgds
dF

Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734


On Wed, Nov 6, 2013 at 2:17 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
Hi David,

I haven’t looked at the whole proposal yet, but here is a first comment:

> 2.3. For <ec/> only, id is REQUIRED if the corresponding <sc/> does not exist
> or is outside of the enclosing <unit>

I'm not sure why this item would be needed with regard to the re-segmentation/translate issue. But in any case: It is already a
constraint in the <ec> definition: "When the attribute isolated is set to yes, the attribute id MUST be used instead of the
attribute startRef."
So there is no need to introduce anything new about this.

Cheers,
-ys


From: Dr. David Filip [mailto:David.Filip@ul.ie]
Sent: Wednesday, November 6, 2013 6:47 AM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Resolution proposal/Call for Dissent Re: [xliff] mrk translate outside the content but in scope

Hi all, I'd like to suggest a resolution for this one.

This is based on this thread, the discussion that we had in the TC, and the discussion among Fredrik, Yves, and myself that continued
after the meeting lost quorum.

Although the above described algorithm is unambiguous we figured that it can be simplified by dropping the translate attribute from
segment.

So the last structural default is set on unit and the <sm/>/<em/> overrides are more naturally understood if you do not need to
worry about segment defaults.

Please note that both types of markers (well formed <mrk></mrk> spans or <sm/>/<em/> pairs) will still inherit their translate
default from unit XOR its parent <mrk>. These defaults will, however, be overridden by the lowermost enclosing <sm/>/<em/> pair that
has the value set explicitly. As with the above described algorithm, we are still looking for several implementation examples for
the algorithm, but these should be considerably simpler now.
Segmentation will have no impact on translatability because neither segments nor ignorables will carry translatability metadata.
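
One way to read the resolution described above is sketched below. It is an interpretation for illustration only, not an implementation example agreed by the TC. The event tuples, the function name resolve_translate, and the flattening of a whole unit's content (segments and ignorables together) into one event list are assumptions made for this sketch. An "open" event stands for either a <mrk> start tag or an <sm/>, and a "close" event for a </mrk> or an <em/>.

```python
def resolve_translate(events, unit_translate=True):
    """Resolve the translate state of each text run in one unit.

    events: list of tuples (hypothetical shape):
        ("text", string)
        ("open", marker_id, translate)  # translate is True, False, or None if not set
        ("close", marker_id)
    Returns a list of (string, translatable) pairs.
    """
    active = []   # markers currently in scope, in opening order
    result = []
    for ev in events:
        kind = ev[0]
        if kind == "open":
            _, marker_id, translate = ev
            active.append((marker_id, translate))
        elif kind == "close":
            _, marker_id = ev
            # Remove the matching marker by id; <sm/>/<em/> spans may
            # overlap other spans, so it need not be the last one opened.
            for i in range(len(active) - 1, -1, -1):
                if active[i][0] == marker_id:
                    del active[i]
                    break
        else:  # "text"
            # Start from the unit default, then let the innermost
            # enclosing marker with an explicit value override it.
            state = unit_translate
            for _, translate in reversed(active):
                if translate is not None:
                    state = translate
                    break
            result.append((ev[1], state))
    return result
```

Because the state is computed over the whole unit, where the segment boundaries fall does not change the result, which matches the point that segmentation has no impact on translatability.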

There is another side effect: dropping translate from segment will make the complex re-segmentation PRs less so, which is a good
thing. This also disposes of Yves's comment complaining about dropping translatability info on resegmentation due to the current
resegmentation PR wrt translate on segment.
https://lists.oasis-open.org/archives/xliff/201310/msg00079.html

*This is a relatively major change, still I will first try a call for dissent and will only propose a ballot in case dissent is
expressed by next Monday.*

The proposal:

1. Drop translate from segment (-> drop re-segmentation PRs re translate)
   1. Marker defaults will be inherited from unit instead of segment
2. Add constraints for <sm/>/<em/> and <sc/>/<ec/>
   1. For both, end marker or end code must come logically after its start marker or start code.
      1. Warning: logical order in target can be affected by order attributes within the enclosing unit
   2. For <sm/> and <em/> only, the corresponding marker MUST be within the same unit.
   3. For <ec/> only, id is REQUIRED if the corresponding <sc/> does not exist or is outside of the enclosing <unit>
3. Make the note in the translation annotation section a standard part of the normative text and change "segment" to "unit" like
   this:
   Current text:
   Note
   This annotation overrides the translate attribute set at the <segment> level.

   New text:
   This annotation has precedence over the translate attribute values inherited from the <unit> level.
End of the proposal
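
Items 2.1 and 2.2 of the proposal could be checked roughly as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, only document order is considered (the order attribute on target content, mentioned in the warning, is deliberately ignored), and source and target content are checked separately because marker ids repeat between them.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def check_markers(unit):
    """Check <sm/>/<em/> pairing inside one <unit> element.

    Returns a list of error strings; an empty list means every <em/>
    follows its <sm/> and both sit in the same unit.
    """
    errors = []
    for side in ("source", "target"):
        open_ids = {}  # ids of <sm/> seen but not yet closed, in order
        # Walk all <source> (or <target>) elements of the unit in
        # document order, across segments and ignorables.
        for container in unit.iter():
            if local_name(container.tag) != side:
                continue
            for el in container.iter():
                name = local_name(el.tag)
                if name == "sm":
                    open_ids[el.get("id")] = True
                elif name == "em":
                    ref = el.get("startRef")
                    if ref in open_ids:
                        del open_ids[ref]
                    else:
                        errors.append(
                            f"{side}: em startRef={ref!r} has no preceding sm in this unit"
                        )
        for sm_id in open_ids:
            errors.append(f"{side}: sm id={sm_id!r} has no em in this unit")
    return errors
```

A span that starts in one segment and ends in a later segment of the same unit passes this check. An <em/> that precedes its <sm/> is reported twice, once for the unmatched <em/> and once for the <sm/> that is never closed.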

If this bunch of changes is agreed on, I can implement it in the spec and ask for review of the detailed implementation.

Algorithm implementation examples can be added later, even after the third review, because they are not normative. The examples
would be placed in the translation annotation section. I just do not want to tie these changes to fleshing out the algorithm
implementation examples for timing reasons.

Best regards and please engage in this thread by the end of this week
dF 


Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
http://www.cngl.ie/profile/?i=452
mailto: david.filip@ul.ie

On Tue, Nov 5, 2013 at 12:12 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
Hi David, all,

> possibly adding a constraint enforcing the <em/>
> to come logically after their <sm/>.
Possibly. That may be the case for <sc> and <ec> too.


>> I put the question marks because it wasn't clear
>> to me what was the result of your algorithm. Now I know.
>
> Good, and is it the same as yours?
I didn't have a set expectation.


>>  - Set the bottom of the stack to the translate
>>    value of the part:
>>    - For a segment: the resolved translate value of
>>      the segment
>>    - For an ignorable: the resolved translate value
>>      of the parent unit.
>
> I don't think so, ignorable is not translatable no matter what,
> it can just hold pieces of annotations that can flip parts
> outside of itself.
When computing the state of translate in a segment one must take both the segment and the ignorable elements into account as
part of a whole, otherwise you simply cannot end up with the correct result.

Note also that ignorable element contains whatever the Extractors or Modifiers deem appropriate (whitespace and inline codes most of
the time). While its content is not expected to be translatable text, it may need to be adapted in some cases, for example to remove
whitespace between segments in some languages, or to adjust whitespace when re-ordering the target.


>> Technically I don't think this is enforced by a constraint or a PR.
>
> I think we do not have this, I propose to add this to the <sm/>
> and <em/> descriptions.
> I think we should say for <sm/> that it MUST have a closing <em/>
> within the enclosing unit (and vice versa)
> AND
> that <em/> must come logically after its corresponding <sm/>
I tend to agree.
But others really need to look at all this and have opinions.


> I'd also add a warning that the order attribute has to be considered
> in target for determining the logical order.
Possibly. But let's just make sure we don't add repeated warnings, notes, PRs all over the place. As a general guideline we should
be concise. In my experience, overwhelming the reader with repeated information often leads to confusion and makes things look more
complex than they are. The Segmentation Modification section is an example of that. I'll get to that at some point.

Cheers,
-yves




---------------------------------------------------------------------
To unsubscribe from this mail list, you must leave the OASIS TC that
generates this mail.  Follow this link to all your TCs in OASIS at:
https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php


