OASIS Mailing List Archives

 



xliff message



Subject: RE: XMLness aspect (RE: [xliff] Opinion: importance of preserving XML markup and nodes)


Woops - to add to my already confusing way of talking, I referred to Yves' <seg> aspect, when I should have said <sub>. I fixed the note below (replacing <seg> with [<sub>])

-----Original Message-----
From: bryan.s.schnabel@tektronix.com [mailto:bryan.s.schnabel@tektronix.com]
Sent: Thursday, April 02, 2009 12:57 PM
To: ysavourel@translate.com; xliff@lists.oasis-open.org
Cc: arle@lisa.org
Subject: [xliff] XMLness aspect (RE: [xliff] Opinion: importance of preserving XML markup and nodes)


Hi Yves,

I will address the XMLness aspect of your reply, and leave the question of whether [<sub>] is necessary to others who are smarter than me about that part of the spec.

I'm happy that you support preserving XMLness.  I guess that was my primary issue.  I feel we should present that as the primary best practice to support nearly all XML markup.  Part of my discomfort with the TMX proposal is that not only does it not rank this approach as the first best practice, it fails to offer it as an example at all.

And I acknowledge your point about needing to accommodate the wiki point 3.3 (http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=show&redirect=Requirements#head-0162491596e263001714e9d13e951603df5ac8a4):

Text in <b>bold. This too</b>.

XLIFF 1.2
seg1 = Text in <bx id='1'/>bold.
seg2 = This too<ex id='1'/>.

TMX 2.0
seg1 = Text in <itag x='1' pos='start' type='bold'/>bold.
seg2 = This too<itag x='1' pos='end' type='bold'/>.
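
Once a pair is split across segments like this, nothing in the XML itself ties the start marker to its end marker any more. As a rough illustration only, here is a minimal Python sketch of a pairing check over the TMX 2.0 style segments above. The function name unmatched_itags is hypothetical, and the sketch assumes the pos and x attributes behave as shown in the example; it is not taken from any TMX or XLIFF tool.

```python
import xml.etree.ElementTree as ET


def unmatched_itags(segments):
    """Return the x values whose start/end markers do not pair up across segments."""
    open_ids = []   # x values seen with pos='start' and not yet closed
    problems = []   # x values seen with pos='end' but never opened
    for seg in segments:
        # Wrap each segment so that it parses as a single XML element.
        root = ET.fromstring(f"<seg>{seg}</seg>")
        for itag in root.iter("itag"):
            pos, x = itag.get("pos"), itag.get("x")
            if pos == "start":
                open_ids.append(x)
            elif pos == "end":
                if x in open_ids:
                    open_ids.remove(x)
                else:
                    problems.append(x)
    return problems + open_ids


tmx_segments = [
    "Text in <itag x='1' pos='start' type='bold'/>bold.",
    "This too<itag x='1' pos='end' type='bold'/>.",
]

print(unmatched_itags(tmx_segments))      # both segments together
print(unmatched_itags(tmx_segments[:1]))  # first segment alone
```

With both segments the list comes back empty; with only the first segment the x value '1' is reported as left open.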

And I also begrudgingly acknowledge your second point about needing to accommodate the wiki point 3.4 (http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=show&redirect=Requirements#head-f473da5eae7f79e9ae89bc739831d3ad8010c8f7):

Text in <startB/>bold and in <startI/>italic<endB/>.<endI/>

Even though we see this less and less (thankfully), and even though you (or maybe I) could make the point that this yucky example should be fixed at the source, I agree that in the real world we must offer a work-around to handle yucky-ness.
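
To make the difference between the 3.3 and 3.4 cases concrete, here is a small Python sketch that checks whether each fragment is well-formed XML. It uses only the standard library; the helper name is_well_formed and the variant written with real overlapping <b>/<i> elements are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


# Requirement 3.3 content: a balanced pair.
balanced = "Text in <b>bold. This too</b>."
# Requirement 3.4 content written with real, overlapping elements (illustrative).
overlapping = "Text in <b>bold and in <i>italic</b>.</i>"
# The same content written with empty start/end markers, as in the example above.
markers = "Text in <startB/>bold and in <startI/>italic<endB/>.<endI/>"

for name, text in [("balanced", balanced), ("overlapping", overlapping), ("markers", markers)]:
    print(name, is_well_formed(text))
```

Only the overlapping variant is rejected by the parser, which is why that case needs empty markers rather than paired elements.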

But here is my plea.  Let's try to rank our suggested best practices. Let's say that the best scenario is that you have the good fortune to have well-formed XML as your source, and the best practice for handling that is to process your TMX and XLIFF in a responsible XML way:

Primary Best Practice
> <p>
>   <b>XML</b> is a general-purpose  <i>specification</i>
>   for creating custom markup languages.
> </p>

> <seg>
>   <itag type="b" x="1">XML</itag> is a general-purpose
>   <itag type="i" x="2">specification</itag>
>   for creating custom markup languages.
> </seg>
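
A minimal Python sketch of that mapping, under the assumption that the paragraph contains only one level of inline elements: each inline child becomes an <itag> whose type is the original element name and whose x is a running counter. The function name paragraph_to_seg is hypothetical, and the source whitespace is normalized onto one line for the example.

```python
import xml.etree.ElementTree as ET


def paragraph_to_seg(paragraph_xml: str) -> str:
    """Map each direct inline child of the paragraph to an <itag> that wraps its text."""
    p = ET.fromstring(paragraph_xml)
    seg = ET.Element("seg")
    seg.text = p.text
    for index, child in enumerate(p, start=1):
        # Assumption: inline elements are not nested, so only text and tail are copied.
        itag = ET.SubElement(seg, "itag", {"type": child.tag, "x": str(index)})
        itag.text = child.text
        itag.tail = child.tail
    return ET.tostring(seg, encoding="unicode")


source = (
    "<p><b>XML</b> is a general-purpose <i>specification</i> "
    "for creating custom markup languages.</p>"
)
print(paragraph_to_seg(source))
```

Because the inline content stays inside the <itag> element, the result can still be handled with ordinary XML tools.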

But if your segmentation spans two units, and divides an inline element, do it this way:

Secondary Best Practice

> <p>
>   <b>XML</b> is a general-purpose  <i>specification. Use XML</i>
>   for creating custom markup languages.
> </p>

> <seg>
>   <itag type="b" x="1">XML</itag>
>   is a general-purpose
>   <itag pos="start" x="2" type="i" />specification.
> </seg>
> <seg>
>   Use XML<itag pos="end" x="2" />
>   for creating custom markup languages.
> </seg>
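
As a sketch of how the split inline element above could be produced, the following Python function cuts the text of one inline element at a segment boundary and emits an empty start marker for the first half and an empty end marker for the second. The name split_inline and the rule "break after the first period followed by a space" are assumptions for illustration; real segmentation rules are more involved.

```python
from xml.sax.saxutils import escape, quoteattr


def split_inline(tag_type: str, x: int, content: str, split_at: int):
    """Split one inline element's text into a start-marked and an end-marked fragment."""
    head = content[:split_at].rstrip()
    tail = content[split_at:].lstrip()
    x_attr = quoteattr(str(x))
    start = f"<itag pos=\"start\" x={x_attr} type={quoteattr(tag_type)} />{escape(head)}"
    end = f"{escape(tail)}<itag pos=\"end\" x={x_attr} />"
    return start, end


content = "specification. Use XML"
cut = content.index(". ") + 2  # hypothetical rule: break after ". "
start, end = split_inline("i", 2, content, cut)

print(start)
print(end)
```

Both fragments are well-formed on their own, and the shared x value is what lets a later step pair them up again.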

And if you have to deal with overlapping or malformed tags, use this workaround . . .

I did not address your very interesting idea of "There is another solution that would use the same element for both cases, but differently" because I think it pertains more to the [<sub>] part of your reply.

Thanks,

Bryan


-----Original Message-----
From: Yves Savourel [mailto:ysavourel@translate.com]
Sent: Wednesday, April 01, 2009 6:03 PM
To: Schnabel, Bryan S; xliff@lists.oasis-open.org
Cc: arle@lisa.org
Subject: RE: [xliff] Opinion: importance of preserving XML markup and nodes


Hi Bryan, all,

> I really think each of them is bad.
> I think the following is a much better way (and, at least on
> the XLIFF side, I will advocate strongly for this):
> <seg>
>  <itag type="b" x="1">XML</itag> is a general-purpose
>  <itag type="i" x="1">specification</itag>
>  for creating custom markup languages.
> </seg>

I would tend to agree that preserving the XMLness is important.

However, there are cases where this cannot be done for various reasons: For example a segment break separating an opening tag from a
closing tag, or some format where codes are denoted by starting and ending marks possibly overlapping rather than by balanced codes.
This is illustrated by requirements 3.3 and 3.4 here:

http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=show&redirect=Requirements#head-0162491596e263001714e9d13e951603df5ac8a4

http://wiki.oasis-open.org/xliff/OneContentModel/Requirements?action=show&redirect=Requirements#head-f473da5eae7f79e9ae89bc739831d3ad8010c8f7

So, I think we do need a way to cater for those cases. And it just cannot be done using a structure that is XML-friendly.

I think Rodolfo's current <itag> is a unique solution that can handle both cases with one construct. This simplifies the set of tags
to use by treating the nice case (balanced pair) as a specific case of the nasty one (broken pair).

There is another solution that would use the same element for both cases, but differently. But it requires the code to be either not
stored, or stored in an attribute of <itag> rather than as the content of <itag>. If you do this, then the content of <itag> becomes
free to be used to represent the balanced-pair case, and with an empty content it can be used to represent the broken-pair case.

"[startB]Bold.[endB]"

<seg><itag x='1' bc='[startB]' ec='[endB]'>Bold.</itag></seg>

<seg><itag x='1' bc='[startB]'/>Bold</seg>
<seg><itag x='2' ec='[endB]'/>.</seg>
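
A minimal Python sketch of this attribute-based variant, assuming the bc and ec attribute names used in the example above. The helper names balanced_itag and broken_pair are hypothetical. The sketch also reuses a single x value for both halves of a broken pair, which is an assumption; the example above numbers the two halves 1 and 2.

```python
import xml.etree.ElementTree as ET


def balanced_itag(x, begin_code, end_code, text):
    """Balanced pair: both native codes live in attributes, the text is the content."""
    itag = ET.Element("itag", {"x": str(x), "bc": begin_code, "ec": end_code})
    itag.text = text
    return itag


def broken_pair(x, begin_code, end_code):
    """Broken pair: two empty elements, one carrying bc and the other carrying ec."""
    start = ET.Element("itag", {"x": str(x), "bc": begin_code})
    end = ET.Element("itag", {"x": str(x), "ec": end_code})
    return start, end


# Balanced case: "[startB]Bold.[endB]" in one segment.
balanced = ET.Element("seg")
balanced.append(balanced_itag(1, "[startB]", "[endB]", "Bold."))

# Broken case: the segment break falls between "Bold" and ".".
first, second = ET.Element("seg"), ET.Element("seg")
start, end = broken_pair(1, "[startB]", "[endB]")
start.tail = "Bold"
end.tail = "."
first.append(start)
second.append(end)

for seg in (balanced, first, second):
    print(ET.tostring(seg, encoding="unicode"))
```

The same element name serves both cases; only the presence of content and of the bc/ec attributes differs.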

Storing the code in an attribute has one important side effect: You cannot use <sub> anymore, because you cannot put a <sub> element
inside an attribute.
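
This limitation follows from XML well-formedness rules, which forbid a literal "<" inside an attribute value. A small Python sketch shows the effect with the standard library parser; the bc attribute is taken from the example above, and the <sub> content is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Literal markup inside an attribute value is rejected by the parser.
try:
    ET.fromstring("<itag x='1' bc='<sub>text</sub>'/>")
    literal_ok = True
except ET.ParseError:
    literal_ok = False

# Escaping the markup makes the document well-formed again, but the <sub>
# is then only a string in the attribute, not an element a tool can process.
escaped = ET.fromstring(f"<itag x='1' bc={quoteattr('<sub>text</sub>')}/>")

print(literal_ok)          # whether the unescaped form parsed
print(escaped.get("bc"))   # the attribute value as plain text
print(len(escaped))        # number of child elements
```

So a code stored in an attribute can carry the characters of a <sub>, but not the element itself.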

Which leads to my earlier question about whether <sub> was that important?

...and I now realize that the email apparently never made it to the mailing list... I guess I'll have to re-write it. But the bottom
line was: do we need <sub>? I think Arle had some concern about removing it because some people at OSCAR were considering it
important. Could we have some of their technical reasons for keeping it?

Cheers,
-yves












