

Subject: RE: [xliff] Revised Committee Draft XLIFF 1.2 Spec


Hi all,

 

Regarding point 5) below, I would like to clarify that the segmentation markup is entirely optional and does not affect how the files function as XLIFF files. The filter will ignore the segment boundaries when converting back to the native format, so the entire content of the <target> element will be used, including content that appears outside of segments.

The segmentation markup is intended for tools such as translation memories to optimize recycling of previously translated content, and as such it often makes more sense to not include spaces between sentences in the segments (though that is entirely up to the tools). This does not in any way prevent users from editing the content between the segments, removing spaces or adding additional ones as needed. In fact it is fully allowed (and may frequently happen) that only part of the translated content in the <target> element is part of segments.
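
To illustrate the back-conversion behaviour described above, here is a minimal Python sketch. It is not taken from the specification or from any actual filter. It assumes a simplified <target> fragment without the XLIFF namespace, and the function name merged_target_text and the sample sentences are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment: two segments with a space
# between them that is NOT part of either segment.
TARGET_XML = (
    '<target>'
    '<mrk mtype="seg">First sentence.</mrk>'
    ' '
    '<mrk mtype="seg">Second sentence.</mrk>'
    '</target>'
)


def merged_target_text(target_xml: str) -> str:
    """Return all text inside <target>, ignoring <mrk> boundaries."""
    target = ET.fromstring(target_xml)
    # itertext() walks the element text, the text of each child and the
    # text that follows each child, so content outside segments is kept.
    return "".join(target.itertext())


print(merged_target_text(TARGET_XML))  # First sentence. Second sentence.
```

Because the segment boundaries are simply skipped, a space that a translator adds or removes between the <mrk> elements would carry through to the merged text unchanged.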

 

The example in point 5) attempts to explicitly show this by not including the space characters in the segments (which normally gives better re-use when applying a TM). Perhaps we need to clarify the text in the specification to make that more explicit?

 

Cheers,

Magnus

 


From: Tony Jewtushenko [mailto:tony.jewtushenko@productinnovator.com]
Sent: Tuesday, May 09, 2006 8:41 PM
To: xliff@lists.oasis-open.org
Subject: [xliff] Revised Committee Draft XLIFF 1.2 Spec

 

I've revised the spec to address the issues raised by Rodolfo, although three issues remain open as they require additional input from the TC. I've also corrected a couple of minor problems with the spec HTML in order to certify the spec as HTML 4.01 compliant. In addition, the spec includes a number of minor corrections to typos that were kindly reported by Asgeir Frimannsson.

 

Asgeir also flagged a minor issue with the XSDs that will require a revision, so the spec distribution package will need to be reissued once fixed documents are produced.

 

Below are the original emails and actions taken.  I would appreciate feedback on the remaining open issues as we need to close these issues down by the end of the week in order to complete any required revisions before Tuesday's TC teleconference.

 

Regards,

Tony

 

-----Original Message-----
From: Rodolfo M. Raya [mailto:rmraya@heartsome.net]
Sent: 08 May 2006 23:38
To: Tony Jewtushenko
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] Re: XLIFF 1.2 Committee Draft Spec

 

Hi All,

Some details that need attention:

TJ: DONE> 1) In section 3.2.5 the description of the <mrk> element mentions the optional attribute equiv-text, but the attribute is not included in the list of optional attributes for that element.

TJ: DONE> 2) The example in section 2.5.2 is supposed to illustrate how to use attributes from XHTML. The name space declared in the listing corresponds to HTML 4.0 (which is not XML). The URI used to identify XHTML namespace is "http://www.w3.org/1999/xhtml"

TJ: LEFT AS IS – requires more discussion > 3) <seg-source> is a new element born with old bad habits. It has an optional "ts" attribute that is deprecated. I don't see a reason for adding deprecated stuff to something new.

TJ: LEFT AS IS – requires more discussion > 4) <bin-target> has "restype" and "resname" attributes; <bin-source> doesn't. Those attributes are optional in <bin-unit> and I don't see a reason to override in the target something that is not required in the source. Keeping the attributes at <bin-unit> level should be enough, but if for some reason it isn't, the same reasons should also be valid for adding the attributes to <bin-source>.

TJ: LEFT AS IS – requires more discussion > 5) The first example in section 2.9 (Segmentation) is:

<source>Richard stepped out of the kitchen hut. He noticed a movement from the corner of his eye. A monkey had climbed on top of one of the workshop sheds, trying to get in by the ventilation shaft.</source>
<seg-source>
 <mrk mtype="seg">Richard stepped out of the kitchen hut.</mrk>
 <mrk mtype="seg">He noticed a movement from the corner of his eye.</mrk>
 <mrk mtype="seg">A monkey had climbed on top of one of the workshop sheds, trying to get in by the ventilation shaft.</mrk>
</seg-source>


and the text below it says:

In the example above the space characters between the sentences are not included inside the <mrk> elements.

Those spaces are relevant at translation time. IMHO, they should be included in the <mrk> elements.

Given the new segmentation, CAT tools should offer the text enclosed in <mrk> elements to the translator instead of the content from <source>. If translators don't see the spaces, they can't correct them. Two special cases come to my mind:

  • When translating from English to Chinese, the spaces between sentences should be removed. If a filter removed those spaces automatically, the reverse conversion filter may try to add them back also automatically. The result will be wrong.
  • When translating to French, the spaces between sentences may need to be doubled. In Canadian French or French from France (I don't remember which one at the moment) it is considered good style to put two spaces between sentences. If an automatic filter adds only one, then the output may not look as nice as the translator expects.

FWIW, I've seen the problems described above happen in real life.

TJ: DONE> 6) The last item in this list is a minor typographical error. In the definition of the <group> element there are two periods at the end of "within the same <file>.."
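
For point 5) above, the following Python sketch shows where the spaces between sentences end up when they are left outside the <mrk> elements. It is only an illustration under stated assumptions: the fragment is shortened to two sentences, written on one line with a single space between the segments, and has no XLIFF namespace. The function name split_segments and the labels "segment" and "outside" are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Shortened, namespace-free version of the example in point 5).
SEG_SOURCE_XML = (
    '<seg-source>'
    '<mrk mtype="seg">Richard stepped out of the kitchen hut.</mrk>'
    ' '
    '<mrk mtype="seg">He noticed a movement from the corner of his eye.</mrk>'
    '</seg-source>'
)


def split_segments(seg_source_xml: str) -> List[Tuple[str, str]]:
    """List the content of <seg-source> as (kind, text) pairs in document order."""
    root = ET.fromstring(seg_source_xml)
    parts: List[Tuple[str, str]] = []
    if root.text:
        # Text before the first child element lies outside any segment.
        parts.append(("outside", root.text))
    for child in root:
        is_segment = child.tag == "mrk" and child.get("mtype") == "seg"
        parts.append(("segment" if is_segment else "outside",
                      "".join(child.itertext())))
        if child.tail:
            # Text following a child element lies outside that element.
            parts.append(("outside", child.tail))
    return parts


for kind, text in split_segments(SEG_SOURCE_XML):
    print(kind, repr(text))
```

A tool that presents only the "segment" entries to the translator would not show the "outside" entry holding the space, which is the situation point 5) describes.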

It's time to take a break now.

Best regards,
Rodolfo

 

 

-----Original Message-----
From: Asgeir Frimannsson [mailto:asgeirf@gmail.com]
Sent: 09 May 2006 07:49
To: Tony Jewtushenko
Subject: XLIFF spec spelling & links

 

html:

 

TJ: DONE> Do a search for 'draft-xliff-core-1.2-20060427.html'. Many (8) anchors still point to this document.

 

Style/spelling etc:

 

TJ: DONE> 2.5.1. Adding Elements
(Second Paragraph):
Several non-XLIFF element can be used... (element->elements)

 

TJ: DONE> 2.6. Embedding XLIFF
If necessary an XLIFF document, or parts of a document can be embedded within another XML document. (comma missing somewhere?)

 

TJ: DONE> 2.9 Segmentation
(5th-ish paragraph)
This occures when either the start or end... (occures->occurs)

 

TJ: DONE> Details for <note>
pertains specifically to the to the <source> or the <target> element. (to the to the)

 

TJ: DONE> Details for <target>
The restype attribute has been deprecated in XLIFF 1.2, since <target> will allways be of the same restype as its parent <trans-unit> or <alt-trans>. (allways -> always)

 

TJ: DONE> Details for <bin-unit>
Lists of values for the restype attribute is provided by this specification. (Lists of values -> A list of values)

 

TJ: DONE> 3.2.4. Inline Elements
They enclose or replace any formatting or control codes that is not text, (is -> are or codes->code)

 

TJ: DONE> Details for <bx/>
A list of values for the ctype attribute is provided by this specification.The optional equiv-text attribute specifies text to substitute in place of the inline tag. (space after full stop missing)

 

TJ: DONE> 3.2.5. Delimiter Element
This element is usually not generated by the extraction module and are (and are -> and is)

 

TJ: DONE> Details for <mrk>
For example, to indicate to a Machine Translation tool proper names that should not be translated; for terminology verification, to mark suspect expressions after a grammar checking. (consistent use of semi-colons? maybe replace comma after 'verification')

 

TJ: DONE> 3.3. Attributes
Along with some of the attributes are the list of their possible values. (are->is)

 

TJ: DONE> Details for equiv-text
It is useful for inserting whitespace or other content in place of markup to facilitate consistant word counting. The equiv-text attribute is also useful for ensuring consistant round trip (2x consistant ->consistent)

 

TJ: DONE> Details for reformat
This value indicate (indicate->indicates) (2x)

 

TJ: DONE> C. Changes Since Previous Version (Non-Normative)
Revised version number from 1.1 tol 1.2. (tol -> to)
Clarified <group>, <trans-unit> and <bin-unit> to explicity specify id attribute is unique. (explicity -> explicitly)

 

 

 

 

cheers,

asgeir

 

 

 
