

Subject: RE: 2.0 Binary Data Module Proposal


Thanks, Fredrik, for your suggestions on a binary module. Since Microsoft is both a content provider and a tool implementer dealing with huge amounts of data of various types, this module is very important to our business model. SharePoint’s support for file-level binaries only scratches the surface of how the module could be used. We want to take it to the next level in 2.0 so that we can provide all possible content in XLIFF to our suppliers and provide tools (and allow suppliers to provide tools) that properly consume binary data, which makes up a good portion of our content. Take the following source and target dialogs, for example (also attached):

 

 

We need to carry all of the information needed to recreate the localized dialog, not just textual data. You’ll see here that not only have two strings been localized, but the dialog size and two controls contained in the dialog have also been localized (resized in this case): the label for “Please select a configuration…” and the drop-down box associated with it. Additionally, we might want to carry around a screenshot as reference for the translator. So, here is an example of how that XLIFF might look with a binary module:

 

<?xml version="1.0" encoding="UTF-8"?>

<xliff version="2.0" srcLang="en-US" tgtLang="de-DE" xmlns:bin="urn:oasis:names:tc:xliff:binary:2.0">

  <file id="158" original="example.exe">

    <!-- external binary reference -->

    <bin:binary id="0" mime-type="image/jpeg">

      <bin:source href="" />

      <bin:target href="" />

    </bin:binary>

    <group id="158">

      <unit id="158" name="5" state="initial">

        <segment id="158">

          <source>Load Registry Config</source>

          <target>Load Registry Config</target>

        </segment>

        <!-- target text was not translated, but dialog size was increased -->

        <bin:binary id="158" state="translated" mime-type="windows-resource-dialog">

          <bin:source form="base64"><![CDATA[AQABAP//AAAAAAAAAADAAMiAAAAAAAAAugBEAAgAAAAAAAICAAAAAAACAgAAAAAAHAAAAE0AUwAgAFMAYQBuAHMAIABTAGUAcgBpAGYAAAAAAAAA]]></bin:source>

          <bin:target form="base64"><![CDATA[AQABAP//AAAAAAAAAADAAMiAAAAAAAAA0gBEAAgAAAAAAAICAAAAAAACAgAAAAAAHAAAAE0AUwAgAFMAYQBuAHMAIABTAGUAcgBpAGYAAAAAAAAA]]></bin:target>

        </bin:binary>

      </unit>

      <group id="1">

        <unit id="1" name="128;WIN_DLG_CTRL_" state="initial">

          <segment id="1">

            <source>OK</source>

            <target>OK</target>

          </segment>

          <!-- Neither target text nor control size were localized -->

          <bin:binary id="1" state="initial" mime-type="windows-resource-control-button">

            <bin:source form="base64"><![CDATA[AQAAAAAAAAAAAAEAAVBIAC8AMgAOAAIAAAAAAAABgAAAAA==]]></bin:source>

            <bin:target form="base64"><![CDATA[AQAAAAAAAAAAAAEAAVBIAC8AMgAOAAIAAAAAAAABgAAAAA==]]></bin:target>

          </bin:binary>

        </unit>

      </group>

      <group id="2">

        <unit id="2" name="128;WIN_DLG_CTRL_" state="translated">

          <segment id="2">

            <source>Cancel</source>

            <target>!!!Cancel!!!</target>

          </segment>

          <!-- target text was translated, but control size was not increased -->

          <bin:binary id="2" state="initial" mime-type="windows-resource-control-button">

            <bin:source form="base64"><![CDATA[AQAAAAAAAAAAAAAAAVCBAC8AMgAOAAMAAAAAAAABgAAAAA==]]></bin:source>

            <bin:target form="base64"><![CDATA[AQAAAAAAAAAAAAAAAVCBAC8AMgAOAAMAAAAAAAABgAAAAA==]]></bin:target>

          </bin:binary>

        </unit>

      </group>

      <group id="1060">

        <unit id="1060" name="130;WIN_DLG_CTRL_" state="translated">

          <segment id="1060">

            <source>Please select a configuration to load from the Registry</source>

            <target>!!!!!!Please select a configuration to load from the Registry:!!!!!!</target>

          </segment>

          <!-- both target text and control size were localized -->

          <bin:binary id="1060" state="translated" mime-type="windows-resource-control-static-text">

            <bin:source form="base64"><![CDATA[AQAAAAAAAAAAAAAAAlAHAAcArAAIAAAAAAAAAAABggAAAA==]]></bin:source>

            <bin:target form="base64"><![CDATA[AQAAAAAAAAAAAAAAAlAOAAcAxAAIAAAAAAAAAAABggAAAA==]]></bin:target>

          </bin:binary>

        </unit>

      </group>

      <group id="1021">

        <unit id="1021" name="133;WIN_DLG_CTRL_" state="initial">

          <segment id="1021">

            <source form="base64"><![CDATA[]]></source>

            <target form="base64"><![CDATA[]]></target>

          </segment>

          <!-- target text was not translated, but control size was increased -->

          <bin:binary id="1021" state="translated" mime-type="windows-resource-control-combo-box">

            <bin:source form="base64"><![CDATA[AQAAAAAAAAAAAAMBIVAJABkAqgCfAAEAAAAAAAABhQAAAA==]]></bin:source>

            <bin:target form="base64"><![CDATA[AQAAAAAAAAAAAAMBIVAJABkAxACfAAEAAAAAAAABhQAAAA==]]></bin:target>

          </bin:binary>

        </unit>

      </group>

    </group>

  </file>

</xliff>

 

One thing in particular to note: because dialog controls are often hierarchical, it is important to represent them as such in XLIFF, so the above example shows a <group> containing both <unit> and <group> elements as siblings. The top-level group <group id="158"> contains a unit <unit id="158">, which holds the dialog title and binary data, but <group id="158"> also contains other groups, which hold the dialog control text and binary data. We’re not 100% sure, but we believe it would require a change to the spec to allow both a <unit> and a <group> to be defined as siblings under another group.
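
Stripped of its payloads, the nesting described above can be sketched as follows. This is only an outline of the example in this mail, with the base64 content left out and a single control group shown; the attribute names simply follow that example and are not taken from a published schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Outline only: mirrors the example in this mail, payloads omitted -->
<xliff version="2.0" srcLang="en-US" tgtLang="de-DE"
       xmlns:bin="urn:oasis:names:tc:xliff:binary:2.0">
  <file id="158" original="example.exe">
    <!-- top-level group for the dialog -->
    <group id="158">
      <!-- unit carrying the dialog title and the dialog's binary data -->
      <unit id="158" name="5">
        <segment id="158">
          <source>Load Registry Config</source>
          <target>Load Registry Config</target>
        </segment>
        <bin:binary id="158" mime-type="windows-resource-dialog"/>
      </unit>
      <!-- a group as a sibling of the unit above, holding one control -->
      <group id="1">
        <unit id="1" name="128;WIN_DLG_CTRL_">
          <segment id="1">
            <source>OK</source>
            <target>OK</target>
          </segment>
          <bin:binary id="1" mime-type="windows-resource-control-button"/>
        </unit>
      </group>
    </group>
  </file>
</xliff>
```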

 

Thanks,

Ryan

 

From: Estreen, Fredrik [mailto:Fredrik.Estreen@lionbridge.com]
Sent: Thursday, November 29, 2012 4:32 AM
To: Ryan King; xliff@lists.oasis-open.org
Subject: RE: 2.0 Binary Data Module Proposal

 

Hi Ryan, All,

 

The ballot for this as a feature was already passed, but I’d still like to make some comments and proposals on the implementation.

 

I personally do not believe that binary data in XLIFF is a good idea, but I do respect the decision of the majority. My concern is that this reverses the expectations that I feel are core to the XLIFF spirit. In my opinion the core idea is that XLIFF should enable tools from multiple creators to facilitate translation of content regardless of what tool was used to create the XLIFF file or what format the actual source document has. This is, at its most basic level, achieved by an initial tool that understands the source format transforming it or extracting translatable content into an XLIFF file. The file can then be further processed (usually translated) by other tools independent of the initial tool and source format. At the end of the processing chain the file is returned to the initial tool (or a closely related tool), which creates a localized version of the source file. By storing the source file in binary format within the XLIFF document, the model is turned around. Now you have an initial tool that has no knowledge of the source format and depends on source format knowledge in the processing chain to get any meaningful work done. This use case would be better served by a translation package format, leaving XLIFF to the actual translation of text.

 

Regarding the concrete proposal, I have a few ideas on how to improve it. Binary data will in most cases not be suitable for direct processing by translators; instead it will need a separate extraction step before translation. So I think it would be good to make it simple to leave the binary portions out of the file for parts of the processing.

 

If the <bin-unit> is changed to a <bin-file> and made a sibling of <file>, it would be a long step in that direction. Different <file>s in an XLIFF document are largely independent, and merging and splitting documents at this level is common. Having the binary data as units, possibly mixed into the sequence of text units, would probably make it ambiguous whether they can safely be removed while leaving the content valid. In addition, an empty <bin-unit> or other placeholder would have to be left so that the content can easily be re-inserted in the right place. Making it a sibling of <file> would also keep the binary data out of the path of other modules such as validation and string length restrictions.
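
A rough sketch of what this could look like follows. The <bin-file> element, its attributes, and the reuse of <bin-source>/<internal-file> from 1.2 are assumptions made for illustration, not agreed syntax; the ids are made up, and the payload is the base64 sample from the original proposal.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical syntax: <bin-file> is a proposal, not part of any XLIFF schema -->
<xliff version="2.0" srcLang="en-US" tgtLang="de-DE">
  <!-- text content stays in an ordinary <file> -->
  <file id="f1" original="example.exe">
    <unit id="1">
      <segment id="1">
        <source>OK</source>
        <target>OK</target>
      </segment>
    </unit>
  </file>
  <!-- binary content is a sibling of <file>, so it can be split off or
       dropped without touching the sequence of text units -->
  <bin-file id="b1" mime-type="text/plain">
    <bin-source>
      <internal-file form="base64"><![CDATA[VGhpcyBpcyBhIHRlc3Qu]]></internal-file>
    </bin-source>
  </bin-file>
</xliff>
```

With this layout, a tool that does not handle binaries could pass along only the <file> elements and nothing inside them would need a placeholder.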

 

Storing units smaller than files as binary data would make interoperability even harder, so I do not think adding <bin-file> in addition to <bin-unit> would be a good idea. I’d prefer just <bin-file>. If the textual / markup portion of the data refers to external binary data, some form of reference mechanism might be useful, but I do not see this as a requirement. If it is added, I think it would be good to have the reference point from the <bin-file> to the <unit> it is associated with instead of the other way around. This means that tools that do not make use of binaries would never encounter the reference directly.
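
As an illustration only, such a reference could be carried on the binary element itself. The attribute names file-ref and unit-ref below are invented for this sketch, as are the ids they point at (a <unit id="1"> inside a <file id="f1">); nothing here is agreed syntax.

```xml
<!-- Hypothetical: the reference lives on <bin-file>, so the <unit> it points
     to carries no trace of the binary; "file-ref" and "unit-ref" are made-up
     attribute names -->
<bin-file id="b1" mime-type="text/plain" file-ref="f1" unit-ref="1">
  <bin-source>
    <internal-file form="base64"><![CDATA[VGhpcyBpcyBhIHRlc3Qu]]></internal-file>
  </bin-source>
</bin-file>
```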

 

Besides a mime type, I think an original filename would be very helpful, as the file extension is still the most common way to differentiate between formats when dealing with files. I anticipate that most implementations would save the contents of the binary node into a file and process it in one or more separate steps. No application will, or even could, have a complete mapping of all mime types to extensions.
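
For instance, the filename could sit next to the mime type as an attribute. The attribute name "original" is an assumption here, borrowed from the attribute of the same name on <file>, and the value is the filename from the SharePoint sample URL in the original proposal.

```xml
<!-- Hypothetical: "original" records the source filename so a tool can pick
     an extension when it saves the payload to disk -->
<bin-file id="b1" mime-type="text/plain" original="fr-fr-Documents-0002.txt">
  <bin-source>
    <internal-file form="base64"><![CDATA[VGhpcyBpcyBhIHRlc3Qu]]></internal-file>
  </bin-source>
</bin-file>
```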

 

Regards,

Fredrik Estreen

 

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: 16 November 2012 01:02
To: xliff@lists.oasis-open.org
Subject: [xliff] 2.0 Binary Data Module Proposal

 

In anticipation of closing down on 2.0, we have two new proposals for modules. In this mail, we are proposing the second of the two, a Binary Data module.

 

For those who attended the XLIFF Symposium in Seattle, you were given the opportunity to see a *real world* implementation of the <bin-unit> element in SharePoint using XLIFF 1.2. Bryan Schnabel even advocated giving the SharePoint team an award :-). Since there is no equivalent of the <bin-unit> in 2.0, this proposal is to add a Binary Data Module.

 

We think that the 2.0 implementation could be essentially the same as 1.2, with just the elements and attributes used *for now*, so that we get it on the 2.0 radar. We may want to propose additional features after we conduct some reviews with the SharePoint team over the next couple of weeks to get their feedback on any improvements they would like to see.

 

Here is SharePoint’s 1.2 implementation:

 

      <bin-unit id="fab82e10-02f0-4325-8390-bb10ec086bcc" mime-type="text/plain">

        <bin-source>

          <external-file href="http://sphvm-33449/sites/pub/Translation%20Packages/redmond_makoscum/fr-fr-Documents-20121115T0733550000Z-0/fr-fr-Documents-0002.txt" />

        </bin-source>

        <bin-target>

          <external-file href="http://sphvm-33449/sites/pub/Translation%20Packages/redmond_makoscum/fr-fr-Documents-20121115T0733550000Z-0/fr-fr-Documents-0002.txt" />

        </bin-target>

      </bin-unit>

 

 

      <bin-unit id="fab82e10-02f0-4325-8390-bb10ec086bcc" mime-type="text/plain">

        <bin-source>

          <internal-file form="base64"><![CDATA[VGhpcyBpcyBhIHRlc3Qu]]></internal-file>

        </bin-source>

        <bin-target>

          <internal-file form="base64"><![CDATA[VGhpcyBpcyBhIHRlc3Qu]]></internal-file>

        </bin-target>

      </bin-unit>

 

Please let us know your opinions on the proposal.

 

Thanks,

Microsoft Corporation

(Ryan King, Kevin O'Donnell, Uwe Stahlschmidt, Alan Michael)

 

 

Attachment: dialogs.png
Description: dialogs.png


