Subject: RE: XLIFF API in WebIDL

From: Yves Savourel <ysavourel@enlaso.com>
To: 'Ryan King' <ryanki@microsoft.com>, <xliff@lists.oasis-open.org>
Date: Tue, 24 Jun 2014 05:26:56 -0600

Hi all,

To continue on what Ryan was saying, one note I'd like to make is that the serialization may be a bit flexible:

In some cases we may have two options for the serialization: a first one that matches the object model exactly and can be
mapped seamlessly into it, and a second one that is more object-independent but still easily parsed into any object model
close to the one we would have chosen.

An example is the content of <source> like this one:

<originalData>
 <data id='d1'>&lt;b></data>
 <data id='d2'>&lt;/b></data>
 <data id='d3'>&lt;br></data>
</originalData>
...
<source>Text in <pc id="1" dataRefStart="d1" dataRefEnd="d2">bold</pc> format.<ph id="2" dataRef="d3"/></source>


Imagine the object model we choose is some variation of the option c) I was mentioning before (string with special characters
pointing to the inline objects). That would give us something like the following JSON output, where we have a coded string and a
collection of objects to store the codes' data. (I've simplified the output by omitting any field that is set to its default):

{
   "src":{
      "text":"Text in \uE101\uE110bold\uE102\uE110 format.\uE103\uE110",
      "tags":[
         {
            "kind":"sc",
            "id":"1",
            "sdat":"<b>"
         },
         {
            "kind":"ec",
            "id":"1",
            "sdat":"<\/b>"
         },
         {
            "kind":"ph",
            "id":"2",
            "sdat":"<br>"
         }
      ]
   }
}
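
To make the coded-string idea concrete, here is a minimal Python sketch that walks such a string and splits it back into text runs and tags. It rests on assumptions not spelled out above: each inline code is a two-character pair, the first character (U+E101, U+E102 or U+E103, as read off the example) gives the kind, the second character is treated as opaque, and the tags are stored in order of appearance. The function name split_coded is hypothetical.

```python
import json

# Assumption: marker characters as they appear in the example (sc, ec, ph).
MARKERS = {"\uE101": "sc", "\uE102": "ec", "\uE103": "ph"}

def split_coded(src):
    """Split a coded string into a flat list of ("text", str) / ("tag", dict) parts."""
    parts, buf = [], []
    tags = iter(src["tags"])  # assumption: tags are listed in order of appearance
    text = src["text"]
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in MARKERS:
            if buf:
                parts.append(("text", "".join(buf)))
                buf = []
            tag = next(tags)
            if tag["kind"] != MARKERS[ch]:
                raise ValueError("marker does not match tag kind")
            parts.append(("tag", tag))
            i += 2  # skip the marker and its companion character
        else:
            buf.append(ch)
            i += 1
    if buf:
        parts.append(("text", "".join(buf)))
    return parts

# Data from the example, with the space after "in" kept as in the <source>.
doc = json.loads(r'''
{"src": {"text": "Text in \uE101\uE110bold\uE102\uE110 format.\uE103\uE110",
         "tags": [{"kind": "sc", "id": "1", "sdat": "<b>"},
                  {"kind": "ec", "id": "1", "sdat": "<\/b>"},
                  {"kind": "ph", "id": "2", "sdat": "<br>"}]}}
''')

parts = split_coded(doc["src"])
```

An application built around this model would not need the split at all; it is what everyone else would have to write to read the first representation.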

Almost the same data can be represented in a very similar format that abstracts things a bit more: it does not assume how the
relation between the text and the inline codes is implemented, but simply lists the parts in order:

{
   "src":[
      { "text":"Text in " },
      { "tag":
         {
            "kind":"sc",
            "id":"1",
            "sdat":"<b>"
         }
      },
      { "text":"bold" },
      { "tag":
         {
            "kind":"ec",
            "id":"1",
            "sdat":"<\/b>"
         }
      },
      { "text":" format." },
      { "tag":
         {
            "kind":"ph",
            "id":"2",
            "sdat":"<br>"
         }
      }
   ]
}

This second representation could easily be fed into any object model for the <source>, and still be mapped without much trouble
into version c).
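
As a rough illustration of that mapping, the Python sketch below takes an ordered list of parts and builds both the coded-string form and a plain string in the original format. It assumes each part is a one-key dictionary holding either a text run or a tag, which is a choice made for this sketch only. The marker characters are read off the coded example, and the second character of each pair is copied from it as a constant, since the message does not say what it encodes. The names to_coded and to_original are hypothetical.

```python
# Assumption: marker characters as they appear in the coded-string example.
MARKER_FOR_KIND = {"sc": "\uE101", "ec": "\uE102", "ph": "\uE103"}
FILLER = "\uE110"  # assumption: second character of each pair, copied as-is

def to_coded(parts):
    """Map an ordered list of parts to the coded-string form (version c)."""
    chunks, tags = [], []
    for part in parts:
        if "text" in part:
            chunks.append(part["text"])
        else:
            tag = part["tag"]
            chunks.append(MARKER_FOR_KIND[tag["kind"]] + FILLER)
            tags.append(tag)
    return {"text": "".join(chunks), "tags": tags}

def to_original(parts):
    """Feed the same parts into a different 'model': the original-format string."""
    return "".join(p["text"] if "text" in p else p["tag"]["sdat"] for p in parts)

parts = [
    {"text": "Text in "},
    {"tag": {"kind": "sc", "id": "1", "sdat": "<b>"}},
    {"text": "bold"},
    {"tag": {"kind": "ec", "id": "1", "sdat": "</b>"}},
    {"text": " format."},
    {"tag": {"kind": "ph", "id": "2", "sdat": "<br>"}},
]

coded = to_coded(parts)
original = to_original(parts)
```

Both consumers are a single loop over the list, which is the "minor coding for everyone" side of the trade-off.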

The point here is that the serialization does not necessarily have to match the OM and API exactly. The advantage is that such a
representation may be usable by more applications, because it does not completely force a specific object model.
The drawback is that it requires some minor coding from everyone, while the first representation requires virtually no coding
from the applications implementing the OM/API but a bit more from everyone else.

In any case, as Ryan noted, serialization can be looked at last.

I'm still working on a basic description for the inline content model. Hopefully I'll post it within a couple of days.

Cheers,
-ys

Follow-Ups:
- RE: XLIFF API in WebIDL
  - From: "Schnabel, Bryan S" <bryan.s.schnabel@tektronix.com>

References:
- RE: XLIFF API in WebIDL
  - From: Yves Savourel <ysavourel@enlaso.com>
- RE: XLIFF API in WebIDL
  - From: Ryan King <ryanki@microsoft.com>