
Subject: Re: [xliff] XLIFF 1.0 issues - Binary elements & <internal-file>


Hi Mark,

From the XLIFF 1.0 Specification, in the Value Description of the
attribute 'form', "The value can be either text (for plain text data),
base64 (for data coded in base64 format), or one of values available
from the [RFC 1341] document: the MIME specification." Thus, we already
allow for those who want to use Base64; we just don't restrict it to
only Base64.

The purpose behind an 'original-format' attribute is to handle
instances where a conversion is made from multiple possible original
formats to Base64. Although this is round-tripping, one may not always
know the target encoding.

As for the size issue: This has been brought up to me in the past as a
problem with XLIFF. One proposal that I've been considering is a
shortening of the names of elements and attributes, sort of an
XLIFF-Lite. However, I haven't had the chance to put anything together.
I have been thinking of mentioning this in the naming issue brought up
by Christian. That, of course, is off this subject. However, size is an
issue we need to consider.

cheers,
john

>>> "Mark Levins" <mark_levins@ie.ibm.com> 4/18/02 7:44:38 AM >>>
Note to group: I'm breaking my original email into its constituent
parts in a bid to simplify the ongoing discussion. As a suggestion for
the future, we should maybe limit proposals/suggestions for changes to
the 1.0 specification to one per e-mail, lessening the risk of some
points being lost and making any future threads easier to follow.

Hi John, Stephen

The CDATA approach fails here since binary data may easily contain the
closing characters for a CDATA section (]]>) within its content, thereby
invalidating the XLIFF (and XML) document.
Perhaps the values for the 'form' attribute should be defined and
fixed by the specification, where one of the options would be 'Base64'.
I don't really see why an 'original-format' attribute would be needed
if the content is encoded: encoding and decoding would be a round trip
with no option to decode to a different format.
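
To illustrate the CDATA point, here is a minimal Python sketch. The
payload bytes and the helper names (wrap_cdata, wrap_base64, parses)
are hypothetical and chosen only for this example; the
form="base64" value on <internal-file> is the one discussed in this
thread. It assumes the raw bytes are mapped to characters one-to-one
(Latin-1) before being placed in the CDATA section.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical binary payload that happens to contain the CDATA
# terminator "]]>" plus the reserved characters "<" and "&".
payload = b"GIF89a ]]> <b> & \xff"


def wrap_cdata(data: bytes) -> str:
    """Place the raw bytes (mapped 1:1 to characters) in a CDATA section."""
    text = data.decode("latin-1")
    return f"<internal-file><![CDATA[{text}]]></internal-file>"


def wrap_base64(data: bytes) -> str:
    """Place the Base64 encoding of the bytes in the element."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'<internal-file form="base64">{encoded}</internal-file>'


def parses(xml_text: str) -> bool:
    """Return True if the text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The CDATA section ends at the first "]]>" inside the payload, so the
# remaining bytes are parsed as markup and the document is rejected.
cdata_ok = parses(wrap_cdata(payload))

# Base64 output uses only letters, digits, "+", "/" and "=", so it can
# never contain markup, and decoding recovers the original bytes.
root = ET.fromstring(wrap_base64(payload))
round_trip = base64.b64decode(root.text)
```

With this payload the CDATA variant is not well-formed, while the
Base64 variant parses and decodes back to the same bytes.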

In answer to Stephen's point on the growing size of an XLIFF document
if an encoding is used, did we not decide that file size was not really
an issue, which is why we are using sensible attribute and element
names rather than some obfuscated shorter approach?

Regards,
Mark

From Stephen Holmes:

On point 3 - bear in mind that localisation/language tools that aspire
to be network-based will find base64 encoded content to be monumentally
large to transfer.  Europe, remember, is still predominantly 56K and we
all remember the hassle involved in FedEx'ing CDs to China - business
reality supersedes specification.
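
To put a rough number on the size concern, the following Python sketch
computes the Base64 expansion and an idealised transfer time. The
function names and the 3,000,000-byte example file are hypothetical,
and the nominal 56 kbit/s rate is an assumption that ignores protocol
overhead, modem compression, and real-world line quality.

```python
import base64


def encoded_size(raw_size: int) -> int:
    """Base64 emits 4 output characters for every 3 input bytes (padded)."""
    return 4 * ((raw_size + 2) // 3)


def transfer_seconds(size_bytes: int, bits_per_second: int = 56_000) -> float:
    """Idealised transfer time at a nominal line rate (an assumption)."""
    return size_bytes * 8 / bits_per_second


# Cross-check the formula against the standard library on a small input.
sample = bytes(300)
assert len(base64.b64encode(sample)) == encoded_size(len(sample))

raw_size = 3_000_000                      # hypothetical binary file
b64_size = encoded_size(raw_size)         # size after Base64 encoding
ratio = b64_size / raw_size               # expansion factor, about 4/3

raw_time = transfer_seconds(raw_size)     # seconds for the raw bytes
b64_time = transfer_seconds(b64_size)     # seconds for the encoded text
```

Under these assumptions the encoded content is about one third larger
than the raw bytes, and the transfer time grows by the same factor.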

From John Reid:

<jr>How does CDATA fail this purpose? I wouldn't want to restrict this
to just Base64, since that would require a conversion for both the
producer and any subsequent processor that may be able to handle the
original format without a problem. Additionally, wouldn't we need an
attribute such as 'original-format' if we forced your conversion?</jr>


Original point:

3. Binary elements & <internal-file>
This is kind of a big one. At the moment the specification does not
define the form of the content of the <internal-file> element (although
there is an optional 'form' attribute). The problem I see with this is
that the specification allows users to place binary data directly as
content - this binary content may contain the reserved XML characters
< > etc., which will cause parsers to choke.
The CDATA section approach is also not good enough to provide a
solution.
My suggestion is that the content of the <internal-file> be restricted
to Base64, or at least stated so.
Also, the description in the spec for the <internal-file> element reads
"The <internal-file> element will contain the data for the skeleton
file.", which is technically wrong; it may also contain data for a
<bin-source> or <bin-target> element.

