OASIS Mailing List Archives

cti message

Subject: RE: i18n (RE: MVP Discussion) - An Updated Proposal

Hi, John,

Thank you for your comments.

As for your major one, 

> Do we need to deal with i18n as a part of the STIX (or CybOX) standard itself? 
> Would it be possible to treat i18n-for-content as a separate standard?

I do not mind either way, as long as STIX and/or CybOX clearly state how they
handle i18n. It can be defined within STIX and/or CybOX themselves, or STIX/CybOX
can reference a separate or external i18n standard that meets the requirements
defined by the Use Cases. The one thing I would not like to see is STIX and/or
CybOX saying nothing about i18n, so that people do not know how to deal with
multiple languages and translations and start handling them in their own ways.

As for your minor one,

I am fine with an ordinary hexadecimal MD5 hash.
I just picked Base64, thinking that many people would want
the translation data to be as small as possible.



-----Original Message-----
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of John Anderson
Sent: Wednesday, April 20, 2016 10:03 PM
To: Mates, Jeffrey CIV DC3/DCCI; Masuoka, Ryusuke/益岡 竜介; cti@lists.oasis-open.org
Subject: [cti] Re: i18n (RE: MVP Discussion) - An Updated Proposal

First, let me say: I am very impressed by the work you've done on this. It's clear and clean.

I have one major comment, then some minor ones.

Major comment:

Do we need to deal with i18n as a part of the STIX (or CybOX) standard itself? Would it be possible to treat i18n-for-content as a separate standard?

Your idea, Ryu, is a very good idea, and it has uses outside our standard. So much data is translatable, but no one has really solved the "i18n for data" problem yet. At least, not in a standardized, reusable way. I think your idea could solve this problem for a much wider group of people.

Imagine that i18n-for-data was a separate standard, using your implementation. Anyone who wished to implement translations could just use it, without any change to the STIX standard. It could be an implementation detail, and would not need to be tied down with any of the STIX overhead.

Minor comments:

I do agree with Jeffrey about the `text_ref` field. As a programmer, I was surprised to see the MD5 value encoded in Base64. It's not necessarily a bad idea, and it does save bytes, but it was surprising. It might be less surprising for most people to use the MD5 as-is, and not encode it in Base64.
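
To make the size trade-off concrete, here is a minimal Python sketch of the two encodings of the same MD5 digest. It assumes the hash is taken over the UTF-8 bytes of the text, which is an assumption for illustration and not something stated in this thread; the function name `text_ref_encodings` is likewise hypothetical.

```python
import base64
import hashlib


def text_ref_encodings(text: str) -> dict:
    """Return the MD5 digest of `text` in hexadecimal and in Base64.

    Assumption: the digest is computed over the UTF-8 encoding of the text.
    """
    digest = hashlib.md5(text.encode("utf-8")).digest()  # 16 raw bytes
    return {
        "hex": digest.hex(),  # 32 characters
        "base64": base64.b64encode(digest).decode("ascii"),  # 24 characters
    }


refs = text_ref_encodings("hello")
print(refs["hex"], len(refs["hex"]))
print(refs["base64"], len(refs["base64"]))
```

Both strings carry the same 16 bytes. Base64 is 8 characters shorter per reference, while the hexadecimal form is what most tools print for MD5 by default.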

Two-character language codes exclude minority languages. Could we use the ISO standard 3-character language codes, please? https://www.loc.gov/standards/iso639-2/php/code_list.php
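
As a small illustration of the gap, the Python sketch below uses a hand-written, hypothetical three-entry table (not a complete registry) mapping ISO 639-2 three-letter codes to ISO 639-1 two-letter codes. Hawaiian (`haw`) is an example of a language that has a three-letter code but no two-letter one.

```python
from typing import Optional

# Hypothetical, deliberately tiny lookup table for illustration only.
# None means the language has no two-letter ISO 639-1 code.
ISO639_2_TO_1 = {
    "eng": "en",   # English
    "jpn": "ja",   # Japanese
    "haw": None,   # Hawaiian: three-letter code only
}


def two_letter_code(code3: str) -> Optional[str]:
    """Return the two-letter code for a three-letter code, or None if there is none."""
    return ISO639_2_TO_1[code3]


for code3 in ("eng", "jpn", "haw"):
    print(code3, "->", two_letter_code(code3))
```

A field restricted to two-letter codes has no way to label the `haw` entry, whereas a three-letter field covers all three.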

Jeffrey points out that original text may contain multiple languages. However, let's not drop the original text language code. It is needed by translators and translation tools, to identify the original source's language (even if it's not a perfect match).

John Anderson

(thread trimmed out of courtesy)