OASIS Mailing List Archives

 



cti message



Subject: RE: [cti] RE: i18n (RE: MVP Discussion)


Hi, Bret, all,

 

Sorry that it took me some time.

I think I have a fairly simple and straightforward solution.

It is Bret's idea, just giving a reference to the text itself,

instead of a reference through the object structure.

I also added one more use case, (7) CTI provider, and added

a description to use case (6) CTI translation service.

 

I know time is limited, but I think that with this new set of

standards we are taking big steps to lay a foundation for the future.

If we aspire to take these CTI standards to ISO eventually,

I believe that i18n is one of the strategic and important elements and

that it should not be handled as an afterthought with ad-hoc patches here and there.

(Especially if it is not a simple fix.)

 

Regards,

 

Ryu

 

----------

Internationalization Rules

-----------

- Always give "text_id" and "lang" for every text field

  (So that anyone can give translations to the field later, knowing

  which language it is in.)

 

- Always give "text_ref", "text_id" and "lang" for every translation

  ("text_id" lets someone base further translations on a text other than the one in the original language.

  Example: A CTI text field is created in Japanese and then given an English translation.

    German and French translations are then produced from the English translation.)

 

- Use the same text_id for the text field if it stays the same over different versions

- Always use a "translations" entry to give a translation

  If there are multiple translations for a text, provide multiple entries
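
The chained-translation rule above (Japanese original, English translation, then German and French based on the English) can be illustrated with a minimal Python sketch. The helper name trace_to_original and the sample entries are hypothetical; only the "text_ref", "text_id", "lang" and "value" fields come from the rules.

```python
# Hypothetical helper: follows "text_ref" links from a translation back to the
# original text. Assumes each translation entry carries its own "text_id".
def trace_to_original(text_id, translations):
    by_id = {t["text_id"]: t for t in translations}
    chain = [text_id]
    seen = {text_id}
    while chain[-1] in by_id:
        parent = by_id[chain[-1]]["text_ref"]
        if parent in seen:
            raise ValueError("cyclic text_ref chain")
        chain.append(parent)
        seen.add(parent)
    return chain


# Sample data (illustrative): the original "text-a1b2c3" is in Japanese,
# the German translation is based on the English one.
translations = [
    {"text_ref": "text-a1b2c3", "lang": "en",
     "text_id": "text-a1b2c3-en-1", "value": "Dridex Campaign - Botnet 121"},
    {"text_ref": "text-a1b2c3-en-1", "lang": "de",
     "text_id": "text-a1b2c3-de-1", "value": "Some German Title"},
]

chain = trace_to_original("text-a1b2c3-de-1", translations)
print(chain)
```

The chain runs from the German entry through the English entry to the original text, which is why every translation needs its own "text_id".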

 

-----

- Pattern A - Translation given inside the same package

{

  "type": "package",

  ...

  "campaigns": [

    {

      "type": "campaign",

      "id": "campaign--a1201df6-c352-4a81-9c7c-5a6f896a4e31",

      "revision": 1,

      "spec_version": "stix-2.0",

      "created_at": "2015-12-03T13:13Z",

      "created_by_ref": "identity--69a17e1b-bb45-4657-9a9d-96db3faccdde",

      "title": {

        "text_id": "text-a1b2c3",

        "lang": "en",

        "value": "Dridex Campaign - Botnet 121"},

      "descriptions": {

        "text_id": "text-d4e5f6",

        "lang": "en",

        "value": "Dridex-based campaign leveraging Botnet 121"},

      "intended_effects": [

        {"value": "theft-identity-theft"}

      ],

      "status": "Ongoing"

    }

  ],

  "translations": [

    {

      "text_ref": "text-a1b2c3",

      "lang": "ja",

      "text_id": "text-a1b2c3-ja-1",

      "value": "Dridex キャンペーン - ボットネット 121"

    },

    {

      "text_ref": "text-d4e5f6",

      "lang": "ja",

      "text_id": "text-d4e5f6-ja-1",

      "value": "ボットネット 121 を活用する Dridex を元にしたキャンペーン"

    },

    {

      "text_ref": "text-a1b2c3",

      "lang": "de",

      "text_id": "text-a1b2c3-de-1",

      "value": "Some German Title"

    },

    {

      "text_ref": "text-d4e5f6",

      "lang": "de",

      "text_id": "text-d4e5f6-de-1",

      "value": "Some German Description"

    }

  ]

  ...

}

-----

- Pattern B - Translation given in some external sources

{

  "translations": [

  {

    "text_ref": "text-a1b2c3",

    "lang": "es",

    "text_id": "text-a1b2c3-es-1",

    "value": "Some Spanish Title"

  },

  {

    "text_ref": "text-d4e5f6",

    "lang": "es",

    "text_id": "text-d4e5f6-es-1",

    "value": "Some Spanish Description"

  },

  {

    "text_ref": "text-a1b2c3",

    "lang": "fr",

    "text_id": "text-a1b2c3-fr-1",

    "value": "Some French Title"

  },

  {

    "text_ref": "text-d4e5f6",

    "lang": "fr", 

    "text_id": "text-d4e5f6-fr-1",

    "value": "Some French Description"

  }

  ]

}
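
A consumer could handle Pattern A and Pattern B the same way, since both carry the same "translations" entries. The following is a minimal Python sketch under that assumption; index_translations and localize are hypothetical names, not part of any proposed standard.

```python
# Hypothetical helpers: merge in-package (Pattern A) and external (Pattern B)
# "translations" arrays into one index keyed by (text_ref, lang).
def index_translations(*translation_lists):
    index = {}
    for entries in translation_lists:
        for entry in entries:
            key = (entry["text_ref"], entry["lang"])
            index.setdefault(key, []).append(entry)
    return index


def localize(text_field, index, lang):
    """Return the value in `lang` if one is known, else the original value."""
    if text_field["lang"] == lang:
        return text_field["value"]
    candidates = index.get((text_field["text_id"], lang))
    return candidates[0]["value"] if candidates else text_field["value"]


title = {"text_id": "text-a1b2c3", "lang": "en",
         "value": "Dridex Campaign - Botnet 121"}
in_package = [
    {"text_ref": "text-a1b2c3", "lang": "ja", "text_id": "text-a1b2c3-ja-1",
     "value": "Dridex キャンペーン - ボットネット 121"},
]
external = [
    {"text_ref": "text-a1b2c3", "lang": "es", "text_id": "text-a1b2c3-es-1",
     "value": "Some Spanish Title"},
]

index = index_translations(in_package, external)
print(localize(title, index, "ja"))
print(localize(title, index, "es"))
print(localize(title, index, "fr"))  # no French entry, falls back to English
```

Because the lookup key is the text_id rather than an object id or revision, the same translation is found for every object and every version that reuses the text.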

-----

 

----------

Notes - Simplicity, coherency, consistency

----------

- Only one way to express a text field

- Only one way to give translations

  (It can be inside the package or sitting in external sources.)

- Additional ~50 bytes + byte length of text_id for each text field

- Resources spent on translation will not be wasted

  as long as the text stays the same.

- Use Case (1) -> Pattern A

- Use Case (2) -> Pattern A or B

  The text always has its language code

- Use Case (3) -> Pattern B

- Use Case (4) -> Using UTF-8

- Use Case (5) -> Using UTF-8

- Use Case (6) -> Pattern B

- Use Case (7) -> Pattern A (Dynamically produced from its DB)
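
The per-field size overhead noted above can be checked with a small Python sketch. It compares a plain JSON string with the same string wrapped in a text object; the exact figure depends on whitespace and separators, so the compact serialization used here is an assumption and gives a lower bound. The function name text_field_overhead is hypothetical.

```python
import json


def text_field_overhead(text_id, lang, value):
    """Extra UTF-8 bytes needed to wrap `value` in a text object (compact JSON)."""
    compact = {"ensure_ascii": False, "separators": (",", ":")}
    plain = json.dumps(value, **compact)
    wrapped = json.dumps(
        {"text_id": text_id, "lang": lang, "value": value}, **compact
    )
    return len(wrapped.encode("utf-8")) - len(plain.encode("utf-8"))


overhead = text_field_overhead("text-a1b2c3", "en", "Dridex Campaign - Botnet 121")
print(overhead)                        # total extra bytes for this field
print(overhead - len("text-a1b2c3"))   # part that does not depend on the text_id
```

Pretty-printed output with indentation adds more bytes per field than the compact form measured here.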

 

----------

 

------------------------------

Internationalization Use Cases

------------------------------

CN: Chinese

DE: German

EN: English

FR: French

JA: Japanese

 

------------------------------

(1) Providing an object's texts in multiple languages simultaneously at the time of creation.

------------------------------

 

  [ja/en (in case of Japan), en/fr/de (in case of EU countries), etc.]

 

This is the most likely use case (for me). The original CTI has titles/descriptions in

multiple languages from the start. When you create a CTI file, you include

both English and Japanese titles/descriptions for major objects in it

so that non-Japanese speaking people can at least find out what it is at the top level.

 

------------------------------

(2) CTI Database Receiving CTI from Multiple CTI Sources in Different Languages

------------------------------

 

This is a case where you receive CTI from an English CTI source and

another CTI source in Japanese.

You put all CTI into MongoDB or some other NoSQL database and

would like to mix and match. I would like the CTI database to still

be able to track the language code of textual fields.
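
Because every text field carries its own "lang", a store can recover the language of each text even after objects from different sources are mixed. A minimal in-memory Python sketch follows; the helper names and the second sample object (including its text_id "text-0000ja") are hypothetical, and no particular database API is assumed.

```python
from collections import defaultdict

TEXT_KEYS = {"text_id", "lang", "value"}


def iter_text_fields(node):
    """Yield every dict that looks like a text field, at any nesting depth."""
    if isinstance(node, dict):
        if TEXT_KEYS.issubset(node):
            yield node
        else:
            for child in node.values():
                yield from iter_text_fields(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_text_fields(child)


def group_by_lang(objects):
    """Map each language code to the text_ids found in that language."""
    groups = defaultdict(list)
    for field in iter_text_fields(objects):
        groups[field["lang"]].append(field["text_id"])
    return dict(groups)


# Objects received from an English source and a Japanese source (illustrative).
stored = [
    {"type": "campaign",
     "title": {"text_id": "text-a1b2c3", "lang": "en",
               "value": "Dridex Campaign - Botnet 121"}},
    {"type": "campaign",
     "title": {"text_id": "text-0000ja", "lang": "ja",
               "value": "医療費通知"}},
]

groups = group_by_lang(stored)
print(groups)
```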

 

------------------------------

(3) EN CTI received by a Japanese entity, which provides JA translation

  (Or vice versa, JA CTI received by a US entity, which provides EN translation)

------------------------------

 

  A Japanese entity receives CTI information pieces in English.

  The entity determines that some of them are important/critical

  and worth translating into Japanese, adds descriptions in Japanese,

  and redistributes them to other Japanese entities (if redistribution is allowed).

  The CTIM (CTI Management System) of a receiving party displays

  the Japanese description whenever possible, while allowing access to

  the original English descriptions.

 

  Work Flow:

  1. Company 1 in EN creates an Indicator and TTP and shares them to Company 2 in JP. 

    It is important to note that the flow may be direct or may be through a series of brokers and other entities. 

    1. This Indicator and TTP have a producer of Company 1 and a version of 1

  2. Company 2 builds a translated version of the TTP and Indicator and releases it.

    1. This new Indicator and TTP have a producer of Company 2 and a version of 2. 

    2. It is unrealistic to think that Company 2 can or will share the translated object back to Company 1, or that Company 1 will do anything with the translated object if it gets it.  Its legal department will probably prohibit accepting 3rd-party translations and using them in its offerings.

 

------------------------------

(4) An English CTI report describing attacks against Japanese entities in EN 

------------------------------

 

  An English report on Cyber Attacks on Japan.

  There are filenames of lure attachments in Japanese (original/real) and their

  translations in English.  Another similar report in English might have an email title along with

  its translation in English next to it. That report also has a Windows pathname

  in Chinese (not Japanese) found in a binary along with its translation in English.

 

  These Japanese texts can be found in descriptions, not just

 

  [Ex. Original File Name (JA): "医療費通知", Translated File Name (EN): "Medical expenses notice"]

 

  Note: This should probably be okay as long as the standards require use of UTF-8 for encoding.
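
  As a quick check of the UTF-8 note above, the following Python sketch round-trips the example file name through JSON encoded as UTF-8. The field names "file_name" and "file_name_en" are hypothetical and used only for illustration.

```python
import json

original_name = "医療費通知"
record = {
    "file_name": original_name,                 # hypothetical field name
    "file_name_en": "Medical expenses notice",  # hypothetical field name
}

# Serialize without escaping non-ASCII characters, then encode as UTF-8 bytes.
wire = json.dumps(record, ensure_ascii=False).encode("utf-8")

# Decoding the same bytes as UTF-8 restores the original text unchanged.
decoded = json.loads(wire.decode("utf-8"))
print(decoded["file_name"] == original_name)
print(len(original_name.encode("utf-8")))  # bytes used by the Japanese name
```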

 

------------------------------

(5) Email subject/body, supposed to be in JP, but includes CN characters (by mistake of the attackers)

------------------------------

 

  This can happen due to Chinese/Japanese/Korean sharing Unicode characters

  (CJK characters - https://en.wikipedia.org/wiki/CJK_characters).

 

  This can be a very important clue as to the attackers.

 

  Note: This should probably be okay as long as the standards require use of UTF-8 for encoding.

 

------------------------------

(6) CTI translation service

------------------------------

 

  A CTI translation service provider keeps translations to target languages of text fields

  from publicly available and/or commercial/private CTI sources.

  The service is available through some kind of online API.

  Consumers of this translation service will use this service to translate text fields

  in their CTI system through the API provided by the translation service provider.

 

------------------------------

(7) CTI provider

------------------------------

 

  A CTI provider (in English) plans to penetrate the Japanese and other APAC markets

  and needs a standard way to add translations of their text fields.

  The CTI provider gives its customer a CTI package with all the translations in it

  or a CTI package with translations to the languages of the user's choosing. 

 

------------------------------

 

From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Jordan, Bret
Sent: Wednesday, April 13, 2016 2:40 AM
To: Jason Keirstead
Cc: John-Mark Gurney; Masuoka, Ryusuke/益岡 竜介; cti@lists.oasis-open.org
Subject: Re: [cti] RE: i18n (RE: MVP Discussion)

 

Given this debate, and the fact that it appears that we will not resolve this with a simple fix, I would suggest that this NOT be MVP.  I was hoping for an easy solution.  But apparently that is not possible.  

 

I also want to make sure we do not mix up the way a product might do this with how it needs to be conveyed across the wire.  A tool could use the text once, and reference it all over the place.  But that does NOT mean it needs to go over the wire that way. 

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

On Apr 12, 2016, at 10:33, Jason Keirstead <Jason.Keirstead@ca.ibm.com> wrote:

 

I think what Ryusuke is pointing out is that having a translation point to a specific revision does not solve his use case, which is a common one when i18n'ing software - as I know first hand, translation en-masse is very expensive and therefore you do not want to re-translate the same text multiple times. If you have the same title text used in 100 objects, you would want to only translate that text once and reference it 100 times.

This is what he is asking for - a way to tie the translation not to an object version, or even an object instance - but to the text property itself.

See Java resource bundles, or Gettext PO files, for example. His ask is for the equivalent to resource bundles and keys for STIX.

-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown



From: John-Mark Gurney <jmg@newcontext.com>
To: "Masuoka, Ryusuke" <masuoka.ryusuke@jp.fujitsu.com>
Cc: "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 04/12/2016 01:25 PM
Subject: Re: [cti] RE: i18n (RE: MVP Discussion)
Sent by: <cti@lists.oasis-open.org>





Masuoka, Ryusuke wrote this message on Tue, Apr 12, 2016 at 10:05 +0000:
> Thank you for your suggestion.
> I think it works and meets my needs.
>
> One thing I am afraid of is that it might be brittle against versioning.
> It refers to an object and it is not useful once the object gets updated
> even if the object uses the same texts.

This is why the translation should point to a specific revision of the
object...  See minor modification below...

I believe I addressed this in my example that I sent a while back...

> From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Jordan, Bret
> Sent: Monday, April 11, 2016 7:09 AM
> To: Masuoka, Ryusuke/益岡 竜介
> Cc: cti@lists.oasis-open.org
> Subject: [cti] Re: i18n (RE: MVP Discussion)
>
> Ryu,
>
> From your example #1, it would probably need to look something like this.  This would enable a tool to:
> a) translate any text, not just certain fields
> b) allow you or anyone else to create a translation at any point in time.
>
> Please let me know if this would not meet your needs... In reading through your use-cases, which are good BTW, I believe this would address all of your points.
>
>
>
> {
>   "type": "package",
>   ...
>   "campaigns": [
>     {
>       "type": "campaign",
>       "id": "campaign--a1201df6-c352-4a81-9c7c-5a6f896a4e31",
>       "lang": "en",
       "revision": 1,
>       "spec_version": "stix-2.0",
>       "created_at": "2015-12-03T13:13Z",
>       "created_by_ref": "identity--69a17e1b-bb45-4657-9a9d-96db3faccdde",
>       "title": "Dridex Campaign - Botnet 121",
>       "descriptions": "Dridex-based campaign leveraging Botnet 121",
>       "intended_effects": [
>         {"value": "theft-identity-theft"}
>       ],
>       "status": "Ongoing"
>    }
>   ],
>   "translations": [
>     {
>       "type": "translation",
>       "id": "translation--a1201df6-c352-4a81-9c7c-5a6f896a1111",
>       "lang": "jp",
>       "spec_version": "stix-2.0",
>       "created_at": "2015-12-03T13:13Z",
>       "created_by_ref": "identity--69a17e1b-bb45-4657-9a9d-96db3faccdde",
       "translated_ref": [ "campaign--a1201df6-c352-4a81-9c7c-5a6f896a4e31", 1 ],
>       "translated_text": [
>         "title": "Dridex キャンペーン - ボットネット 121",
>         "description": "ボットネット 121 を活用する Dridex を元にしたキャンペーン"
>       ]
>     },
>     {
>       "type": "translation",
>       "id": "translation--a1201df6-c352-4a81-9c7c-5a6f896a2222",
>       "lang": "de",
>       "spec_version": "stix-2.0",
>       "created_at": "2015-12-03T13:13Z",
>       "created_by_ref": "identity--69a17e1b-bb45-4657-9a9d-96db3faccdde",
       "translated_ref": [ "campaign--a1201df6-c352-4a81-9c7c-5a6f896a4e31", 1 ],
>       "translated_text": [
>         "title": "Some German Title",
>         "description": "Some German Description"
>       ]
>     }
>
>   ]
>   ...
> }

--
John-Mark




 


