Re: [cti] Idea for Internationalization

{
  "title": {
    "en_us": "…",
    "es": "…"
  },
  "description": [
    {
      "en_us": "…",
      "es": "…"
    },
    {
      "en_us": "…",
      "es": "…"
    }
  ]
}

Some open questions with embedding:

1) What happens when a system cannot, or chooses not to, keep the extra language content?

2) How do you keep the original producer when you need to republish a TLO with an additional translation?

3) Will this open up threat intel to attacks by eliminating the chain of ownership?

4) What is going to happen to the versioning aspect of TLOs and how will we track that and all of their relationships?

Thanks,

Bret

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."

On Feb 8, 2016, at 07:09, Wunder, John A. <jwunder@mitre.org> wrote:

So to explore this a bit, were you imagining something like this:

{
  "title": {
    "en": "…",
    "es": "…"
  },
  "description": [
    {
      "en": "…",
      "es": "…"
    },
    {
      "en": "…",
      "es": "…"
    }
  ]
}

It’s a bit more indirect but, like I said earlier, while it looks uglier I don’t think the code to read/write it is much worse. There could also be a more flattened approach:

{
  "title_en": "…",
  "title_es": "…",
  "description_en": ["…", "…"],
  "description_es": ["…", "…"]
}
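To make the read-side comparison concrete, here is a minimal Python sketch of pulling a title and the descriptions out of each layout. The function names and the sample strings are made up for illustration; only the key layout comes from the two examples above.

```python
# Hypothetical helpers: one pair per layout, same result either way.

def title_nested(obj, lang):
    """Nested layout: the field holds a {language: text} mapping."""
    return obj["title"].get(lang)

def descriptions_nested(obj, lang):
    """Nested layout: each description entry is a {language: text} mapping."""
    return [entry[lang] for entry in obj["description"] if lang in entry]

def title_flat(obj, lang):
    """Flattened layout: the language is a suffix on the field name."""
    return obj.get(f"title_{lang}")

def descriptions_flat(obj, lang):
    """Flattened layout: one list of strings per language-suffixed key."""
    return obj.get(f"description_{lang}", [])

# Sample data (placeholder strings, not taken from any real object).
nested = {
    "title": {"en": "Sample title", "es": "Título de ejemplo"},
    "description": [
        {"en": "First paragraph", "es": "Primer párrafo"},
        {"en": "Second paragraph", "es": "Segundo párrafo"},
    ],
}
flat = {
    "title_en": "Sample title",
    "title_es": "Título de ejemplo",
    "description_en": ["First paragraph", "Second paragraph"],
    "description_es": ["Primer párrafo", "Segundo párrafo"],
}
```

Under these assumptions the two layouts cost about the same to read. The main difference is that the flattened form requires parsing key names to discover which languages are present.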

We would need to specify what those keys would be, and probably standardize on whether we use country-specific codes or not. I.e., will the keys be “en-US” and “en-GB” or just “en”? We would also need to define a relationship for “translation” to support third-party translations; “translation-of” would make sense to me.
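One way to see why the choice of keys matters: a consumer asking for a country-specific language needs some fallback rule when only the bare language (or a different country variant) is present. The following is a rough sketch of one possible rule, not a proposal for the spec; the function names and the fallback order are assumptions.

```python
def normalize(tag):
    """Treat 'en_us', 'en-US' and 'EN-us' as the same key."""
    return tag.replace("_", "-").lower()

def pick_text(translations, wanted):
    """Return the best available text for the requested language, or None.

    Assumed fallback order: exact match, then the bare language
    ('en' for 'en-US'), then any other variant of the same language.
    """
    table = {normalize(key): text for key, text in translations.items()}
    wanted = normalize(wanted)
    if wanted in table:
        return table[wanted]
    primary = wanted.split("-")[0]
    if primary in table:
        return table[primary]
    for key in sorted(table):  # sorted so the choice is deterministic
        if key.split("-")[0] == primary:
            return table[key]
    return None
```

If the keys are standardized on bare language codes only, the last two steps disappear and lookup is a plain dictionary access.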

Can anybody see any major issues with an approach like this? The biggest one I see is that with third-party translations you’ll run into the mess of relationships pointing only to the original or only to a given translation, and we’d need to work through that. (I.e., people create relationships to my translation of your object rather than directly to your object, bifurcating our intelligence.) Is anybody producing or consuming third-party translations concerned about that?
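As a sketch of how a consumer might work around the bifurcation, relationships could be re-pointed at the original object by following “translation-of” links. The dictionary shape (“source”, “target”, “type”) and the object IDs in the usage note are hypothetical, chosen only to illustrate the idea.

```python
def build_translation_map(relationships):
    """Map each translation's ID to the ID of the object it translates."""
    return {
        rel["source"]: rel["target"]
        for rel in relationships
        if rel["type"] == "translation-of"
    }

def canonical_id(obj_id, translation_of):
    """Follow translation-of links back to the original object."""
    seen = set()
    while obj_id in translation_of and obj_id not in seen:
        seen.add(obj_id)  # guard against accidental cycles
        obj_id = translation_of[obj_id]
    return obj_id

def merge_on_original(relationships):
    """Rewrite non-translation relationships to reference originals."""
    translation_of = build_translation_map(relationships)
    merged = []
    for rel in relationships:
        if rel["type"] == "translation-of":
            merged.append(rel)  # keep the translation links untouched
            continue
        merged.append({
            **rel,
            "source": canonical_id(rel["source"], translation_of),
            "target": canonical_id(rel["target"], translation_of),
        })
    return merged
```

For example, if "report-es" is a translation-of "report-1" and someone relates "indicator-9" to "report-es", the merged view relates "indicator-9" to "report-1". This only helps consumers who actually hold the translation-of relationship, which is part of the concern raised above.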

John

On 2/8/16, 8:55 AM, "cti@lists.oasis-open.org on behalf of Coderre, Robert" <cti@lists.oasis-open.org on behalf of rcoderre@verisign.com> wrote:

It makes complete sense to have translations available for top level objects, and I agree with Ryu that it also makes sense to include the translations in the same object. In most cases (my subjective view) the translations will come from the same producer. If an independent third party is translating content, then it should be a separate object and referenced back to the original.

As for CybOX observables, I think these would be independent objects, primarily for the reasons Trey mentions, which is that they are specific to a particular region/language and will have enough subtle differences as to warrant that distinction.

Rob

-----Original Message-----
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Trey Darley
Sent: Monday, February 08, 2016 6:43 AM
To: Masuoka, Ryusuke
Cc: cti@lists.oasis-open.org
Subject: Re: [cti] Idea for Internationalization

On 08.02.2016 08:00:55, Masuoka, Ryusuke wrote:

Whether it is a title, a description, a filename, an email subject,
etc., treating a translation as another property of the same object or
subproperty of the text object would be simpler and more natural than
treating the translation as another object.

For example, if it is a file object, it would be

-----
Case (A)
-----
File Object:
ID: A123
File Name (Original - JA): “医療費通知”
File Name (Translation - EN): “Medical expenses notice”
File Name (Translation - FR): “Frais médicaux Notez”
File Extension: PDF
Size in Bytes: 410,314
Hashes:
    Hash Name: SHA1
    Hash Value: 1234567890123456789012345678901234567890
-----

I was tracking along with this I18N discussion right up until now.
Does it make sense to provide translations of CybOX observables?

Taking Ryusuke's example, assume that I'm a threat actor using an identical malicious payload to target victims in multiple languages.
If I send out a phishing mail entitled "医療費通知", then the payload will be in Japanese. If I'm also targeting French-speakers, 1) the odds are minimal that I'll translate the file name exactly as "Frais médicaux Notez", and 2) even supposing that I do translate the filename exactly that way, the payload is going to be in French, so there's no chance in hell of the file hashes matching.
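The hash point is easy to check. In this small sketch the two byte strings are stand-ins for the Japanese and French payloads (real payloads would be whole documents); any difference in content produces a different SHA1, so a single file object could not describe both.

```python
import hashlib

def sha1_hex(payload: bytes) -> str:
    """Hex-encoded SHA1 digest of a payload."""
    return hashlib.sha1(payload).hexdigest()

# Stand-in payloads: the text differs, so the bytes differ.
ja_payload = "医療費通知".encode("utf-8")
fr_payload = "Frais médicaux Notez".encode("utf-8")

ja_hash = sha1_hex(ja_payload)
fr_hash = sha1_hex(fr_payload)
hashes_match = ja_hash == fr_hash  # False for these two payloads
```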

I18N makes total sense to me at the level of STIX TLOs with fields humans are likely to read. I don't see it providing much value at the CybOX observable level compared to the amount of complexity it will introduce.

We want to cater to humans, obviously, but if we make observables so complex as to practically preclude machine-parsing of them, then why not just send an old-fashioned email instead of using STIX/CybOX?

--
Cheers,
Trey
--
Trey Darley
Senior Security Engineer
4DAA 0A88 34BC 27C9 FD2B A97E D3C6 5C74 0FB7 E430
Soltra | An FS-ISAC & DTCC Company
www.soltra.com
--
"In protocol design, perfection has been reached not when there is nothing left to add, but when there is nothing left to take away."
--RFC 1925
