Exactly. I believe internationalisation is something we have to sort out before we do our first release of STIX v2, and it is important that we discuss this as a community. The problem is that we have a lot of other things to discuss as well! So, as Rob was saying, we should get through some of the basics of setting up the new objects and how the underlying protocols work before we discuss internationalisation.
It is important, it's just that we need to get through some of the work we have already outlined before we get to it.
This is an important issue, and I fully believe that internationalization needs to be a part of the 2.0 spec (most likely a part of CTI Common and all the specs actually). What Bret and others are suggesting is that we take this up in the next tranche of work that is done. The initial piece of work we are doing is to get the actual Indicators work done, so we have a baseline. Once that is complete we can consider how to modify that to support internationalization.
Terry, Bret, all,
This (internationalization), I believe, is quite basic and I am very
concerned about it.
If we do not give a standard way to write CTI in different languages,
people will start writing it in different ways, which will cause a lot
of interoperability headaches.
Is there anyone else who writes (or may write) CTI in
languages other than English? No concerns?
Yes, I would agree, let's add it to the next Milestone. Our first Milestone is to get the basics of the Indicator done.
Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."
JSON is UTF-8 as standard, so should support future internationalisation.
At some point we need to make a decision about if, when and how we support international languages, but I think that needs to be done as its own individual tranche. We know from our previous conversations that it's not a quick add, and that the community has different concerns, so my vote would be that we schedule it as a topic in an upcoming tranche and deal with it then, and release it as part of 2.0.
I think at the moment just getting the basic structures of the objects we need should be our focus.
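On the point above that JSON is UTF-8 by default, here is a minimal Python sketch of a round trip with Japanese text. The object and its field names are purely illustrative and are not taken from any STIX 2.0 draft.

```python
import json

# Hypothetical indicator-like object; the field names are illustrative only.
indicator = {"title": "不審なドメイン", "description": "Suspicious domain"}

# ensure_ascii=False keeps the Japanese characters as-is
# instead of emitting \uXXXX escapes.
text = json.dumps(indicator, ensure_ascii=False)

# Encode to UTF-8 bytes, as the object would travel on the wire.
wire_bytes = text.encode("utf-8")

# Decoding and parsing gives back the original object unchanged.
round_tripped = json.loads(wire_bytes.decode("utf-8"))
```

This only shows that the encoding can carry the characters; it says nothing about how a consumer learns which language a field is written in, which is the separate question raised in this thread.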
I have a few questions as to internationalization.
Q1. Is the data encoded in UTF-8 or some other encoding
so that it can include Japanese and other languages?
Q2. Do the title and descriptions include their language code (such as jp, en, fr, ...)?
(It is, I believe, a good practice even if it is obvious. Automatic translation
systems can use such information.)
Q3. If I were to provide translations to the title, descriptions, and
other human readable fields using relationships, how can I refer
Q4. Is it possible to have, for example, titles in multiple languages
from the start? (Ex. A Japanese entity creates a CTI piece
with Japanese/English titles from the beginning.)
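To make the questions about language codes and multi-language titles concrete, here is one possible shape, sketched in Python. It is an assumption for discussion only: the language-keyed "title" map and the pick_text helper are hypothetical and do not come from any STIX draft. The example uses "ja", the ISO 639-1 code for Japanese.

```python
from typing import Dict, List, Optional

# Hypothetical layout: a human-readable field keyed by language code.
report = {
    "title": {
        "ja": "標的型攻撃の報告",
        "en": "Targeted attack report",
    },
}


def pick_text(translations: Dict[str, str], preferred: List[str]) -> Optional[str]:
    """Return the first translation whose language code appears in `preferred`.

    Languages are tried in the order given; None means no match was found.
    """
    for lang in preferred:
        if lang in translations:
            return translations[lang]
    return None


# A French-speaking consumer that falls back to English.
title_for_reader = pick_text(report["title"], ["fr", "en"])
```

With this layout the producer can ship both titles from the start, and each consumer chooses its own fallback order.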
We do still have a couple open questions:
Is it better to have one list of references (as we have in the text above), or multiple lists as we do in package? In other words, do we have one field called report_contains_ref and it has references to indicators, relationships, threat actors, etc. or do we have a field for indicator_refs, another for relationship_refs, another for threat_actor_refs, etc. We’ll also need to decide on the exact field names to use in either scenario.
Is there a need for a confidence field on report? It wasn’t there in 1.2, so this would be an addition, but at least Sean has noted that it would be useful.
Should title be required?
In STIX 1.2, there was a report intents field as a controlled vocabulary. Do we need this field, and if so, what should the list of values be? You can see this text now in the playground
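For the first open question (one reference list versus one list per type), a small Python sketch of the two layouts may help the comparison. The field names are the ones floated above; the ids are made up, and the "type--suffix" id shape is an assumption that the split_by_type helper relies on.

```python
from collections import defaultdict

# Option A: a single flat list of references.
report_flat = {
    "report_contains_ref": [
        "indicator--1",
        "threat-actor--7",
        "indicator--2",
        "relationship--3",
    ]
}


def split_by_type(refs):
    """Group ids into per-type lists such as 'indicator_refs' (Option B).

    Assumes each id starts with its object type followed by '--'.
    """
    grouped = defaultdict(list)
    for ref in refs:
        obj_type = ref.split("--", 1)[0].replace("-", "_")
        grouped[obj_type + "_refs"].append(ref)
    return dict(grouped)


# Option B, derived from Option A.
report_split = split_by_type(report_flat["report_contains_ref"])
```

If ids carry their type like this, the flat list loses nothing, because the per-type view can be derived from it. If they do not, the per-type fields are the only place the type is recorded.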
I can’t think of a reason to include it, but I’m not really opposed. If we do include it we just need to clearly and carefully specify what the confidence field is describing confidence in: that the things in the collection are related in some way, that they belong under that title, etc.
Probably useful, and we need to think about what type of values we want to put in there. The current list of values is a mess.