Subject: Re: [cti] Internationalization: lang field required or optional?
- From: "Jason Keirstead" <Jason.Keirstead@ca.ibm.com>
- To: Bret Jordan <Bret_Jordan@symantec.com>
- Date: Wed, 1 Mar 2017 12:43:44 +0000
Perhaps we could lobby to get it added
to the IANA registry?
I do not actually understand why it
is defined in ISO 639, but not in the IANA registry.
> In an effort to help defend Ryu's use cases, if we make it required,
> then over time it is more likely that UIs will start to incorporate the
> language tag into their design. If we make it optional, then it is
> highly unlikely that it will take off en masse. You will have a few
> groups here and there that will do it, but the rest will just ignore it.
The problem is, this tag is not actually
required in the majority of use cases - it is just bytes flying over the
wire for no reason. I really dislike that.
If we are going to go down this path,
then I would propose it be done at the STIX package level at least, or
perhaps the TAXII level... somewhere I can define a "default language"
for my ecosystem without having to add 12 superfluous bytes to every single
STIX object.
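A minimal sketch of what such a default could look like on the consuming side. The `default_lang` property name and the `effective_lang` helper are hypothetical, invented here for illustration; nothing like them exists in the current proposal.

```python
# Hypothetical property name for an ecosystem-wide default language.
# It is an assumption for this sketch, not part of any STIX or TAXII draft.
DEFAULT_LANG_KEY = "default_lang"


def effective_lang(obj, bundle):
    """Return the object's own lang if set, else the bundle default, else None."""
    return obj.get("lang", bundle.get(DEFAULT_LANG_KEY))


bundle = {
    "type": "bundle",
    DEFAULT_LANG_KEY: "en",
    "objects": [
        # No per-object tag: inherits the bundle default.
        {"type": "indicator", "description": "Known bad IP"},
        # Explicit tag: overrides the bundle default.
        {"type": "indicator", "description": "IP malveillante connue", "lang": "fr"},
    ],
}

langs = [effective_lang(o, bundle) for o in bundle["objects"]]

# Cost of repeating the tag on one object in compact JSON.
per_object_overhead = len('"lang":"en",')
```

With this shape, only objects that differ from the default carry their own tag, and `per_object_overhead` is the size of the text that every other object avoids repeating.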
-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com
Without data, all you are is just another person with an opinion - Unknown
From: Bret Jordan <Bret_Jordan@symantec.com>
To: "Wunder, John A." <jwunder@mitre.org>, Jason Keirstead/CanEast/IBM@IBMCA,
"CREEDON, Gus" <GCREEDON@lmi.org>
Cc: Allan Thomson <athomson@lookingglasscyber.com>,
"cti@lists.oasis-open.org" <cti@lists.oasis-open.org>,
"Back, Greg" <gback@mitre.org>, "Masuoka, Ryusuke"
<masuoka.ryusuke@jp.fujitsu.com>
Date: 02/28/2017 10:26 PM
Subject: Re: [cti] Internationalization: lang field required or optional?
In an effort to help defend Ryu's use cases,
if we make it required, then over time it is more likely that UIs will
start to incorporate the language tag into their design. If we make
it optional, then it is highly unlikely that it will take off en masse. You
will have a few groups here and there that will do it, but the rest will
just ignore it.
If we make it required, then we might get
some people doing stupid stuff and defaulting it to "en". But
how is that different or worse than what we have today? Further,
I think we can use the RFC but say, in the event that you do not know the
language, you MUST use "und". No need to go to the ISO
version. We can use these RFCs however we want.
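A rough sketch of that rule from the producer side, assuming the spec adopted "und" as the mandated value for an unknown language. The function names are hypothetical and the object is trimmed to the properties that matter here.

```python
UNDETERMINED = "und"  # value a producer would emit when it does not know the language


def lang_for_export(user_declared_lang=None):
    """Return the declared language tag, or "und" when none was declared."""
    if user_declared_lang:
        return user_declared_lang
    return UNDETERMINED


def build_indicator(description, user_declared_lang=None):
    """Build a trimmed-down indicator dict that always carries a lang value."""
    return {
        "type": "indicator",
        "description": description,
        "lang": lang_for_export(user_declared_lang),
    }


unknown = build_indicator("free text typed by a user")
declared = build_indicator("free text typed by a user", "en-GB")
```

The point of the fallback is that the producer never guesses "en": it either passes on what it was told or says that it does not know.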
Bret
From: Wunder, John A. <jwunder@mitre.org>
Sent: Tuesday, February 28, 2017 1:02:20 PM
To: Jason Keirstead; CREEDON, Gus
Cc: Allan Thomson; Bret Jordan; cti@lists.oasis-open.org; Back, Greg;
Masuoka, Ryusuke
Subject: Re: [cti] Internationalization: lang field required or optional?
We could always introduce our own “unknown”
value, but that feels identical to making it optional except we have more
bytes on the wire -- the same people who would have left it off won’t
set it to some useful value, they’ll just make it unknown.
FWIW I tried to collect where we stand
across Slack and e-mail:
Optional: myself (MITRE), Jason (IBM),
Allan (LookingGlass), Greg Back (MITRE), Wouter (eclecticIQ), Alexandre
(MISP), JMG (NewContext), Lauri (cyberdefense)
Required: Bret (Symantec), Ryu (Hitachi),
Rob (iDefense), Gus (LMI), Trey (Kingfisher)
In terms of normative statements, if we
do end up keeping it optional, perhaps we could strengthen it to: “The
lang property SHOULD be present when the language of the content is known.”
In the STIX 2 validator, SHOULD statements
will throw a warning, plus there’s a --strict flag you can pass to change
those warnings into errors. So if we add this, I feel like it’s much more
than “optional”…it’s you should do this unless you have a good reason
not to. So with the field “optional”, the validator would still throw
a warning if it’s missing.
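To make the warning-versus-error behaviour concrete, here is a small illustrative check. It is not the STIX 2 validator's code or API; `check_lang` and its `strict` parameter are hypothetical stand-ins that only mirror the behaviour described above.

```python
def check_lang(obj, strict=False):
    """Apply a SHOULD-level rule for lang; return (errors, warnings).

    A missing lang is reported as a warning by default and as an
    error when strict is True.
    """
    errors, warnings = [], []
    if "lang" not in obj:
        message = "{}: lang SHOULD be present when the language is known".format(
            obj.get("id", "<no id>")
        )
        (errors if strict else warnings).append(message)
    return errors, warnings


# Placeholder id for illustration only.
obj = {"type": "indicator", "id": "indicator--example"}

default_errors, default_warnings = check_lang(obj)
strict_errors, strict_warnings = check_lang(obj, strict=True)
```

In default mode the object passes with one warning; in strict mode the same object fails with one error.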
Anybody else want to chime in on this?
John
From: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Date: Tuesday, February 28, 2017 at 12:21 PM
To: "CREEDON, Gus" <GCREEDON@lmi.org>
Cc: Allan Thomson <athomson@lookingglasscyber.com>, "Bret
Jordan (CS)" <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org"
<cti@lists.oasis-open.org>, Greg Back <gback@mitre.org>, John
Wunder <jwunder@mitre.org>, "Masuoka, Ryusuke" <masuoka.ryusuke@jp.fujitsu.com>
Subject: RE: [cti] Internationalization: lang field required or optional?
It is not going to be hard at all coming
up with use cases that generate optional conditions. In fact, I suspect
this is going to be the majority (which is why it should be optional).
Example - I have a cloud-based TIP that is heavily UK based, and thus my
user accounts do not prompt them to specify their language (the product
UI is English only). Someone enters an indicator into the platform and
shares it out. I have *NO IDEA* what language that Indicator description
is written in... you may *assume* it is English, but I really do not know,
because I didn't ask the user... maybe they typed in French or Spanish,
who knows. When I share that Indicator out, the language field *should
not* be "en" or "en-GB", because I have no idea what
the language actually is - it should be empty, or "undefined".
But as I pointed out the other day, unfortunately the IETF has not decided
to adopt the "und" language code from ISO! So if we don't make
the field optional, then we can't use RFC 5646 anymore
and have to switch to ISO 639-X.
-
Jason Keirstead
From: "CREEDON, Gus" <GCREEDON@lmi.org>
To: "Masuoka, Ryusuke" <masuoka.ryusuke@jp.fujitsu.com>, "Back, Greg"
<gback@mitre.org>, Jason Keirstead/CanEast/IBM@IBMCA, "Allan
Thomson" <athomson@lookingglasscyber.com>
Cc: Bret Jordan <Bret_Jordan@symantec.com>, "Wunder, John A." <jwunder@mitre.org>,
"cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 02/28/2017 10:58 AM
Subject: RE: [cti] Internationalization: lang field required or optional?
Greetings,
If we do not make it REQUIRED, then we may be looking at a lot of work
coming up with use cases that generate OPTIONAL conditions.
The terms identified in RFC 2119 allow for conditions.
Parsing a sentence from STIX 2.0, 3.4 Versioning, we do assign a condition
to the ‘MUST instead create’ phrase:
“If a producer other than the object creator wishes to create a new version,
they MUST instead create a new object with a new id.”
So let’s say we go with
OPTIONAL … MUST/SHALL…
These are somewhat convoluted but:
OPTIONAL – the lang: field MUST/SHALL be used in the STIX message if the
producer intends to enable consumers to accelerate the language identification
process.
OPTIONAL – the lang: field MUST/SHALL be used in the STIX message if the
producer broadcasts to consumers who reside across a sovereign border.
Ryu asks if it’s worth spending time defining use cases?
If we don’t intend to make lang: REQUIRED, then we need to develop conditions
to satisfy the business/use case and express them in the object field.
Again, that could turn into a lot of work and overly complicate the tool
developer’s UI if they want to Q&A their way through the options with
the user.
IMHO, tool providers can easily accommodate this field in their UI and
in the interchange.
How tool providers enhance their user experience is not the CTI TC’s concern.
I believe, “REQUIRED – MUST be filled in with a valid code”, is the
better choice.
Gus
Gus Creedon
7940 Jones Branch Drive, Tysons, VA 22102
Office: (703)917-7272 | Cell: (571)335-6899
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org]
On Behalf Of Masuoka, Ryusuke
Sent: Monday, February 27, 2017 3:21 AM
To: Back, Greg <gback@mitre.org>; Jason Keirstead <Jason.Keirstead@ca.ibm.com>;
Allan Thomson <athomson@lookingglasscyber.com>
Cc: Bret Jordan <Bret_Jordan@symantec.com>; Wunder, John A. <jwunder@mitre.org>;
cti@lists.oasis-open.org
Subject: [EXTERNAL] RE: [cti] Internationalization: lang field required
or optional?
Hi,
I think the differences are in use cases in each one’s
mind.
Human readable texts are for humans to consume, but the “lang:” tag
is for the system to produce/consume.
This is an on-the-wire/between-systems requirement/optionality.
With the system knowing the language code for the human readable
texts, the system can handle things better and provide much better UI,
etc.
My question is: what use cases make it worth defining the lang: tag if it is optional?
Regards,
Ryu
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org]
On Behalf Of Back, Greg
Sent: Friday, February 24, 2017 11:05 PM
To: Jason Keirstead; Allan Thomson
Cc: Bret Jordan; Wunder, John A.; cti@lists.oasis-open.org
Subject: Re: [cti] Internationalization: lang field required or optional?
I originally didn’t feel strongly either way, but I’m coming around to
feeling pretty strongly it should be optional.
Language is necessary only for human consumption (vs. encoding, which is
necessary for machine consumption). IMO, fields should only be required
if leaving them off makes effective CTI sharing difficult, and I don’t
(yet) think this is true for language information. It’s certainly something we can
specify in conformance levels or interoperability profiles, but I feel
it would be a mistake to require it at the spec level.
As I’ve been working on python-stix2, creating an Indicator only requires
“labels” and “pattern”. All other required fields (type, id, created,
modified, valid_from) can be reasonably inferred. Any program that uses
python-stix2 needs to therefore require the user to enter that information,
or make an assumption on their behalf. Getting the “current user’s”
language works fine on personal machines, but on a server that many people
use (for example, via a web service), it’s problematic.
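A plain-Python sketch of the inference described above, written without python-stix2 so that no library API is implied. `make_indicator` is a hypothetical helper: the caller supplies only `labels` and `pattern`, the remaining required fields are generated, and `lang` is left off unless the caller actually knows it.

```python
import uuid
from datetime import datetime, timezone


def make_indicator(labels, pattern, lang=None):
    """Build an indicator dict, inferring every field the caller did not supply."""
    # Millisecond-precision UTC timestamp, e.g. 2017-02-24T23:05:00.000Z
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    indicator = {
        "type": "indicator",
        "id": "indicator--" + str(uuid.uuid4()),
        "created": now,
        "modified": now,
        "valid_from": now,
        "labels": list(labels),
        "pattern": pattern,
    }
    # There is no safe way to infer the language on a shared server,
    # so it is only set when the caller provides it.
    if lang is not None:
        indicator["lang"] = lang
    return indicator


inferred = make_indicator(["malicious-activity"], "[ipv4-addr:value = '198.51.100.1']")
tagged = make_indicator(["malicious-activity"], "[ipv4-addr:value = '198.51.100.1']", lang="ja")
```

Timestamps and identifiers can be generated mechanically; the language is the one value here that the program would have to ask for or guess.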
Also, a field doesn’t need to be required if we define how consumers should
behave when it’s missing; in this case, saying that the language is “undefined”
or “unspecified” is likely OK, particularly given that “unspecified” is OK
for machine-to-machine communication that doesn’t involve humans. This
is the reason I’ve always felt “modified” should be optional; IMO it’s
perfectly reasonable to mandate that, if not explicitly specified in JSON,
consumers MUST assume it was last modified at the “created” date.
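A sketch of that consumer-side rule, assuming the spec defined defaults for omitted properties. `normalize` is a hypothetical helper, and representing an unspecified language as `None` is a choice made for this example only.

```python
def normalize(obj):
    """Return a copy of obj with consumer-side defaults applied."""
    result = dict(obj)
    # A missing "modified" is read as "last modified at the created date".
    result.setdefault("modified", result["created"])
    # A missing "lang" stays unspecified (None) instead of being guessed.
    result.setdefault("lang", None)
    return result


# Placeholder id for illustration only.
received = {
    "type": "indicator",
    "id": "indicator--example",
    "created": "2017-02-24T12:00:00.000Z",
}
normalized = normalize(received)
```

The producer sends fewer properties and every consumer still arrives at the same interpretation, because the default is defined once in the spec and not left to each implementation.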
Greg
From: <cti@lists.oasis-open.org>
on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Date: Friday, February 24, 2017 at 7:15 AM
To: Allan Thomson <athomson@lookingglasscyber.com>
Cc: Bret Jordan <Bret_Jordan@symantec.com>,
John Wunder <jwunder@mitre.org>,
"cti@lists.oasis-open.org"
<cti@lists.oasis-open.org>
Subject: Re: [cti] Internationalization: lang field required or optional?
I also agree with Allan and John in the preference to make this optional.
In general I do not like sending bytes when bytes are not required in a
data interchange format, especially when considering the scale of data
we will be dealing with in STIX/TAXII. We should be looking for opportunities
to keep the data format trim. Truthfully, the vast majority of data in
an ecosystem will all be the same language, and thus having to transmit
a language tag for every single object in a package is redundant information.
There is also another issue with making it "required", and that
is that we would then have to support "unknown" or "undefined"
- which many products would have to mark content as since they may not
know the producer of the content's native language. There is an ISO
639 language tag for "undefined", but there is no IETF tag for
"undefined" in the IANA registry, they never adopted the ISO
entry. So making this mandatory may force a revisit of the RFC 5646 decision.
-
Jason Keirstead
From: Allan Thomson <athomson@lookingglasscyber.com>
To: Bret Jordan <Bret_Jordan@symantec.com>,
"Wunder, John A." <jwunder@mitre.org>,
"cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 02/23/2017 07:01 PM
Subject: Re: [cti] Internationalization: lang field required or optional?
Sent by: <cti@lists.oasis-open.org>
If you are expecting to use different language content then it's required
for interoperability reasons.
But marking it required in the spec means that all content must have
it even when most content is not multi-language.
I generally would prefer more tolerance in the spec level and let the products/market
use good behavior to drive what fields are included or not.
If people care about language and multi-language support then they will
use it. If they don’t then they won’t be interoperable as that will be
part of the test in the interop spec.
allan
From: Bret Jordan <Bret_Jordan@symantec.com>
Date: Thursday, February 23, 2017 at 2:04 PM
To: Allan Thomson <athomson@lookingglasscyber.com>,
"Wunder, John" <jwunder@mitre.org>,
"cti@lists.oasis-open.org"
<cti@lists.oasis-open.org>
Subject: Re: [cti] Internationalization: lang field required or optional?
My thoughts....
1) In reality we are talking about a feature not a property.
2) If the property for this feature is optional, then the only products
that will implement this feature are those that care about internationalization.
3) If it is required, then everyone will be forced to implement it.
Personally I see this as a data quality issue, not a STIX issue. And
I think both sides can suffer from it.
Problems with Required:
a) product or tool does not care, does not provide a UX for it, and just
hard codes it to something, say "en"
b) product or tool does provide a UX for it, but analyst does not care
and it just remains whatever the default is.
Problems with Optional:
a) product or tool does not care, does not provide a UX for it, and just
leaves it out of the data. So it is undef.
b) product or tool does care and provides a UX for it and the analyst does
not care and leaves it blank.
c) Broker product or tool takes in data that has a lang tag, but they do
not support that feature so they never implemented it. So when the
data goes back out the other side, the language tag is now missing.
I personally do not see the harm in requiring tools to support and populate
the Lang tag. In the spec we can define an "unknown" value,
so if you are doing bulk loading of data and you honestly do not know the
language, you could just flag it as "unknown". Then at
least as the consumer you would know that the producer did not know the
language. Versus getting an object where the language tag is omitted
and you do not know if:
i) they did not know the language
ii) their tool did not support it
iii) they were just lazy and did not add it.
Once again, this is a data quality problem and if we make the lang field
required, then it is a SUPER EASY interop test to see if they do it right.
If it is optional, then you are just at a guess all the time.
Bret
From: cti@lists.oasis-open.org <cti@lists.oasis-open.org>
on behalf of Allan Thomson <athomson@lookingglasscyber.com>
Sent: Thursday, February 23, 2017 2:29:59 PM
To: Wunder, John A.; cti@lists.oasis-open.org
Subject: Re: [cti] Internationalization: lang field required or optional?
Prefer optional.
From: "cti@lists.oasis-open.org"
<cti@lists.oasis-open.org>
on behalf of "Wunder, John" <jwunder@mitre.org>
Date: Thursday, February 23, 2017 at 12:59 PM
To: "cti@lists.oasis-open.org"
<cti@lists.oasis-open.org>
Subject: [cti] Internationalization: lang field required or optional?
Hey everyone,
We’re getting very close to having a completed approach for internationalization,
you can see the full writeup here: https://docs.google.com/document/d/15qD9KBQcVcY4FlG9n_VGhqacaeiLlNcQ7zVEjc8I3b4/edit#heading=h.61fy0hlsdirz
We do have one remaining question before we can move forward though. As
part of the proposal, every single top-level object has a “lang” field,
that identifies the language of the text content in that object. What we
need to decide is whether we make that field required or optional.
If we make the field required, every top-level object in STIX (SDOs and
SROs) would have to have a “lang” field in it or it would be invalid
STIX. If we make it optional, producers could either include the field
or not.
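The practical difference between the two options can be sketched as follows. `invalid_objects` is a hypothetical helper, the ids are placeholders, and the check covers only the `lang` rule, not the rest of STIX validation.

```python
def invalid_objects(objects, lang_required):
    """Return the ids of objects that fail the lang rule under the chosen policy."""
    if not lang_required:
        # Optional: a missing lang never makes an object invalid.
        return []
    # Required: any top-level object without lang is invalid STIX.
    return [o.get("id", "<no id>") for o in objects if "lang" not in o]


batch = [
    {"type": "indicator", "id": "indicator--a", "lang": "en"},
    {"type": "relationship", "id": "relationship--b"},
]

rejected_if_required = invalid_objects(batch, lang_required=True)
rejected_if_optional = invalid_objects(batch, lang_required=False)
```

Under the required policy the untagged relationship is rejected outright; under the optional policy the whole batch is accepted and the consumer has to cope with the missing tag.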
Here are some thoughts:
Making it required:
- All SDOs and SROs would have a language tag, so consumers could depend on it
being there
- It would encourage producers to actually fill it out, because they wouldn’t
be creating valid STIX otherwise
- It shows we have a commitment to internationalization
Making it optional:
- Any SRO or SDO could have a language tag, so consumers could not depend on
it
- Producers would not have to create it
- We do have a SHOULD requirement saying that it should be included
My opinion is that we should make it optional. If it’s required, I think
people who don’t want to do internationalization (especially those creating
one-off scripts or open source tools) will hardcode it to English and things
will be mislabeled. If it’s optional, I think those who need/want to support
internationalization and would do it right (most/all vendors, major open
source projects) will populate it correctly regardless…because they need
it…while those who couldn’t be bothered will be able to leave it off
and we won’t have mis-labeled data. Also it’s almost not worth saying,
but we already have a bunch of required fields on every SDO/SRO and I’ve
already had one conversation with someone who said there’s a lot of bloat…would
like to avoid adding to that.
Anyway, what does everyone think…required or optional?
John