Re: [cti-stix] STIX: Messaging Standard vs. Document Standard

John

On Nov 30, 2015, at 8:42 AM, Jason Keirstead <Jason.Keirstead@CA.IBM.COM> wrote:

+1 to all below recommendations... exactly my line of thinking.

It may or may not be more work to undertake these two parallel efforts - but I believe that it would allow both efforts to move forward in a faster and more coherent way than the current methodology.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown


From: "Baker, Jon" <bakerj@mitre.org>
To: Jason Keirstead/CanEast/IBM@IBMCA, "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>
Date: 11/30/2015 09:36 AM
Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard
Sent by: <cti-stix@lists.oasis-open.org>

+1

Thanks for thinking through the underlying issues that might be making it so hard to achieve consensus. I completely agree that trying to develop a messaging standard and a document standard in one effort is a significant source of frustration for this group. This is how I have thought about this issue:

STIX has two primary use cases
• UC1: Holistic cyber threat analysis
• UC2: Exchange cyber threat information
Requirements for UC1 are not always conducive to effective information exchange

My basic recommendation would be as follows:

Differentiate analysis and sharing requirements
• avoid overloading analysis model with exchange requirements
• avoid overloading exchange with analysis requirements

Develop a high level model of cyber threat intelligence for analysis
• initially in UML, but a semantic representation can be developed

Develop messages tailored to information exchange needs
• each exchange has a formal specification
• ensure messages are compatible with the analysis model
• allow protocol and serialization to be dictated by information exchange needs
• initially specify only a few well known and well defined messages
• plan for many messages, but add messages over time as real needs are understood
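
One way to picture "each exchange has a formal specification" and "messages are compatible with the analysis model" is the rough sketch below. It is only an illustration: the class IndicatorModel, the spec INDICATOR_MESSAGE_SPEC, and every field name in it are hypothetical and are not taken from STIX or from any proposal in this thread.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class IndicatorModel:
    """Hypothetical slice of an analysis model; all field names are invented."""
    id: str
    pattern: str
    confidence: Optional[str] = None
    related_campaigns: tuple = ()
    analyst_notes: Optional[str] = None


# Hypothetical formal spec for one exchange: field name -> is it required?
INDICATOR_MESSAGE_SPEC = {"id": True, "pattern": True, "confidence": False}


def incompatible_fields(message_spec, model_cls):
    """Return spec fields that the analysis model does not define."""
    model_fields = {f.name for f in fields(model_cls)}
    return sorted(set(message_spec) - model_fields)


def validate(message, message_spec):
    """Return (missing required fields, fields the spec does not allow)."""
    missing = [name for name, required in message_spec.items()
               if required and name not in message]
    unknown = [name for name in message if name not in message_spec]
    return missing, unknown


# The message spec only uses fields the model knows about.
print(incompatible_fields(INDICATOR_MESSAGE_SPEC, IndicatorModel))
# A message that omits a required field and carries an analysis-only field.
print(validate({"id": "indicator-1", "analyst_notes": "draft"},
               INDICATOR_MESSAGE_SPEC))
```

The message carries a small, fixed subset of the model, so the exchange format can stay lean while the analysis model grows independently.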

Thanks,

Jon

============================================
Jonathan O. Baker
J83D - Cyber Security Partnerships, Sharing, and Automation
The MITRE Corporation
Email: bakerj@mitre.org

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Thursday, November 26, 2015 8:47 AM
To: cti-stix@lists.oasis-open.org
Subject: [cti-stix] STIX: Messaging Standard vs. Document Standard

When I originally started this message, I had started it with a "here is why I am against JSON-LD" stance, but then decided to take a step FAR BACK and try to figure out / tease apart the fundamental reasons why people are both for and against JSON-LD. As a result of my analysis, I think I am starting to figure out why there are two diametrically opposed camps here.

The root, I believe, is that there is a fundamental disconnect between an ideal messaging standard and a document standard, yet STIX is trying to serve both masters. I am not sure that it can and still keep everyone happy. At any rate, I hope that if everyone can read through the below, it will at least help each camp start to see the other's point of view.

Things desired in a document standard:

- Clarity of the source and meaning of the data
- Readability by humans can sometimes be a factor depending on use cases
- Byte-efficiency is a secondary or tertiary concern (disk is cheap)

In a document standard, it is now the standard practice that the schema accompanies the document. This is the core tenet of JSON-LD and other related semantic technologies - that your data is annotated in a way such that it can be linked back to the schema that defined it, which then also allows you to infer the semantic meaning behind fields in the document. This lets people and systems cross-correlate and search documents of different types that contain fields that are related semantically, without having to have standard-specific code written for them.
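
A minimal sketch of that cross-correlation idea, under loud assumptions: the vocabulary URL and all field names are made up, and expand() is a hand-rolled simplification that only handles a flat "@context" mapping, not the real JSON-LD expansion algorithm.

```python
# Hypothetical vocabulary; these IRIs are not real STIX or JSON-LD terms.
VOCAB = "http://example.org/cti#"

# Two documents of different types. Their local field names differ,
# but each carries a context that links the names back to shared terms.
indicator_doc = {
    "@context": {"ip": VOCAB + "ipv4Address", "seen": VOCAB + "lastSeen"},
    "ip": "198.51.100.7",
    "seen": "2015-11-26T08:47:00Z",
}
firewall_doc = {
    "@context": {"src": VOCAB + "ipv4Address", "action": VOCAB + "firewallAction"},
    "src": "198.51.100.7",
    "action": "deny",
}


def expand(doc):
    """Replace each short key with the IRI its @context maps it to."""
    context = doc.get("@context", {})
    return {context.get(key, key): value
            for key, value in doc.items() if key != "@context"}


def shared_values(a, b):
    """Return {iri: value} for fields with the same meaning and same value."""
    ea, eb = expand(a), expand(b)
    return {iri: ea[iri] for iri in ea.keys() & eb.keys() if ea[iri] == eb[iri]}


matches = shared_values(indicator_doc, firewall_doc)
print(matches)
```

Neither function knows anything about indicators or firewall logs; the match on "ip" and "src" comes entirely from the annotations that travel with each document.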

Things desired in a messaging standard:

- Maximum byte efficiency (bandwidth is not cheap)
- Absolutely zero ambiguity
- Readability by humans is a secondary (or tertiary) concern, sometimes not a concern at all

In a messaging standard, the schema has no reason to accompany the message, because anyone who implements it would have zero ambiguity anyway, and doing so greatly inflates the size of the messages. You also don't have to infer meaning of a field in a messaging standard, because the meaning is fixed and is not open to any interpretation. As such, semantic technologies are not required in a messaging standard, because they aren't even applicable to the use case.
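
To make the size argument concrete, here is a rough sketch that serializes the same observation twice: once with a schema link attached and once as a bare message. The field names and the example.org vocabulary are assumptions made up for illustration, and the exact byte counts depend entirely on those choices.

```python
import json

# Document form: the data carries a context linking each field to a schema.
# Hypothetical field names and vocabulary URL, for illustration only.
document_form = {
    "@context": {
        "type": "http://example.org/cti#type",
        "value": "http://example.org/cti#value",
        "observed": "http://example.org/cti#observedAt",
    },
    "type": "ipv4",
    "value": "198.51.100.7",
    "observed": "2015-11-26T08:47:00Z",
}

# Message form: both ends already agree on what every field means,
# so nothing but the values themselves goes on the wire.
message_form = {key: value for key, value in document_form.items()
                if key != "@context"}


def wire_size(obj):
    """Size in bytes of the compact UTF-8 JSON encoding of obj."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


doc_bytes = wire_size(document_form)
msg_bytes = wire_size(message_form)
print("document form:", doc_bytes, "bytes")
print("message form: ", msg_bytes, "bytes")
```

For a single small observation the schema annotations outweigh the payload, which is the overhead a high-volume messaging use case would pay on every message.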

The root of our problem here, and I believe the reason we cannot come to consensus, is that we are trying to come up with one standard that does both things, which are actually philosophically opposed to each other. There is an extremely large community of people and systems who want to "speak STIX" but have no plans to STORE STIX, and thus could not care less about semantic representations. Similarly, there is a large community of people and systems who want (and already have) large STIX warehouses, and very much care about semantic representations, so that they can tie that data to other systems.

Maybe we should take a step back and look at this more critically. If you look at what people care about from a "frequently messaged" perspective (namely indicators and observable occurrences), maybe that should be moved under TAXII? Currently, TAXII is just a transit protocol and the standard for the messages is simply "a STIX document". I am starting to think that this is not enough and that it's part of why we can't reach any consensus. There is no reason that there could not be a messaging format in TAXII to communicate indicators and observables that was an offshoot of STIX but not STIX itself... meanwhile there could continue to be a channel for full/complete "STIX documents", which are transmitted with much less frequency.
-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown
