

Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard


Bret,

I will answer your questions below [cbc], but perhaps we will then agree to disagree and let the process work. I doubt more needs to be said.

-Cory

 

 

 

From: Jordan, Bret [mailto:bret.jordan@bluecoat.com]
Sent: Monday, November 30, 2015 7:13 PM
To: Cory Casanave
Cc: Jason Keirstead; Richard Struse; cti-stix@lists.oasis-open.org; Wunder, John A.
Subject: Re: [cti-stix] STIX: Messaging Standard vs. Document Standard

 

1: Definition method

Bret: The specification is English prose.

Cory: The specification is a machine readable model that includes English prose.

 

How is this an issue?  The generality that there will be problems is vague, and I am not sure how it applies to this specification.

[cbc] Well, it has been a huge issue in my own attempt to understand STIX and map it. Some things still make no sense. It is well documented that there are multiple ways to say the same thing – will all the implementations work together? Perhaps I can find some time to document some of the “WTF” questions that came up as I looked at STIX-1. When prose specifications are interpreted differently you get very expensive and hard-to-resolve BUGS. You get systems under the same standard that don’t work together. To me, this is a problem.

 

When STIX moves to Cap'n Proto (i.e., binary) there will be no more English field names,

[cbc] English field names are not required. What is required is a programmatic way to go from instance to specification. This can be done in binary.

so how is this an issue? HTTP and HTML being English-centric seem to have worked well.  A specification is a specification.  Building unit tests to test compliance is a relatively easy thing to do.

[cbc] Wow, you must be really good. Compliance has been hard for most specifications.
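A compliance check of the sort Bret describes can be sketched in a few lines; the field names and the rule set below are hypothetical stand-ins for illustration, not the actual STIX specification:

```python
# Minimal conformance-check sketch. The type names ("indicator") and required
# fields ("type", "id") are hypothetical, not taken from the STIX spec.

REQUIRED = {"indicator": {"type", "id"}}

def check_conformance(obj: dict) -> list[str]:
    """Return a list of conformance errors; an empty list means the object passes."""
    errors = []
    kind = obj.get("type")
    if kind not in REQUIRED:
        errors.append(f"unknown type: {kind!r}")
        return errors
    for field in REQUIRED[kind] - obj.keys():
        errors.append(f"missing required field: {field}")
    return errors

print(check_conformance({"type": "indicator", "id": "indicator-1"}))  # []
print(check_conformance({"type": "indicator"}))
```

Whether a rule table like this can capture everything a prose specification implies is, of course, exactly the point under debate.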

This will guarantee interoperability.  

[cbc] Trouble is, it does not. Interoperability is hard.

And if one vendor's product (Internet Explorer 6) comes out that breaks the ecosystem, then consumers should not buy that product and should force that vendor to change.

 

2: Schema production

Bret: The field names and structure are hand crafted.

Cory: The field names and structure are produced from the model.

 

Organizations and development shops will always produce their own APIs to generate STIX content.  

[cbc] That would be unfortunate for wide-scale adoption.

 

Some may use community-built modules / APIs, depending on the licensing and intellectual property aspects.  It is very easy to build compliance and unit tests to verify that what someone produces will match the specification.

[cbc] So do we have that for STIX 1? Ask OASIS about the ease of conformance suites.

 

STIX is not that big.  

[cbc] STIX and all it imports is thousands of terms. What is big to you? Or, are you assuming a much reduced scope? If so, the scope question should be #1!

 

I built an API to do all of the indicators and TTP stuff in a few days.  I would argue that the best thing we could do would be to present a text document from the UML that listed out each field name by idiom.  Then developers can just copy and paste the entire list.  This way there will be no typos.  But once again, a simple unit test will pick up any issues.

[cbc] I think we have it on a key point – “idioms”. Idioms are examples, not specifications. Coding to an idiom would be very fragile and would then not interoperate with others who coded to other idioms that utilize the same or overlapping data.

By the way, since you will copy/paste the field names, I’m not sure why the introduction of a namespace prefix is such an issue; it would have zero development cost and inconsequential runtime overhead.

 

 

3: Namespaces

Bret: The tag names in the data are implicitly mapped to the schema by name

Cory: The tag names are explicitly mapped to their schema and definition by name and explicit namespace

 

I disagree.  In the UML it is very easy to see in the 20 items for each idiom if we have re-used the same name more than once.  

[cbc] Again, Idioms are irrelevant. We need to look at all the terms that could be used in any STIX message. I agree it is easier to see in UML. So it is easier to get agreement on the content.

 

Once again, we are trying to solve a problem that is not there.  Using the same name for a field in a different idiom is not an issue.  Higher-level code will easily handle this; vendors and developers map those data fields into their own datasets and then do something with them. Namespaces allow people to artificially extend a schema and do things that will BREAK compatibility.

[cbc] Interesting assertion. I don’t see how namespaces allow people to break interoperability. Namespaces provide for interoperability. My guess is you are postulating externally introduced namespaces? It is up to the policy of the specification as to the extensibility of new namespaces. I would suggest that some (controlled) extensibility is required for agility. But that is a choice independent of namespaces. CTI could forbid any new namespaces.
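To illustrate the disambiguation namespaces buy: two vocabularies can both define the same field name, and a prefix is what keeps the meanings distinct. The prefixes, field names, and IRIs below are made up for illustration, not actual STIX vocabulary:

```python
# Hypothetical example: two source vocabularies both define a "name" field.
# Namespace prefixes keep the two meanings distinct within one record.
record = {
    "ta:name": "APT-Example",     # "ta" = a made-up threat-actor vocabulary
    "tool:name": "scanner-9000",  # "tool" = a made-up tool vocabulary
}

# Resolving a prefixed key back to its defining vocabulary is mechanical:
PREFIXES = {
    "ta": "http://example.org/threat-actor#",
    "tool": "http://example.org/tool#",
}

def expand(key: str) -> str:
    """Map a prefixed field name to the full identifier of its definition."""
    prefix, _, local = key.partition(":")
    return PREFIXES[prefix] + local

print(expand("ta:name"))  # http://example.org/threat-actor#name
```

Whether the specification permits vocabularies beyond its own is a policy choice, as Cory notes; the mechanism itself is this simple.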

 

4: Variability

Bret: I am only concerned with a specific and very structured exchange schema.

Cory: There will be multiple patterns of exchange for different use cases based on the same underlying model.

 

Once again I disagree.  It is just as easy for me to fill out every field and send the blob of data as it is to only fill out one to three fields and send it.  I am not only concerned with sending minimal data.  If I send several blobs of data – some TTPs, some ThreatActors, some Indicators – receiving code can easily handle this by saying:

if type == "indicator" do foo

elif type == "ttp" do foo1

elif type == "threatactor" do foo2

etc.

One group may only be able to send indicators with certain data, and other vendors may be able to send something else.  Great, my code will consume and do things with all of it.

[cbc] And ignore what it doesn’t need, right? So what you are saying is that there is one large schema, no idioms and everything is optional? You may want to layer some required interaction profiles on top of that. In any case, CTI and the expectations of using it will change over time – better to plan for it.
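The dispatch pseudocode above translates directly into runnable code; a minimal sketch in Python, with placeholder handler names (the handler bodies are invented, the "type" values mirror Bret's example):

```python
# Sketch of the receiver-side dispatch described above. Handler bodies are
# placeholders; a table lookup scales better than an if/elif chain as the
# number of types grows.
def handle_indicator(blob): return "indicator handled"
def handle_ttp(blob): return "ttp handled"
def handle_threat_actor(blob): return "threat actor handled"

HANDLERS = {
    "indicator": handle_indicator,
    "ttp": handle_ttp,
    "threatactor": handle_threat_actor,
}

def dispatch(blob: dict) -> str:
    handler = HANDLERS.get(blob.get("type"))
    if handler is None:
        return "ignored"  # unknown types are skipped, per the discussion
    return handler(blob)

print(dispatch({"type": "ttp"}))  # ttp handled
```

Note the "ignored" branch: it embodies Cory's point that receivers silently drop what they do not understand, which is fine until expectations change over time.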

 

5: Development

Bret: All I need is a text editor and I will type in my implementation.

Cory: Reading, writing, mapping and even presenting the data will be heavily assisted with automation. Only special algorithms will be coded.

 

This is a problem that vendors will solve.  This is not a standards-track issue.  Vendors will produce neat and interesting tools that make use of the data.  The vendors that do the best job will make the most money and get the most sales.

[cbc] What you are suggesting is disenfranchising a large set of vendors that do not implement the way you do. It is up to the standard to provide the artifacts that enable a large community, not to presuppose particular implementation styles, idioms, and use cases.

 

To answer your question: I am not against a solid UML specification, or model, or whatever you call it.  In my mind a UML model is a wonderful thing to have.  It makes it so much easier to learn and understand STIX.  When I first started playing with STIX, I built my own UML model as there wasn't one.  I needed to do that to make heads or tails of what was going on.  So yes, we need a UML specification / model.

 

Where I believe we fundamentally disagree is on the idea of code writing itself and auto-generating itself.  

[cbc] So you work in machine code, no compilers? No virtual machines? No code gen from schema? No visualization tools? No analytics engines? No mapping tools? How are things in the ’60s?

So how about this: we agree on a small subset model and a JSON representation of it. We then see if that can be generated; if so, there should be no issue.

 

Some people may use this, but this is NOT a requirement for the standard, IMHO.  

[cbc] Again, it is for a standard that enables a larger community.

 

A nice and clean UML specification

[cbc] OK, let’s start on that now and stop spending so much time on one of multiple syntaxes.

 

and a super easy-to-implement binding in JSON is all we need at this point.

[cbc] I really want you to have that as well!

Long term I see the need for moving to a binary representation, say Cap'n Proto, but that will be 3-5 years from now if we are successful.

 

 

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

On Nov 30, 2015, at 16:35, Cory Casanave <cory-c@modeldriven.com> wrote:

 

Bret,

Let’s start with points of assumed agreement:

1: We are specifying the fields and their data, exactly and in great detail in the spec

So there is a specification in great detail. 

 

2: we specify things out and give things field names that map back to a specification or as I believe you call it a model

There is a consistent and precise way to “link back” to the specification from instance data.

 

3: The world will not spin out of control either way; our choices will impact the time, cost, and prevalence of adoption.

 

Now let’s explore the possible differences; my understanding is:

 

1: Definition method

Bret: The specification is English prose.

Cory: The specification is a machine readable model that includes English prose.

 

My experience with interoperability specifications and standards (of which I have worked on several) is that true interoperability is very difficult: achieving it between independently developed systems, without point-to-point testing, is the exception rather than the rule. English prose is not well suited to the task, and misunderstandings and errors are common. The authors and those who produced the specification assume much more than the newbie reader will understand or even agree with. Models, and even extended semantics, provide more precision, and that precision matters. What is then very important is that implementations can be TESTED against their specification – this provides for interoperability.

Secondly, when the analysis is done from a model you get a cleaner and more easily understood result, every time I have seen it. This is because you start by looking at the problem domain, not the syntax.

 

 

2: Schema production

Bret: The field names and structure are hand crafted.

Cory: The field names and structure are produced from the model.

 

What makes something easy to implement is consistency; hand-crafted schemas tend to have inconsistencies that cost (more code to type) and introduce errors. Where there are multiple serialization formats it is much easier to maintain the single source of truth in the model and generate the schemas. Models produced as an afterthought add little.
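The single-source-of-truth argument can be made concrete with a toy generator. The model content below is invented for illustration; the point is that both serializations are derived from one model, so they cannot drift apart:

```python
# Toy model-to-schema generation: one in-memory "model" drives both a JSON
# property list and an XML Schema-style element list. The type and field
# names here are invented, not taken from the actual STIX model.
MODEL = {
    "Indicator": ["id", "title", "valid_from"],
}

def json_properties(type_name: str) -> dict:
    """Derive a JSON-schema-like property map from the model."""
    return {field: {"type": "string"} for field in MODEL[type_name]}

def xml_elements(type_name: str) -> list[str]:
    """Derive XML-schema-style element declarations from the same model."""
    return [f"<xs:element name={field!r}/>" for field in MODEL[type_name]]

print(json_properties("Indicator"))
print(xml_elements("Indicator"))
```

Real generation from UML involves far more (types, cardinalities, documentation), but the principle is the same: edit the model once and regenerate, rather than hand-editing each schema.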

 

3: Namespaces

Bret: The tag names in the data are implicitly mapped to the schema by name

Cory: The tag names are explicitly mapped to their schema and definition by name and explicit namespace

 

Name conflict is a problem with any large schema. STIX is large and composed from multiple sources that will have duplicate names. The chance of error is high without namespace references. Of course, any automated system has to know how to make this trace.

 

4: Variability

Bret: I am only concerned with a specific and very structured exchange schema.

Cory: There will be multiple patterns of exchange for different use cases based on the same underlying model.

 

The simple example is not representative – there are several thousand terms in STIX. Different use cases will combine these in interesting ways and may or may not combine referenced elements in a single message. A very static structure prevents the kind of agility a community like this needs.

 

5: Development

Bret: All I need is a text editor and I will type in my implementation.

Cory: Reading, writing, mapping and even presenting the data will be heavily assisted with automation. Only special algorithms will be coded.

 

Some of us will be using automation and in some cases dynamically processing data purely based on its runtime specification. Having models and references works just fine for hand coded solutions (I suggest it is cheaper to implement due to the above). A text document with tags produces data that is unusable by such automated systems. Do we want to enable or disable the automation and semantic community?

 

So my question back to you: why the fear of having a model? Is the earth going to “spin out of alignment” by taking a proven, standards-based path of modeling our data as a basis for interoperability? My assertion is that it will be faster to get to a quality specification, easier to understand, and cheaper to implement – all ingredients for wide-scale adoption. Even if you don’t agree with this, it should be mostly harmless.

 

-Cory

 

From: Jordan, Bret [mailto:bret.jordan@bluecoat.com] 
Sent: Monday, November 30, 2015 5:56 PM
To: Cory Casanave
Cc: Jason Keirstead; Richard Struse; cti-stix@lists.oasis-open.org; Wunder, John A.
Subject: Re: [cti-stix] STIX: Messaging Standard vs. Document Standard

 

Cory,

 

I fundamentally do not understand this constant view of not understanding the data without a model...  I can see that if we allow people to arbitrarily create any type of document, say HTML pages, then yes, you need something to understand the data programmatically.  But we are not doing that.  We are specifying the fields and their data, exactly and in great detail, in the spec.

 

Further, if we specify things out and give things field names that map back to a specification, or as I believe you call it, a model, then how is it that people will NOT understand the data I send them?

 

I just do not get it.  If I do something like, assuming all of these fields are defined in the spec:

 

{
  "type": "indicator",
  "related_ttp": "foo-bar-ttp-1234-12345-12345-1234",
  "badness": "true",
  "ip4_address": "8.1.2.3"
}

 

How is that not clear as to what I am saying?  All of the fields are defined in the spec and the spec calls out what they mean.  Call it a model, an ontology, or whatever.  How is it not clear?  Can they not relate this indicator with the TTP?  Can they not understand this programmatically and do something with it?  How is this broken?

 

I am tired of the circular debates and assertions that somehow we are missing something and the sky is going to fall and the earth is going to spin out of alignment.

Thanks,

 

Bret

 

 

 

Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO

Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050

"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

 

On Nov 30, 2015, at 14:03, Cory Casanave <cory-c@modeldriven.com> wrote:

 

Jason,

One response then I will shut up (don’t like the long threads!)

 

Re: Using a complex semantic model does not fulfil a messaging use case any more than sending ODF documents over a wire

We absolutely agree, we don’t want a complex semantic model or ODF document. We want a simple machine readable artifact that reflects and defines the terms and concepts of Cyber information.

 

Re: You are falsely assuming the "newbie" is always going to be interested in the CTI fundamentals, vs. simply trying to add messaging around something

This is where we just don’t connect. Somewhere, the precise meaning of those messages will be defined. I call that thing the model. A message syntax without knowing what it means is just blabber, and the communications are not interoperable. I’m not sure what you mean by “CTI fundamentals” – I’m talking about how to interpret the message.

 

Re: it fundamentally can't, and not just because of complexity, it is because the use cases for the data are intrinsically misaligned with each other.

Exactly – if the sender and receiver of a “document” don’t know what it means, they are “intrinsically misaligned with each other”.

Also, use cases are great for discovering patterns of interaction, but you can’t assume that this set is complete – newbies will have their own use cases; they JUST NEED TO UNDERSTAND YOUR DATA. Even with the same use case, interoperability happens when you understand the message of another.

 

We don’t need to make this complex, just clear, consistent and machine readable.

 

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Monday, November 30, 2015 3:40 PM
To: Cory Casanave
Cc: Jordan, Bret; Richard Struse; cti-stix@lists.oasis-open.org; Wunder, John A.
Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard

 

This is exactly my point. You are falsely assuming the "newbie" is always going to be interested in the CTI fundamentals, vs. simply trying to add messaging around something (i.e., sightings) to an existing product (that actually may or may not directly relate to security).

Many products need to "speak STIX" without being concerned with the model. It doesn't have to do with being "confusing" or "simple"; it has to do with fulfilling the messaging use case. Using a complex semantic model does not fulfil a messaging use case any more than sending ODF documents over a wire could fulfil an IM use case – it fundamentally can't, and not just because of complexity; it is because the use cases for the data are intrinsically misaligned with each other.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown 



From: Cory Casanave <cory-c@modeldriven.com>
To: "Jordan, Bret" <bret.jordan@bluecoat.com>, Jason Keirstead/CanEast/IBM@IBMCA
Cc: Richard Struse <Richard.Struse@HQ.DHS.GOV>, "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>, "Wunder, John A." <jwunder@mitre.org>
Date: 11/30/2015 04:14 PM
Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard
Sent by: <cti-stix@lists.oasis-open.org>





Re: There is no actual reason that indicator or sighting messages need to be a layer on top of the ontology. 

Think of the poor “newbie” coming to CTI as part of “widespread adoption”. This newbie may have a very different use case from what a few people on this list had in mind; this is their added value and reason for playing. They don’t know about the shortcuts that were made or why.

If the model is confusing, wrong, incomplete, or just weird from their perspective, implementation will be costly and error-prone. Brutal consistency and a clear relationship between the domain concepts in the model and the data schema will help reduce the time and cost of producing an interoperable implementation and validating it, resulting in wide-scale adoption. My concern is that “simple” is being interpreted for existing STIX experts, a very different group from our newbie.

You want wide adoption? Get the model right.

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jordan, Bret
Sent: Monday, November 30, 2015 12:35 PM
To: Jason Keirstead
Cc: Richard Struse; cti-stix@lists.oasis-open.org; Wunder, John A.
Subject: Re: [cti-stix] STIX: Messaging Standard vs. Document Standard


I agree with Jason. 

Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."

On Nov 30, 2015, at 08:35, Jason Keirstead <Jason.Keirstead@ca.ibm.com> wrote:

Precisely.

If we can agree on the below, then work on the standardization of messages can be done independently of the underlying model.

RE @Sean:
 However, I do not view these message specifications as an alternative or independent thing from the model/ontology. I would view them as a layer on top of the model/ontology that allows focused and explicit representation of a small subset of information from the model/ontology that is relevant for a given exchange use case.

I disagree here - this is why we are having such a hard time with the current paradigm. 

There is no actual reason that indicator or sighting messages need to be a layer on top of the ontology. They are for totally different use cases and can be developed completely independently.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown 



From: "Struse, Richard" <Richard.Struse@HQ.DHS.GOV>
To: Jason Keirstead/CanEast/IBM@IBMCA, "Wunder, John A." <jwunder@mitre.org>
Cc: "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>
Date: 11/30/2015 11:04 AM
Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard






So, what I think I’m hearing is that we envision a world where we define a serialization for STIX & CybOX (let’s assume in JSON) and implementations can exchange “documents” using the serialization of the complete data model (e.g. for communicating a new TTP for an existing threat actor). However, in addition to this, we might define/standardize specialized message exchanges for a set of common use cases such as indicator or indicator-sighting exchange. This would allow appliances, for example, to simply implement the use-case-specific message exchanges that make sense without having to implement the full STIX model.


As a result, I foresee implementations asserting what exchanges they support, perhaps as follows:

CTI-O-MATIC Threat Analysis Platform
STIX Exchange: SUPPORTED
Indicator Exchange: SUPPORTED
Indicator-Sighting Exchange: SUPPORTED
Etc.


ACME IDS 9000 Appliance
STIX Exchange: NOT SUPPORTED
Indicator Exchange: SUPPORTED
Indicator-Sighting Exchange: SUPPORTED
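The support matrix above is easy to operationalize; a toy sketch (product and exchange names are taken from the example above, but the capability-lookup logic is invented for illustration):

```python
# Sketch of the capability advertisement described above: each product
# declares which exchanges it supports, and two products can interoperate
# over the intersection of their declared capabilities.
CAPABILITIES = {
    "CTI-O-MATIC Threat Analysis Platform": {
        "stix-exchange", "indicator-exchange", "indicator-sighting-exchange",
    },
    "ACME IDS 9000 Appliance": {
        "indicator-exchange", "indicator-sighting-exchange",
    },
}

def common_exchanges(product_a: str, product_b: str) -> set[str]:
    """Exchanges two products can use with each other: the intersection."""
    return CAPABILITIES[product_a] & CAPABILITIES[product_b]

print(common_exchanges("CTI-O-MATIC Threat Analysis Platform",
                       "ACME IDS 9000 Appliance"))
```

How such capabilities would actually be advertised (e.g. through a discovery protocol) is a separate design question not settled in this thread.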


Does this make sense?

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Monday, November 30, 2015 9:47 AM
To: Wunder, John A.
Cc: cti-stix@lists.oasis-open.org
Subject: Re: [cti-stix] STIX: Messaging Standard vs. Document Standard
"What about a new TTP for an existing threat actor? I would not want to have to do an RDF-based exchange to share that type of information (still holding out hope for a reasonable JSON-LD approach) but I’m also not sure we can build messages to cover those use cases."

I believe you would indeed do a complex exchange for that. This is not a "messaging" use case; it is a "document share" use case. The difference in complexity between sharing TTP information and sharing sighting information is similar to emailing a Word document vs. engaging in an IM session. It's not the same.

My point is that the huge amount of third party vendors who want to "speak STIX" to communicate and/or absorb indicators, observables, and sightings, are not interested in use cases like "TTP for an existing threat actor". They don't have that information, and they can't act on that information. You aren't going to get TTP information out of an IPS, and you aren't going to send TTP information to an IDS or Firewall. But you will get Indicators and sightings from an IPS, and you will want to send observables to an IDS or Firewall.

These are the two different use cases - one that lends itself to a semantic model, and one that lends itself to a compact and coherent messaging format.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems

www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown 



From: "Wunder, John A." <jwunder@mitre.org>
To: "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>
Date: 11/30/2015 10:04 AM
Subject: Re: [cti-stix] STIX: Messaging Standard vs. Document Standard
Sent by: <cti-stix@lists.oasis-open.org>







So to be honest I’m not yet as convinced on this approach as all of you (sorry). I can definitely see the value of messages at the level of sightings and indicators but it seems to me like there’s a giant middle ground of use cases where we don’t want to define tightly-scoped messages but the document-based approach would still be a burden. For these cases I was hoping the JSON serialization of the full model would be used. 

For example, would we have a message to represent a new incident? What would the message semantics be? What about a new TTP for an existing threat actor? I would not want to have to do an RDF-based exchange to share that type of information (still holding out hope for a reasonable JSON-LD approach) but I’m also not sure we can build messages to cover those use cases.

Jason, Jon, Mark…what do you all think about that? Would we define messages for that? Would we have third-party messages (i.e. my app can define a non-standard CTI message based on the data model)? Would we just use RDF?

John

On Nov 30, 2015, at 8:42 AM, Jason Keirstead <Jason.Keirstead@CA.IBM.COM> wrote:
+1 to all below recommendations... exactly my line of thinking.

It may or may not be more work to undertake these two parallel efforts, but I believe that it would allow both efforts to move forward in a faster and more coherent way than the current methodology.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems

www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown 




From: "Baker, Jon" <bakerj@mitre.org>
To: Jason Keirstead/CanEast/IBM@IBMCA, "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>
Date: 11/30/2015 09:36 AM
Subject: RE: [cti-stix] STIX: Messaging Standard vs. Document Standard
Sent by: <cti-stix@lists.oasis-open.org>







+1

Thanks for thinking through the underlying issues that might be making it so hard to achieve consensus. I completely agree that trying to develop a messaging standard and a document standard in one effort is a significant source of frustration for this group. This is how I have thought about this issue:


STIX has two primary use cases:

- UC1: Holistic cyber threat analysis
- UC2: Exchange of cyber threat information

Requirements for UC1 are not always conducive to effective information exchange.

My basic recommendation would be as follows:

Differentiate analysis and sharing requirements:
- avoid overloading the analysis model with exchange requirements
- avoid overloading exchange with analysis requirements

Develop a high-level model of cyber threat intelligence for analysis:
- initially in UML, but a semantic representation can be developed

Develop messages tailored to information exchange needs:
- each exchange has a formal specification
- ensure messages are compatible with the analysis model
- allow protocol and serialization to be dictated by information exchange needs
- initially specify only a few well-known and well-defined messages
- plan for many messages, but add messages over time as real needs are understood


Thanks,

Jon

============================================
Jonathan O. Baker
J83D - Cyber Security Partnerships, Sharing, and Automation
The MITRE Corporation
Email: bakerj@mitre.org

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Thursday, November 26, 2015 8:47 AM
To: cti-stix@lists.oasis-open.org
Subject: [cti-stix] STIX: Messaging Standard vs. Document Standard
When I originally started this message, I had started it with a "here is why I am against JSON-LD" stance, but then decided to take a step FAR BACK and try to figure out / tease apart the fundamental reasons why people are both for and against JSON-LD. As a result of my analysis, I think I am starting to figure out why there are two diametrically opposed camps here.

The root, I believe, is that there is a fundamental disconnect between an ideal messaging standard and a document standard, yet STIX is trying to serve both masters. I am not sure that it can and keep everyone happy. At any rate, I hope that if everyone reads through the below, it will at least help each camp start to see the other's point of view.

Things desired in a document standard:

- Clarity of the source and meaning of the data
- Readability by humans can sometimes be a factor depending on use cases
- Byte-efficiency is a secondary or tertiary concern (disk is cheap)

In a document standard, it is now standard practice that the schema accompanies the document. This is the core tenet of JSON-LD and other related semantic technologies – that your data is annotated in a way such that it can be linked back to the schema that defined it, which then also allows you to infer the semantic meaning behind fields in the document. This lets people and systems cross-correlate and search documents of different types that contain fields that are related semantically, without having to have standard-specific code written for them.
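The "schema accompanies the document" idea is concrete in JSON-LD: a document carries an @context that maps its short field names to full IRIs identifying their definitions. A toy sketch of term expansion follows; the IRIs are invented, and a real JSON-LD processor does far more than this:

```python
# Minimal illustration of JSON-LD-style term expansion. The document's
# "@context" maps short keys to IRIs that identify their definitions.
# The IRI below is invented for illustration; this is not a full JSON-LD
# implementation, which handles nesting, @id, @type, and much more.
doc = {
    "@context": {"ip4_address": "http://example.org/cybox#ip4_address"},
    "ip4_address": "8.1.2.3",
}

def expand_terms(document: dict) -> dict:
    """Replace short field names with the identifiers their context defines."""
    context = document.get("@context", {})
    return {context.get(k, k): v
            for k, v in document.items()
            if k != "@context"}

print(expand_terms(doc))
```

This is exactly the trade-off the thread is debating: the expansion makes every field self-describing at the cost of shipping (or referencing) the context with every document.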

Things desired in a messaging standard:

- Maximum byte efficiency (bandwidth is not cheap)
- Absolutely zero ambiguity
- Readability by humans is a secondary (or tertiary) concern, sometimes not a concern at all

In a messaging standard, the schema has no reason to accompany the message, because anyone who implements it would have zero ambiguity anyway, and doing so greatly inflates the size of the messages. You also don't have to infer the meaning of a field in a messaging standard, because the meaning is fixed and is not open to any interpretation. As such, semantic technologies are not required in a messaging standard, because they aren't even applicable to the use case.

The root of our problem here, and I believe why we can not come to consensus, is that we are trying to come up with one standard that does both things, which are actually philosophically opposed to each other. There is an extremely large community of people and systems who want to "speak STIX" but have no plans to STORE STIX, and thus could not care less about semantic representations. Similarly, there is a large community of people and systems who want to have (and already have) systems with large STIX warehouses, and very much care about semantic representations, so that they can tie that data to other systems.

Maybe we should take a step back and look at this more critically. If you look at what people care about from a "frequently messaged" perspective (namely indicators and observable occurrences), maybe that should be moved under TAXII? Currently, TAXII is just a transit protocol, and the standard for the messages is simply "a STIX document". I am starting to think that this is not enough, and it's part of why we can't reach any consensus. There is no reason that there could not be a messaging format in TAXII to communicate indicators and observables that was an offshoot of STIX but not STIX itself... meanwhile there could continue to be a channel for full/complete "STIX documents" which are transmitted with much less frequency.
-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems

www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown

 


