Re: [Non-DoD Source] Re: [cti] Re: [EXT] Re: [cti] type changing from "o

Yes!!

The use cases Bret and Jeff outline here are only a few of dozens to hundreds of such use cases that demonstrate a need for observables to be full objects.

>>[terry] By tying an identifier to each observable, we effectively make each observable unique. A consumer doesn't want unique observables, but instead wants to use the observables as pivot points that they can use to relate other parts of Intel to.

I would suggest that the reality is actually the inverse of this statement. Tying an identifier to each observable makes each specification of an observable as a STIX object unique, but it lets relationship connections to and from that object be non-unique, since they can all use that same identifier.

By placing observables within a separate observed_data object with localized identifiers, we “make each observable unique” to a specific observed_data instance and unusable for reuse/relationship outside of that instance. A thousand observation instances of a single observable require a thousand unique specifications of the observable (one within each observed_data) within the thousand observation specifications, whereas observables as full objects enable the specification of a single observable instance and a thousand observations that reference that one instance.
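To make the thousand-observations point concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the dictionary shapes are simplified, and the `object_refs` field and the `domain-name--1` identifier are hypothetical placeholders, not anything taken from the current spec.

```python
# Hypothetical, simplified shapes for illustration; not the normative STIX schema.
DOMAIN = {"type": "domain-name", "value": "example.com"}


def embedded_observations(n):
    """Embedded model: every observation carries its own copy of the observable."""
    return [
        {
            "type": "observed-data",
            "id": f"observed-data--{i}",
            "objects": {"0": dict(DOMAIN)},
        }
        for i in range(n)
    ]


def referenced_observations(n):
    """Full-object model: one observable, n observations that point at its id."""
    observable = {**DOMAIN, "id": "domain-name--1"}  # placeholder id, not a real UUID
    observations = [
        {
            "type": "observed-data",
            "id": f"observed-data--{i}",
            "object_refs": [observable["id"]],  # hypothetical field name
        }
        for i in range(n)
    ]
    return observable, observations


def count_observable_specs(observations):
    """Count how many times an observable is spelled out inside the observations."""
    return sum(len(o.get("objects", {})) for o in observations)


embedded = embedded_observations(1000)
observable, referenced = referenced_observations(1000)
```

In the embedded case the observable is specified once per observation; in the referenced case it is specified once in total and every observation points at the same identifier.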

Embedding observables within a localizing object, rather than making them full objects, prevents the simple, exact pivoting (across observed_data instances, across bundles, across systems, across orgs, across sharing communities) asserted here as the goal. Bret’s post below gives a fairly simple illustration of this.

>>[terry] For example, if org A sends out a malware object describing that a domain name is used by malware, and org B sends out a domain name of a C2 server as being part of an incident, then a consumer TIP will extract the domain name and use that domain name as a node to relate the malware and incident together. If we then add an additional layer of object above that (observables as SDOs), then what do we actually gain?

What you gain is an enormous increase in simplicity, above and beyond the obvious advantage of “one way to do things” for objects. For one thing, you don’t have to “extract” the observable out of another object and create a new node (uniquely localized to that system and non-referenceable by anything else) plus a set of relationships to the other intel content, because the node and the relationships already exist. All you have to do is ingest the content and then assert a single relationship stating that the domain name observable node from Org A is same-as/equivalent-to the domain name observable node from Org B. Most orgs would want to keep this full set of info in their systems to maintain a contextual picture of provenance. Any organization that wants its repository as clean and sparse as possible could easily run a background service that looks for same-as/equivalent-to relationships, keeps a single instance, removes duplicates and adjusts relationships accordingly. Additionally, if this same domain name showed up tens, hundreds or thousands more times, handling the situation is just as simple as with two instances and does not require complex extraction or new node and relationship generation. There are numerous variations of use cases that derive from this simple one (whitelists, persistent object publishing within orgs & communities, etc.), all of which are vastly simpler when observables are full objects.
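As a rough illustration of the background service described above, the following Python sketch collapses nodes linked by a same-as relationship and re-points the remaining relationships at the surviving node. The relationship names (`uses`, `involves`, `same-as`), the identifiers, and the `merge_equivalents` helper are all assumptions made for this example, not part of any spec.

```python
def merge_equivalents(nodes, relationships):
    """Collapse nodes joined by 'same-as' and re-point all other relationships.

    nodes: iterable of node ids; relationships: iterable of (source, type, target).
    Returns (kept_nodes, kept_relationships) as sets.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, rel, dst in relationships:
        if rel == "same-as":
            a, b = find(src), find(dst)
            if a != b:
                # Keep the lexicographically smaller id as the single survivor.
                parent[max(a, b)] = min(a, b)

    kept_nodes = {find(n) for n in nodes}
    kept_rels = {
        (find(s), r, find(d)) for s, r, d in relationships if r != "same-as"
    }
    return kept_nodes, kept_rels


# Hypothetical content from Org A and Org B, plus one asserted equivalence.
nodes = {"domain-name--A", "domain-name--B", "malware--A", "incident--B"}
relationships = [
    ("malware--A", "uses", "domain-name--A"),
    ("incident--B", "involves", "domain-name--B"),
    ("domain-name--A", "same-as", "domain-name--B"),
]
kept_nodes, kept_rels = merge_equivalents(nodes, relationships)
```

The same function handles two duplicates or two thousand; nothing has to be extracted from inside another object first.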

>>[Jason] The problem is you are confusing the notion of an observable piece of data, and an observation of that data. Anyone who is storing the contents of observed_data as nodes in a graph, is not properly modeling the data because they are treating the observable as the observation, when they shouldn't be.

I fully agree with Jason that the heart of the matter here is a conflation of observables and observations.

That said, I would strongly disagree with his conclusions from this fact.

Observations are factual statements that some particular observables were seen at a particular time. They are concrete, non-abstract and bound in time.

Observables are characteristic descriptions of things that might be observed in a cyber context. They are abstract, non-concrete and not bound in time. This means that a given Observable object may characterize an observable that is useful and relevant across numerous contexts. It might be seen in numerous observations, it may be used to characterize particular infrastructure, to characterize particular malware class structure or behavior, serve as a basis for indicator specification, etc.

This sort of usefulness in relation to multiple other objects should be one of the primary factors in deciding if something should be a full object or not. Observables are a poster child for this sort of decision.

Constraining the specification and reference of abstract, reusable/pivotable information completely within other concrete, non-abstract structures limits/removes its utility. You lose most of its potential value and efficiency.

>>[Jason] Having UUIDs for every single piece of content in observed_data would serve no purpose at all.

On the contrary, having observables as full objects that can be specified and referenced independently is necessary to realize the vast majority of their potential as explained above.

None of this explanation and reasoning is new. All of this was presented and argued at length in the past.

I would suggest that all of it is as true and valid today as it was back then.

We have avoided proactively bringing this issue (what we consider the single most significant issue in STIX) up again to avoid drama.

I am only restating it here now as others have initiated the conversation and made assertions that we would strongly disagree or agree with.

We are hopeful that some new voices/perspectives in the TC along with the experiences of TC members trying to implement this in their orgs and trying to work through related issues within STIX over the past 1+ years may have evolved the aggregate understanding of this issue within the TC.

As I stated in a previous email, we firmly believe that this will have to be revisited and revised at some point such that observables are full objects.

We would love such a decision to be made sooner rather than later for all our sakes.

Sean Barnum

Principal Architect

FireEye

M: 703.473.8262

E: sean.barnum@fireeye.com

From: Bret Jordan <Bret_Jordan@symantec.com>
Date: Wednesday, October 4, 2017 at 1:28 AM
To: "Mates, Jeffrey CIV DC3\DCCI" <Jeffrey.Mates@dc3.mil>, Terry MacDonald <terry.macdonald@cosive.com>, Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Cc: CTI TC Discussion List <cti@lists.oasis-open.org>, Andras Iklody <andras.iklody@circl.lu>, Sean Barnum <sean.barnum@FireEye.com>
Subject: Re: [Non-DoD Source] Re: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects

Good point, Jeff. I was also thinking of this case: I find that the domain example.com maps to 1.2.3.4, so I issue an Observed_Data blob with the Observables nested together in their dictionary. Then next week you find that example.com is now using 2.3.4.5, so you issue your own Observed_Data object. Then next month Sarah finds that example.com maps to 3.4.5.6 and she emits her own Observed_Data blob.

So now in my graph, what am I to do with these STIX objects? I want to link a SUB element of one Observed Data blob to SUB elements of different Observed Data blobs.

The only way I can see how to make this work is to throw away all of the original produced objects and just keep their insides and make my own new object. But then when I want my Threat Actor to point to just Example.com or just to IP address 2.3.4.5, then I guess I need to break the Observed Data object apart once more. Then how do I store it... So should I store Example.com in one Observed Data object and the IPs that it has been known to use in a different one? Then how do I relate them together? Also, how do I say that those three IP addresses were only used during these specific times with that example.com????
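For contrast, here is a minimal Python sketch of how the example.com case might look if the domain and the IPs were standalone objects that each observation simply references. Everything in it is an assumption for illustration: the identifiers are placeholders, the field names (`domain_ref`, `ip_ref`, `first_seen`) are hypothetical, and the dates are made up to match "this week, next week, next month".

```python
from datetime import date

# Standalone observables with placeholder ids (hypothetical, not real STIX ids).
observables = {
    "domain-name--1": {"type": "domain-name", "value": "example.com"},
    "ipv4-addr--1": {"type": "ipv4-addr", "value": "1.2.3.4"},
    "ipv4-addr--2": {"type": "ipv4-addr", "value": "2.3.4.5"},
    "ipv4-addr--3": {"type": "ipv4-addr", "value": "3.4.5.6"},
}

# Three observations from three producers, all pointing at the shared domain node.
observations = [
    {"producer": "Bret", "first_seen": date(2017, 10, 4),
     "domain_ref": "domain-name--1", "ip_ref": "ipv4-addr--1"},
    {"producer": "you", "first_seen": date(2017, 10, 11),
     "domain_ref": "domain-name--1", "ip_ref": "ipv4-addr--2"},
    {"producer": "Sarah", "first_seen": date(2017, 11, 4),
     "domain_ref": "domain-name--1", "ip_ref": "ipv4-addr--3"},
]


def resolutions_for(domain_value):
    """Return (date, ip) pairs for a domain, oldest first, by following references."""
    hits = []
    for obs in observations:
        if observables[obs["domain_ref"]]["value"] == domain_value:
            hits.append((obs["first_seen"], observables[obs["ip_ref"]]["value"]))
    return sorted(hits)


def producers_mentioning(observable_id):
    """List the producers whose observations reference a given observable id."""
    return [
        o["producer"]
        for o in observations
        if observable_id in (o["domain_ref"], o["ip_ref"])
    ]
```

A Threat Actor relationship could then target `domain-name--1` or `ipv4-addr--2` directly, while the time-bounded "which IP, when" information stays in the observations.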

Bret

From: Mates, Jeffrey CIV DC3\DCCI <Jeffrey.Mates@dc3.mil>
Sent: Tuesday, October 3, 2017 3:15:28 PM
To: Terry MacDonald; Jason Keirstead
Cc: CTI TC Discussion List; Bret Jordan; Andras Iklody; Sean Barnum
Subject: RE: [Non-DoD Source] Re: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects

One of the challenges with the current multilayered model is knowing exactly which object to use and how to break up the components of each object. For example, let’s say I sandbox some malware and observe it beaconing to two domains that I determine resolve to a single IP address. I have a few options to record this right now:

1. Group it Together
   - Put everything into a single “samples” block within a Malware object.

2. Split Out Infrastructure
   - Put the malware’s file information in the “samples” block of a Malware object.
   - Put each IP-to-domain resolution in a separate Infrastructure object as part of its “observable_details”.

3. Split Out Infrastructure and Observation
   - Put the malware’s file information in the “samples” block of a Malware object.
   - Put each IP in a separate Infrastructure object as part of its “observable_details”.
   - Put each resolution between a domain and an IP into an Observed Data object in its “objects” field.

4. Do any mix of the above.

All of these are valid options, which can be deduplicated into the same final meaning, but it becomes a lot harder to do once we try to connect these items to other STIX content. For example, if I were to say that a Threat Actor owned an IP, I would need to use #3, because otherwise I would be incorrectly asserting a relationship with a domain or file.
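The Threat Actor problem can be sketched in a few lines of Python. This is a simplified illustration under assumptions: “samples” and “observable_details” are the field names used above, but the object shapes, the file name, the domain names and the `observable_types_in` helper are made up for the example.

```python
# Hypothetical, simplified objects; only the field names come from this thread.
option_1_malware = {
    "type": "malware",
    "samples": [
        {"type": "file", "name": "sample.exe"},
        {"type": "domain-name", "value": "a.example.com"},
        {"type": "domain-name", "value": "b.example.com"},
        {"type": "ipv4-addr", "value": "1.2.3.4"},
    ],
}

option_3_infrastructure = {
    "type": "infrastructure",
    "observable_details": [{"type": "ipv4-addr", "value": "1.2.3.4"}],
}


def observable_types_in(target):
    """Return every observable type a relationship to `target` would drag along."""
    found = set()
    for field in ("samples", "observable_details"):
        for item in target.get(field, []):
            found.add(item["type"])
    return found
```

A relationship pointing at the option 1 object implicitly covers the file and both domains as well as the IP, whereas the option 3 Infrastructure object contains only the IP.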

Jeffrey Mates, Civ DC3/DCCI

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Computer Scientist

Defense Cyber Crime Institute

jeffrey.mates@dc3.mil

410-694-4335

From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Terry MacDonald
Sent: Tuesday, October 3, 2017 3:22 PM
To: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Cc: CTI TC Discussion List <cti@lists.oasis-open.org>; Bret Jordan (CS) <Bret_Jordan@symantec.com>; Andras Iklody <andras.iklody@circl.lu>; Sean Barnum <sean.barnum@fireeye.com>
Subject: [Non-DoD Source] Re: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects

I would agree with Jason here. By tying an identifier to each observable, we effectively make each observable unique. A consumer doesn't want unique observables, but instead wants to use the observables as pivot points that they can use to relate other parts of Intel to.

For example, if org A sends out a malware object describing that a domain name is used by malware, and org B sends out a domain name of a C2 server as being part of an incident, then a consumer TIP will extract the domain name and use that domain name as a node to relate the malware and incident together.

If we then add an additional layer of object above that (observables as SDOs), then what do we actually gain? The TIP has identified a relationship between the malware and incident objects in my example, so what is the benefit of sending observables as SDOs? Versioning? But what use is versioning to a domain name? It's just a domain name. It is what it is. Revoking? Why would I revoke a thing that is a fact? I would revoke an ObservedData object because it is an assertion - but an observable is just a piece of data.

AFAICT adding Observables as top level objects just creates more objects and more work for the end customer's TIP with no noticeable gain in functionality.

Cheers

Terry MacDonald

Cosive

On 4/10/2017 07:11, "Jason Keirstead" <Jason.Keirstead@ca.ibm.com> wrote:

The problem is you are confusing the notion of an observable piece of data, and an observation of that data.

Anyone who is storing the contents of observed_data as nodes in a graph, is not properly modeling the data because they are treating the observable as the observation, when they shouldn't be.

Having UUIDs for every single piece of content in observed_data would serve no purpose at all. The IP address "8.8.8.8" is already unique. It therefore needs no UUID because it is a key in and of itself. This is the big problem I always had with the concept, the idea that every single number, string, IP, host, and URL in existence turns into its own SDO is beyond ridiculous to me.
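A minimal Python sketch of this value-as-key idea, under assumptions: the report identifiers and the `build_pivot_index` helper are hypothetical, and the index simply uses the (type, value) pair as the pivot key instead of a UUID.

```python
from collections import defaultdict


def build_pivot_index(reports):
    """Index report ids by the (type, value) pair of each observable they mention."""
    index = defaultdict(set)
    for report_id, observables in reports.items():
        for obs in observables:
            index[(obs["type"], obs["value"])].add(report_id)
    return index


# Hypothetical content from two producers that mention the same domain name.
reports = {
    "malware--A": [{"type": "domain-name", "value": "example.com"}],
    "incident--B": [
        {"type": "domain-name", "value": "example.com"},
        {"type": "ipv4-addr", "value": "8.8.8.8"},
    ],
}
index = build_pivot_index(reports)
```

Here the malware and the incident meet at the key `("domain-name", "example.com")` without either producer having assigned the domain an identifier.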

-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security

Without data, all you are is just another person with an opinion - Unknown

From: Bret Jordan <Bret_Jordan@symantec.com>

To: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Cc: Andras Iklody <andras.iklody@circl.lu>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>, Sean Barnum <sean.barnum@FireEye.com>
Date: 10/03/2017 02:05 PM

Subject: Re: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects

Sent by: <cti@lists.oasis-open.org>

All of the implementations that I have seen so far treat these as first order citizens and link them in the graph directly. The way we have it now is a bit weird. We basically have a two layered graph that does not allow you to cross link things together. Further, we have two radically different ways of structuring the content: the STIX object way, with a type and ID field, and the Observable way, with a dictionary.

My other problem is with the complexity of the Observed Data object. If you question that, please look at the length of text we as editors had to write to try and explain it. It would be SO much easier if the cyber observables were just first order citizens.

Bret

From: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Sent: Tuesday, October 3, 2017 6:15:52 AM
To: Bret Jordan
Cc: Andras Iklody; cti@lists.oasis-open.org; Sean Barnum
Subject: Re: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects

I still can not see the value in having an immutable fact have a UUID. It makes no logical sense to me.

-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security

Without data, all you are is just another person with an opinion - Unknown

From: Bret Jordan <Bret_Jordan@symantec.com>
To: Sean Barnum <sean.barnum@FireEye.com>
Cc: Andras Iklody <andras.iklody@circl.lu>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 10/02/2017 08:08 PM
Subject: [cti] Re: [EXT] Re: [cti] type changing from "object" to "array" for cyber observable objects
Sent by: <cti@lists.oasis-open.org>

I was one of the ones that pushed against this. At the time I could not see the value of having observable objects be first order citizens. I admit that. But I am really beginning to question it. So much so, that I think we may have gotten it wrong.

Bret

Sent from my iPhone

On Sep 29, 2017, at 9:42 AM, Sean Barnum <sean.barnum@FireEye.com> wrote:

I will take this opportunity to restate our strong assertion that observables should stand on their own as full objects with UUID-based identifiers and all the other metadata of SDOs.
This opinion was very strongly held by many at the time of the debate but was overruled by a majority of others.
The decision to fold observables into the arglebargle and to get rid of CybOX thus losing their context independence was what led players in the digital forensic community to leave the CTI TC and begin work on CASE/UCO separately as they did not believe it possible to support the needs of cyber investigation without observables as independent objects.
We at FireEye understand that this is the way that voting memberships work and we accepted the decision and have continued to work within the CTI TC to make the most we can of the situation.
This acceptance does not mean we agree with the decision then or now, only that we accept it as the consensus will of the TC members who voted at the time.
FireEye’s own model that integrates across CTI, DFIR, security operations, vulnerability management, malware analysis, threat detection, threat prevention, orchestration, etc. treats observables as full objects, as we believe that it is absolutely necessary to do so for many reasons, some obvious and some less obvious. Our desire to support STIX for partners/customers who request it means that conversion from our model to STIX will require extensive custom extensions and will also likely be lossy and/or inefficient for real world iterative sharing due to observable objects not being full objects.
We believe that eventually the CTI TC will recognize the need for observables to be full objects but we have carefully avoided any attempts to press the issue prematurely and cause unnecessary drama.
I hope that this message does not cause unnecessary drama but figured this was a good time to simply restate our position given the comments from Cheolho and Andras combined with several recent Slack comments from Bret questioning whether we should reconsider our decision regarding observables as full objects.

Sean Barnum
Principal Architect
FireEye
M: 703.473.8262
E: sean.barnum@fireeye.com

On 9/29/17, 4:06 AM, "cti@lists.oasis-open.org on behalf of Andras Iklody" <cti@lists.oasis-open.org on behalf of andras.iklody@circl.lu> wrote:

Is the reasoning behind it explained anywhere? Whoever we've discussed
STIX 2.x so far with had their faces buried deeply in their palms
whenever they got to the part of the documentation that explained this
very concept.

Also, revising bad decisions, even if they were reached via consensus /
a previous debate can be healthy for a standard. Especially when the
only explanation we get each time we ask about this is "as thus has been
decideth" without any reasoning given.

Best regards,
Andras

On 29. sep. 2017 09:53, Trey Darley wrote:
> On 29.09.2017 09:43:26, Andras Iklody wrote:
> 100% agreed! {"0":{}, "1":{}} is just ridiculous.

All -

Referring to STIX 2.0, Part 3, §2.5 "Observable Objects":

"Each key in the dictionary SHOULD be a non-negative monotonically
increasing integer, incrementing by 1 from a starting value of 0, and
represented as a string within the JSON MTI serialization. However,
implementers MAY elect to use an alternate key format if necessary."

As anyone participating in standards development work knows,
compromises are often necessary. The choice to standardize on a
monotonically increasing integer was a compromise following a lengthy
debate. Note, however, that this is a SHOULD. You're free to use
whatever you like as a key provided it's a valid JSON string.
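For readers who have not seen the dictionary in question, here is a small Python sketch of what the quoted text describes. The two checking helpers are hypothetical and written only for this illustration; the example uses `resolves_to_refs` on a domain name to show how one entry refers to another by its dictionary key.

```python
# Keys following the SHOULD: "0", "1", ... as strings.
objects = {
    "0": {"type": "domain-name", "value": "example.com", "resolves_to_refs": ["1"]},
    "1": {"type": "ipv4-addr", "value": "1.2.3.4"},
}

# An alternate key format, which the quoted text allows via MAY.
alternate = {
    "domain": {"type": "domain-name", "value": "example.com", "resolves_to_refs": ["ip"]},
    "ip": {"type": "ipv4-addr", "value": "1.2.3.4"},
}


def keys_follow_recommendation(objs):
    """True if the keys are exactly the strings "0".."n-1" with no gaps."""
    return set(objs) == {str(i) for i in range(len(objs))}


def refs_resolve_locally(objs):
    """True if every resolves_to_refs entry names a key in the same dictionary."""
    for obj in objs.values():
        for ref in obj.get("resolves_to_refs", []):
            if ref not in objs:
                return False
    return True
```

Either way, the keys only have meaning inside the one dictionary that contains them.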
