RE: [cti-users] Indicator Type / Vocabulary Implementation Questions

Yes, great post. (Apologies, catching up!)

Terry MacDonald

Senior STIX Subject Matter Expert

SOLTRA | An FS-ISAC and DTCC Company

From: cti-users@lists.oasis-open.org [mailto:cti-users@lists.oasis-open.org] On Behalf Of Barnum, Sean D.
Sent: Saturday, 24 October 2015 2:41 AM
To: Cory Casanave <cory-c@modeldriven.com>; Patrick Maroney <Pmaroney@Specere.org>; jason.keirstead@ca.ibm.com; Grobauer, Bernd <bernd.grobauer@siemens.com>; Wunder, John A. <jwunder@mitre.org>; cliff.palmer@gd-ms.com
Cc: cti-users@lists.oasis-open.org
Subject: Re: [cti-users] Indicator Type / Vocabulary Implementation Questions

Great post.

Thanks Cory.

sean

From: <cti-users@lists.oasis-open.org> on behalf of Cory Casanave <cory-c@modeldriven.com>
Date: Friday, October 23, 2015 at 11:29 AM
To: Patrick Maroney <Pmaroney@Specere.org>, "jason.keirstead@ca.ibm.com" <jason.keirstead@ca.ibm.com>, Bernd Grobauer <bernd.grobauer@siemens.com>, John Wunder <jwunder@mitre.org>, "cliff.palmer@gd-ms.com" <cliff.palmer@gd-ms.com>
Cc: "cti-users@lists.oasis-open.org" <cti-users@lists.oasis-open.org>
Subject: RE: [cti-users] Indicator Type / Vocabulary Implementation Questions

Patrick,

Great perspective – it is a common and difficult problem to balance scope, complexity, extensibility and simplicity. The inconvenient truth is that Cyber security is not simple, but we don’t want to introduce arbitrary complexity either.

Well, I’m probably putting a target on my back, but I’m going to suggest that some things that may seem simple do, in fact, introduce more complexity and restriction as reality sets in. (Remember when XML was the simple alternative? Then we got XML Schema and 100 extensions. The same goes for Java.)

I have noticed a set of recurring themes that, if we could deal with them in a consistent way, may help with this difficult balance. So just consider these as some thoughts from the peanut gallery.

· Big Vs. complicated.

A very small and simple instance document may be described by a large schema. A large schema in and of itself is not necessarily complex. What is complex is:

o When you have to insert a lot of “stuff” the simple case doesn’t need. Some of this may be inevitable, but it can be mitigated by good design.

o When there are many ways to say the same thing. This is unclear semantics and/or redundant elements.

o When it is not clear how to do the simple thing. This can be mitigated with use case specific documentation (like the “idioms”)

o When large schema/vocabularies/models are not modular so you can look at manageable “chunks”.

· String encoding.

A very common “simplification” is to substitute a reference to a thing with a name or ID, usually a string, e.g. victim: “Joe Smith”. While this seems simple, it causes multiple problems that result in downstream complexity. A reference to a thing should always carry a type for that thing, e.g. victim: -> person: {name: “Joe Smith”}. Reasons for this are:

o What is in the “string” is unclear and tends to be inconsistent, making interoperability difficult.

o When other “facts” need to be said about the thing, you have a place to put them and do not have to encode them in complex strings.

o If the string is an ID, there is no consistent basis for the identifier.
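
A minimal sketch of the typed-reference idea above, assuming Python dataclasses. The class and field names (Person, Incident, email, roles) are illustrative only and are not taken from STIX or CybOX.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical typed entity; field names are illustrative only."""
    name: str
    # Additional facts get their own fields instead of being packed into the name.
    email: Optional[str] = None
    roles: List[str] = field(default_factory=list)


@dataclass
class Incident:
    """Hypothetical container that refers to a victim by type, not by string."""
    title: str
    victim: Person


incident = Incident(title="Phishing campaign", victim=Person(name="Joe Smith"))

# Saying more about the victim later does not require re-encoding a string.
incident.victim.roles.append("Finance")
```

With victim: "Joe Smith" as a bare string, the role would have to be squeezed into the string itself or stored somewhere unrelated.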

· My hierarchy is not your hierarchy.

While it seems simple to put things in a hierarchy, hierarchies tend to be specific to a use case or perspective. Independent entities should be independent and referenced, not embedded as an attribute. The complexity introduced by some implementation frameworks for references can be a problem – but can be mitigated by a good framework/API. Problems with embedding include:

o The same entity may show up in multiple places, resulting in confusing redundant data.

o Embedded entities may lack identity, making analysis difficult.

o It is hard to “say more about” such an embedded entity or to reference it later.
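
A small sketch of referencing instead of embedding, assuming a simple in-memory registry keyed by identifier. The names Entity, Report, victim_ref and the "example:" identifier scheme are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    """Independent entity with its own identity (hypothetical shape)."""
    id: str
    name: str


@dataclass
class Report:
    """Holds a reference to an entity rather than an embedded copy."""
    title: str
    victim_ref: str


registry: Dict[str, Entity] = {}


def register(entity: Entity) -> str:
    """Store the entity once and hand back its identifier."""
    registry[entity.id] = entity
    return entity.id


joe_id = register(Entity(id="example:person-1", name="Joe Smith"))
reports: List[Report] = [
    Report(title="Phishing", victim_ref=joe_id),
    Report(title="Lateral movement", victim_ref=joe_id),
]

# Both reports resolve to the same object, so there is no redundant copy.
same_entity = registry[reports[0].victim_ref] is registry[reports[1].victim_ref]

# Because the entity has identity, analysis across reports is a simple lookup.
titles_for_joe = [r.title for r in reports if r.victim_ref == joe_id]
```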

· There is more to say

When you try to put everything someone may need to know in a single “data package”, the packages become large and very coupled to a single perspective of what must or can be communicated. Having an architecture that allows for accessing additional data, be it extended vocabularies or more information on Joe Smith, allows the structures to be smaller, simpler and less coupled. Building the expectation of linking into the architecture allows for this simplicity, which also provides for extensibility. This can also be done in “secure” or partially connected communities by building linking into security and using intermediate data caches.

o Note: My view is that this is solved: every reference should be a URI, how you “dereference” that URI is where your secure boundaries come in.
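
One way to picture “every reference is a URI, dereferencing is where the boundary sits” is the sketch below. It assumes a hypothetical Resolver class, a made-up host allow-list as the security boundary, and a stand-in fetch function in place of real network access; the example.org and example.net URIs are placeholders.

```python
from urllib.parse import urlparse


class Resolver:
    """Dereferences URIs inside an allowed boundary, with a local cache."""

    def __init__(self, allowed_hosts, fetch):
        self.allowed_hosts = set(allowed_hosts)
        self.fetch = fetch  # callable that actually retrieves the data
        self.cache = {}     # intermediate cache for partially connected use

    def dereference(self, uri):
        if uri in self.cache:
            return self.cache[uri]
        host = urlparse(uri).hostname
        if host not in self.allowed_hosts:
            # The reference itself stays valid; only resolving it is refused.
            raise PermissionError(f"{host} is outside the trusted boundary")
        value = self.fetch(uri)
        self.cache[uri] = value
        return value


# Stand-in for a real network lookup (assumption for the example).
STORE = {"https://intel.example.org/person/joe-smith": {"name": "Joe Smith"}}
calls = []


def fake_fetch(uri):
    calls.append(uri)
    return STORE[uri]


resolver = Resolver(allowed_hosts=["intel.example.org"], fetch=fake_fetch)
uri = "https://intel.example.org/person/joe-smith"
first = resolver.dereference(uri)
second = resolver.dereference(uri)  # served from the cache, no second fetch

try:
    resolver.dereference("https://outside.example.net/person/joe-smith")
    blocked = False
except PermissionError:
    blocked = True
```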

· Realistic extension

Some of our technologies fight extensibility, so ad-hoc mechanisms have to be built in to support it, and sometimes these mechanisms add complexity. On the other hand, assuming there will never be a need for extension is unrealistic in an open community. Sometimes it is better to:

o Use a technology that allows for extension naturally

o Be realistic about where extensibility is really needed. When it is – use the extensibility mechanisms for everything. E.g. all vocabularies are an “extension”, even if some are considered well known and curated.
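
A sketch of “all vocabularies are an extension”, assuming a hypothetical VocabularyRegistry in which the curated vocabulary and a community one are loaded through the same call. The vocabulary names and the community term are invented for illustration; the curated terms are the indicator types mentioned later in this thread.

```python
class VocabularyRegistry:
    """Treats every vocabulary, curated or not, as a registered extension."""

    def __init__(self):
        self._vocabs = {}

    def register(self, name, terms):
        self._vocabs[name] = frozenset(terms)

    def is_valid(self, name, term):
        # Unknown vocabularies simply have no valid terms.
        return term in self._vocabs.get(name, frozenset())


registry = VocabularyRegistry()

# The well-known vocabulary goes through the same mechanism as any other.
registry.register(
    "example:indicator-type",
    ["IP Watchlist", "Domain Watchlist", "C2", "Exfiltration", "Malicious Email"],
)

# A community-specific vocabulary (hypothetical) needs no special handling.
registry.register("example:sector-indicator-type", ["Wire Fraud Lure"])

curated_ok = registry.is_valid("example:indicator-type", "C2")
community_ok = registry.is_valid("example:sector-indicator-type", "Wire Fraud Lure")
unknown_ok = registry.is_valid("example:indicator-type", "Wire Fraud Lure")
```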

· It is obvious what it means

Much design and implementation experience comes from environments that are closed or smaller than CTI. If, for example, we are integrating 3 systems in company XYZ, there is a shared understanding and culture within the team. Meanings, constraints and formats tend to be worked out dynamically, and this is practical in such an environment. In a large and dynamic community it is simply amazing how differently people interpret the same things. In a community environment there is a need for clear and precise semantics and, wherever possible, automated validation. Loose specifications will never be interoperable. If not interoperable, what is the point?

· This is the only technology we will need

Whatever it is, there is a favorite of the day and a desire to “just do that”. Well, not everyone will have the same favorites, and the style of the day changes (sometimes it seems like we are the fashion industry). Over-committing to a single technology builds that technology’s limitations into your solution, which may look really stupid in 5 years. Separation of concerns is vital.

· Real complexity

As a final thought – recognize “real complexity” – if what you are trying to do and communicate is not simple, don’t expect a simple result. The challenge is recognizing and supporting the real complexity in as simple a way as possible. If complexity is being introduced for other than “real” reasons, what is introducing it?

This got longer than intended, sorry about that! If we can find a way to deal with these issues consistently, with good design and a supporting implementation framework that makes them easy to handle, we can have a usable balance between simplicity, scope and extensibility.

-Cory Casanave

From: cti-users@lists.oasis-open.org [mailto:cti-users@lists.oasis-open.org] On Behalf Of Patrick Maroney
Sent: Friday, October 23, 2015 9:03 AM
To: jason.keirstead@ca.ibm.com; Grobauer, Bernd; jwunder@mitre.org; cliff.palmer@gd-ms.com
Cc: cti-users@lists.oasis-open.org
Subject: RE: [cti-users] Indicator Type / Vocabulary Implementation Questions

There is a common theme running in our important discourse. It is important for Vendors and those focusing on narrow Use Cases to understand the complexity of APT Intrusions. We understand and *completely* support many of the arguments made for simplicity and easily addressing narrowly focused use cases. If I just need to send a list of IP addresses, domains, etc. to a security appliance, then an efficient/consistent/simplified mechanism is absolutely a key CTI requirement.

Those of us who have been dealing with the full scope of APT Targeting, Compromise, Lateral Movement, Entrenchment, repeated Mapping/Exploitation of Victim Organizations and Infrastructure, etc. for many many years (pre Titan Rain) are also key stakeholders in "Our Thing".

We are not pushing for "Complexity" just for complexity's sake: we are pushing these higher dimensional representation concepts because this complexity is part of the reality we operate in on a daily basis. If we can't model all of the key elements of Adversary, TTP, and Target Domains, we can't change the Cyber Battle Space dynamics. Doing so globally, across sectors, in real-time, is the "Holy Grail" we seek. No one said this would be easy, but it is a much much better use of our collective energies in comparison to another decade of playing APT Whack-A-Mole and counting Body Bags.

Hopefully I'm not triggering a new wave of "Less Filling" <==> "Tastes Great" 😁

On to specifics (great discussion by the way)

(1) IncidentType is critical to some of the highest value CTI Use Cases, including mandatory Incident Reporting (in some cases, required under law in 72 hours after detection!). Automating, standardizing/normalizing, and aligning Incident Reporting across Stakeholder

https://www.us-cert.gov/incident-notification-guidelines

http://www.law.cornell.edu/cfr/text/32/236.5

http://www.dtic.mil/whs/directives/corres/pdf/520513p.pdf

http://www.acq.osd.mil/se/docs/DFARS-guide.pdf

(2) C2 is one of the highest value CTI Elements one can convey.

"One of the most common forms of indicator seen describes a pattern for TCP traffic beaconing to a specific command and control (C2, C&C) server. This idiom describes creating such an indicator in STIX."

The presence of attempted or active C2 is one of the strongest indicators of "Wildfire": active compromise of an asset/network. If you see it, you are pwned; the only question is the degree.

C2 can be a component of all Kill Chain Phases (Lockheed Martin™), from "Recon" to "Actions on Intent".

We could consider the "one way to do things" principle, but given the importance of semantically and temporally characterizing C2 in the Operation Domain, we would need to ensure we can clearly convey it in all required contexts.

(3) Similar comments to C2 on "Exfil". Although we have a running joke in certain Domains ("...but no evidence of Exfiltration"), "Exfil" is the "Game Over" state in CTI Operational Domains.

(4) "Malicious.CybOXObject"

It would be great to start focusing on a number of long-standing enumerated type issues in CTI. In some cases they are non-sequitur with current realities, incomplete, bloated, etc. We should also be mindful of related standards and use these and/or map to their taxonomies. Some of the enumerations in other standards have similar issues with currency/relevance, but we should adopt these wherever possible and engage with those communities to fix them "in one place". (Jerome Athias has one of the best perspectives on these standards, where they intersect, conflict, etc.; Jerome would be a great resource to lead this effort.)

Patrick Maroney
President
Integrated Networking Technologies, Inc.
Desk: (856)983-0001
Cell: (609)841-5104
Email: pmaroney@specere.org

_____________________________
From: Grobauer, Bernd <bernd.grobauer@siemens.com>
Sent: Friday, October 23, 2015 6:50 AM
Subject: RE: [cti-users] Indicator Type / Vocabulary Implementation Questions
To: <jwunder@mitre.org>, <jason.keirstead@ca.ibm.com>, <cliff.palmer@gd-ms.com>
Cc: <cti-users@lists.oasis-open.org>

Hi,

> I heard a recent proposal to remove it entirely. What would be the
> impact of that?

I had made the suggestion to remove the IncidentType entirely in
my somewhat provocative mail a few weeks ago, in which I wanted
to explore how much potential for simplification in going towards
STIX 2.0 there might be.

Why had I suggested to remove it?

The main reason is that I do not find the values that are currently part of the
standard vocabulary particularly useful:

- Why would I put 'IP Watchlist' or 'Domain Watchlist' or 'File Hash Watchlist'
into the Indicator Type? I could understand "Watchlist", which tells you
to watch for whatever Observable Patterns are indicated in the indicator.

- Another type is 'C2' -- at the same time I have the ability to reference
in the indicator a kill chain phase ... and if the referenced kill chain
is of any use, it will have something corresponding to 'C2'.

Now I have (again) two ways of expressing the same thing ... we have
just stumbled over this issue a few days ago in a sharing group we
are part of: we use the reference to the killchain phase to indicate
C2-activity, others use the indicator type.

Similarly, "Exfiltration" -- should that not be described with a reference
from the indicator to a TTP "Exfiltration"?
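
The interoperability cost of having two ways to say "C2" can be sketched as
follows. This is only an illustration: the dictionary keys (indicator_types,
kill_chain_phases) and the phase label are assumptions and are not the STIX
schema. A consumer has to check both places to treat the two sharing
conventions alike.

```python
def signals_c2(indicator: dict) -> bool:
    """Return True if the indicator marks C2 via either convention."""
    # Convention A: C2 is stated in the indicator type.
    types = {t.lower() for t in indicator.get("indicator_types", [])}
    # Convention B: C2 is stated through a kill chain phase reference.
    phases = {p.lower() for p in indicator.get("kill_chain_phases", [])}
    return "c2" in types or "command and control" in phases


# Two producers describing the same activity in different ways.
by_type = {"title": "Beacon to 198.51.100.7", "indicator_types": ["C2"]}
by_phase = {
    "title": "Beacon to 198.51.100.7",
    "kill_chain_phases": ["Command and Control"],
}
unrelated = {"title": "Phishing mail", "indicator_types": ["Malicious Email"]}

results = [signals_c2(by_type), signals_c2(by_phase), signals_c2(unrelated)]
```

Every consumer in a sharing group would need equivalent logic, which is the
kind of confusion a single agreed convention avoids.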

Other entries in the standard vocabulary ("Malicious Email", "Host Characteristics")
seem like there would be no end to the list of allowed vocabulary (think
"Malicious <enter CybOX object type here>" as pattern for generating vocabulary...)

My suggestion to get rid of the indicator type was really a bit of a calculated
provocation -- I have no trouble with keeping it in STIX. But we should
ensure that the standard vocabulary is defined such that it really adds
value rather than adding confusion by allowing yet more ways to describe
the same thing in different ways.

Kind regards,

Bernd

----------------

Bernd Grobauer, Siemens CERT
