So we are getting some good thoughts coming in now, but we still seem to be going in circles on what direction we are heading next. Aharon - Do we have a 'next step' identified? Is there any interest in an in-person meeting?
-----Original Message-----
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Grobauer, Bernd
Sent: Wednesday, September 09, 2015 7:49 AM
To: cti@lists.oasis-open.org; cti-users@lists.oasis-open.org
Subject: RE: [cti] Thoughts on STIX and some of the other threads on this list
Hi,
this is a bit late, but there were several requests for broad feedback on the major issue of what the future should look like, so here goes ...
First, for the impatient reader, the contents of what comes below in a nutshell:
I) If we change the binding from XML to something else
for the next major release, we need to be sure that
the effort of doing so neither slows us
down too much nor distracts too much attention from
what really are the major problems of dealing
with STIX/CybOX (the fact that we use XML
is *not* the issue that will decide between success
or failure). And currently, I must say,
I am not sure that we can take the time to
solve the major issues *and* change the
binding.
II) I agree with the list of problem points that
Aharon has sent around some days ago: these are
things we need to solve *first*.
III) To Aharon's list, I would add the problem of
embedded relationship info rather than describing
relations between entities/objects separately --
that is a major design flaw of STIX/CybOX.
IV) 80% (or say 60% or whatever -- at least a substantial percentage)
of STIX/CybOX has never been used so far. We should consider
simplifying drastically.
V) Another source of complexity: CybOX tries to be all-encompassing,
including the expression of what are essentially certain types of
signatures (that is where the logical operations come in) as well
as the description of all kinds of observable stuff, going even
into rather detailed forensic information. In contrast to this, the
#1 use case most people are after is the sharing of rather simple
basic indicators ("observable patterns", "things to look
for").
So here is a bit of heresy: Maybe we should consider a
STIX-without-CybOX variant in which basic indicators can be
expressed in a very simple key/value-list-kind-of-way (where
mappings to default-representations into CybOX make sure that we
have well-defined semantics and the link to CybOX is preserved)
without logical operators.
Now in detail:
I) Regarding the big issue of XML vs. JSON vs. "something else":
I do not think that XML vs JSON will be decisive regarding the question whether STIX/CybOX will be relevant in future: there are advantages and disadvantages to both.
What, however, will be decisive, is whether STIX/CybOX really are helpful in expressing and consuming useful cyber-threat intelligence.
As others have already said on the mailing list: the main issues that currently make STIX and CybOX hard to deal with have *nothing* to do with the fact that we currently express things in XML rather than JSON.
I think the main question going forward is: what should we spend our (limited) resources on when moving towards the next major release?
What I am worried about is that switching the representation from XML to something else will lead to significant delay and will draw focus away from the really pressing issues and toward the format issue.
Maybe I am overestimating the required effort of switching the binding, but let us not be fooled by the idea that having a piece of code that produces/consumes, say, JSON, constitutes a language definition. The problem of defining a JSON binding is definitely harder than "just take XYZ's current JSON implementation (with XYZ being Mitre, Intelworks, Bluecoat or whatever) and be done with it." -- especially since none of the existing JSON implementations comes even close to complete coverage of either STIX or CybOX.
Right now, the tone on the mailing list suggests that it is a given that the upcoming major releases of STIX and CybOX will be based on a non-XML binding. Is that really so?
II) Now, regarding the problems we should really focus on.
Aharon has made the following points:
1) Complex logical operations
2) Heavily nested objects
3) Object Versioning
4) Relationships that go 50 levels deep backwards and forwards.
5) Making it easy just to share a single evil URL with someone. Reduce verbosity ?
6) XPATH in the Marking Structure. Or the marking object in general.
7) Multiple ways to say the same thing
8) Almost every field being optional
I agree with these points. Let me add to and expand on them below.
III) Relationship information embedded in STIX entity / CybOX Objects
This, I feel, is the number one design flaw of STIX/CybOX in more than one
way:
- there is no way to communicate a relationship between two things
without (re)defining at least one of the things (namely the 'thing'
into which the relationship info has to be embedded)
- not all relationships that one might want to express are supported
- why do I have to go through a campaign entity in order to
associate an indicator with a threat actor?
- why do I have to go through an incident to associate an incident
with a threat actor?
- embedding of relationships leads to more complicated entity definitions
and expressions (cf. Aharon's 2nd item)
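
To make the alternative concrete, here is a minimal Python sketch of relationships described separately from the entities they connect. It is not taken from any STIX/CybOX release: the `Entity` and `Relationship` classes, the `related` helper, the ids and the "attributed-to" label are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str
    kind: str   # e.g. "indicator", "threat-actor" (illustrative labels)
    title: str


@dataclass(frozen=True)
class Relationship:
    source_id: str
    target_id: str
    kind: str   # free-form label here; a real spec would constrain it


def related(relationships, entity_id):
    """Return (kind, other_id) pairs for every relationship touching entity_id."""
    out = []
    for r in relationships:
        if r.source_id == entity_id:
            out.append((r.kind, r.target_id))
        elif r.target_id == entity_id:
            out.append((r.kind, r.source_id))
    return out


indicator = Entity("indicator-1", "indicator", "Evil URL")
actor = Entity("actor-1", "threat-actor", "Some actor")

# The link is an object of its own: neither entity has to be redefined to
# state it, and no intermediate campaign entity is needed.
links = [Relationship(indicator.id, actor.id, "attributed-to")]

print(related(links, actor.id))
```

With this layout, a relationship can be sent on its own, referring to entities only by id, and a new relationship type is a new label rather than a new embedded field on an entity definition.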
IV) High complexity, of which 80% (a guess) has never been used so far
If you look at python-stix/cybox, certainly the most comprehensive of
all implementations for producing and consuming STIX/CybOX: there is still
stuff missing. Also, if you look at which CybOX objects, or rather which parts
of which CybOX objects, are supported by the existing STIX/CybOX-based
systems, it is hard not to reach the following conclusion:
We take a sledgehammer to crack a nut (or, as we say in Germany: we are building
cannons to shoot at sparrows).
So we have a standard for which there is no system able to either produce
or ingest (and make sense of) even close to 100% of the standard.
That is a problem, because
- the unused 80% (or 50% or whatever) add complexity at all stages of
dealing with the standard (defining it, tooling for it, ...)
- the perceived benefit that we are future-proof in the sense
that pretty much everything can be expressed, is not really much
of a benefit: what use is it to be able to express something
that nobody is able to process?
We try to solve part of the problem with profiles that describe how certain
use-cases are to be encoded ... but if we find that those profiles
use a 20% subset of the standard, maybe that tells us something?
V) Once more complexity: CybOX: Simple Indicators vs. Signatures vs. Observables/Forensics Information
The way CybOX is currently used in CTI exchange is, again, taking
a sledgehammer to a nut:
- The indicators most of us are currently able to communicate and
process are rather simple: a hash value, a URI, a domain name,
an email address.
So what usually happens is that the simplest of indicators are wrapped into
a CybOX object, only to be unwrapped by the receiver and stuck into
one of the six buckets of information he is able to deal with.
That is fine, I guess, until the producer starts adding information
to CybOX objects that the receiver's "unwrapping" code
will ignore ... it may then take the receiver some time to realize
that his automated processes are discarding information. Or the
importer/unwrapper may even break, interpret things wrongly, ...
Now take a look at what, e.g., MISP does: an indicator is
basically a key-value pair, where the key describes the kind of indicator
and the value the indicator itself:
- the problem of inadvertently missing information does not occur: either
I know how to deal with a certain indicator type or I do not
- adding new indicator types takes as little as adding a new key rather
than defining a whole new object type.
There are, of course, also drawbacks to the MISP way of doing things,
but currently, MISP is a lot closer to what is current practice in
sharing technical indicators than CybOX.
- Aharon mentioned the complex logical operations that are troublesome.
Their genesis, that is at least my understanding, lies in the
fact that STIX/CybOX owe a lot to OpenIOC.
However: OpenIOC at its heart is a language for expressing
signatures/patterns for a certain line of products and geared towards
the capabilities of these products. If CybOX/STIX had started out, e.g.,
from a line of thinking closer to a different product line, CybOX/STIX
might look quite different. Why do we have logical operators, but
not, say, temporal operators ("first this, then two times that, and
then finally again this, all within 5 seconds"), as we have
in SIEMs or network monitoring?
Do we need/want CybOX/STIX to be an all-encompassing generic
signature/pattern language? Or is that maybe a case for
the current test-mechanism feature that allows the embedding
of SNORT, OpenIOC and what have you?
- Recently, Sean reminded us on the mailing list that CybOX
also has its uses in MAEC for malware expressions and in
the expression of forensics information. It is great that
CybOX is so powerful and versatile ... but most of its
power seems to be lost or even counterproductive when it
comes to getting basic CTI exchange started.
Some time ago, Terry alerted us to the fine but important
distinction between observable pattern (what to look out for)
and observable instance (what has really been seen). Although
we have talked about use-cases of communicating observable
instances (I have seen this and that), the majority, I think,
is interested in exchanging stuff to look out for.
I may be committing heresy now, but let us think the unthinkable for
a moment: How about a profile of STIX that allows communication of
basic indicators (observable patterns) in a way that is closer to
MISP's key-value pairs (with a well-defined mapping into CybOX
proper), leaving full CybOX to cases in which observable instances
(i.e., something that has been observed) are to be communicated?
A mapping from such a simplified expression into a standard
CybOX representation would then provide precise semantics and
retain the link to CybOX-proper.
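
As a rough illustration of such a profile, the following Python sketch expands key/value indicators into a nested, CybOX-like structure. Everything in it is an assumption made for the example: the keys ("url", "domain", "email-src", "md5"), the field names in the expanded dictionaries, and the helper names are hypothetical and are not validated against the CybOX schemas or against MISP's actual attribute types.

```python
# Hypothetical indicator keys and the CybOX-like shape each one expands to.
# The object-type and field names are illustrative only.
MAPPING = {
    "url": lambda v: {"object_type": "URI", "type": "URL", "value": v},
    "domain": lambda v: {"object_type": "DomainName", "value": v},
    "email-src": lambda v: {
        "object_type": "Address",
        "category": "e-mail",
        "address_value": v,
    },
    "md5": lambda v: {
        "object_type": "File",
        "hashes": [{"type": "MD5", "value": v.lower()}],
    },
}


class UnknownIndicatorType(ValueError):
    """Raised for a key that has no defined mapping."""


def expand(key, value):
    """Expand one key/value indicator into its default nested representation."""
    try:
        build = MAPPING[key]
    except KeyError:
        raise UnknownIndicatorType(key) from None
    return build(value)


def expand_all(pairs):
    """Split pairs into expanded objects and explicitly rejected pairs."""
    expanded, rejected = [], []
    for key, value in pairs:
        try:
            expanded.append(expand(key, value))
        except UnknownIndicatorType:
            rejected.append((key, value))
    return expanded, rejected


expanded, rejected = expand_all([
    ("url", "http://evil.example/x"),
    ("mutex", "abc"),  # no mapping defined: reported, not silently dropped
])
print(expanded)
print(rejected)
```

The point of the sketch is the failure mode: a receiver either knows an indicator key or rejects it visibly, while the mapping table is the single place where the semantics of each key are pinned to the full representation.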
Kind regards,
Bernd
----
Bernd Grobauer, Siemens CERT
---------------------------------------------------------------------
To unsubscribe from this mail list, you must leave the OASIS TC that generates this mail. Follow this link to all your TCs in OASIS at:
https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php