Hi Terry,
I missed this response earlier.
Thanks for the feedback. We can go through these when we talk.
-Marlon
From: Terry MacDonald [mailto:terry@soltra.com]
Sent: Monday, March 14, 2016 6:49 PM
To: Taylor, Marlon; Mates, Jeffrey CIV DC3/DCCI; 'Jordan, Bret'; Mark Davidson
Cc: Jason Keirstead; cti@lists.oasis-open.org; Taylor, Marlon
Subject: RE: [cti] RE: [Non-DoD Source] RE: [cti] RE: Versioning Background Docs
Hi Marlon,
I was originally pro-hashed IDs. They seem simple enough, but the restrictions they impose create a huge headache for maintaining relationships over the lifecycle of the objects. The community concluded, after extensive discussion, that hashed IDs cause more trouble than they solve.
Jason Keirstead earlier posted the comments I made in Slack describing why hashed IDs are sub-optimal and won’t work, but I’ll outline a slightly deeper discussion below…
Explicit Relationships versus Implicit Relationships
In STIX v1.x we had incremental updates (via implicit relationships) and major updates (via explicit relationships).
Incremental updates related the various versions of an object by the fact that the object ID stays the same – The object ID acts as the ‘key’ that identifies
the object, and all updates of that object keep the same object ID. Updated versions of the Object are tracked through the use of a separate version control field. The versions are related implicitly.
Major updates related the various versions of an object by explicitly creating a relationship between the old object and the new object. The object ID changes
between versions of objects, so the only way to show that the new object is an updated version of the old object is to create a brand new relationship object to join the old and new versions of the object together.
Hashed Object ID and the knock-on effects
Hashed object IDs mean that the Object ID is a hash computed from the contents of the object itself. This has a few really cool benefits:
· No-one can change the Object contents, as the recipient will be able to check the content against the object ID and will know if it’s been tampered with
· This enforces immutability
But it also has massive downsides:
· We force explicit relationships because we cannot perform incremental updates with hashed IDs
· We force every single TAXII server to always track every version of every object. (Implicit relationships don’t require this).
· Now all other relationships sent previously are pointing to the wrong version of the object. The relationships will need to all be republished by the relationship object creators, or we force all relationships to be transitive, and instead require every consumer implementation to always walk the hierarchy of version updates every time they wish to follow the list of explicit relationships.
· We now have to send at least one relationship object and the new object every single time we do an update
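To make the trade-off above concrete, here is a minimal sketch. It assumes, purely for illustration, that an object's ID is the SHA-256 digest of its canonical JSON; the helper name `hashed_id`, the field names, and the "updates" link type are hypothetical and not taken from any STIX draft.

```python
import hashlib
import json


def hashed_id(obj: dict) -> str:
    """Hypothetical scheme: ID = SHA-256 over the object's canonical JSON."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Field names are illustrative only.
actor_v1 = {"type": "threat-actor", "description": "Initial report"}
actor_v2 = {"type": "threat-actor", "description": "Updated report"}

id_v1 = hashed_id(actor_v1)
id_v2 = hashed_id(actor_v2)

# Benefit: a recipient can recompute the hash to detect tampering.
tamper_free = hashed_id(actor_v1) == id_v1

# Downside: any edit yields a new ID, so a relationship published
# against id_v1 does not reference the updated object.
relationship = {"source": "indicator-1", "target": id_v1}
points_at_latest = relationship["target"] == id_v2

# The producer therefore has to publish an explicit link between the
# old and new versions alongside every update.
version_link = {"type": "updates", "old": id_v1, "new": id_v2}
```

In this sketch `points_at_latest` is false after a single edit, which is the "relationships point to the wrong version" problem described in the list above.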
The alternative: Randomly created Object IDs and Incremental Updates
The implicit versioning scheme, where the object always keeps its Object ID and we update that one object, makes things a LOT simpler. Updates are done simply by the object creator publishing an object with the same Object ID as the previous version of the Object and increasing the version number in some way (we’re still discussing how). This has many benefits:
· It’s simple to understand (big plus for new users)
· All relationships stay valid when the object is updated
· TAXII Servers are free to only keep the last version of the object or to keep all the versions because the relationships remain valid in either case.
· There is no ‘walking the relationship hierarchy’ to find all the relationships.
· We are free to use whichever cryptographic solution we wish for signing the objects to authenticate them.
There is one downside as far as I can tell:
· Anyone can change the object contents (which we can stop by adding an HMAC when we do cryptography in a later version)
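A minimal sketch of the incremental scheme, for comparison. It assumes an integer `version` field (the actual mechanism is still under discussion, as noted above) and a random UUID as the object ID; the `store` and `publish` names are hypothetical.

```python
import uuid

# Hypothetical in-memory store keyed by object ID; keeps only the latest version.
store = {}


def publish(obj: dict) -> bool:
    """Accept obj if it is new or has a higher version than the stored copy."""
    current = store.get(obj["id"])
    if current is not None and obj["version"] <= current["version"]:
        return False  # stale or duplicate update, ignore it
    store[obj["id"]] = obj
    return True


actor_id = str(uuid.uuid4())  # random ID, assigned once and never changed
relationship = {"source": "indicator-1", "target": actor_id}

publish({"id": actor_id, "version": 1, "description": "Initial report"})
publish({"id": actor_id, "version": 2, "description": "Updated report"})
stale_accepted = publish({"id": actor_id, "version": 1, "description": "Old"})

# The relationship still resolves, and it resolves to the latest version.
resolved = store[relationship["target"]]
```

Here the relationship never needs to be republished, and the store can discard older versions without breaking it.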
My vote is for Incremental Versioning and Implicit relationships.
There should only be one way to version the one object. It should always keep the same ID during its lifecycle, and should have a version field (or some other field that changes per version) that tracks which version the object is.
We had two ways to version in STIX v1.x (i.e. major updates or incremental) and it didn't work.
Cheers
Terry MacDonald
Senior STIX Subject Matter Expert
SOLTRA | An FS-ISAC and DTCC Company
+61 (407) 203 206 |
terry@soltra.com
Sorry for any confusion. I do remember the TC talking about GUIDs and hashes, but I don't remember the use cases/scenarios/etc. which led the TC to go against hash-based IDs. Are they available for TC consideration in this discussion?
I'm unaware of any conceptual workflows/use cases that will break beyond repair due to this change. I think it's more important for the TC to be able to review any rationale that was used to make its decisions, so we as a whole can quickly reference and re-evaluate its stand on any decision.
Why are HASH IDs bad?
-Marlon
From: cti@lists.oasis-open.org on behalf of Mates, Jeffrey CIV DC3/DCCI
Sent: Monday, March 14, 2016 12:43:46 PM
To: 'Jordan, Bret'; Mark Davidson
Cc: Jason Keirstead; Taylor, Marlon; cti@lists.oasis-open.org; marlon.taylor@us-cert.gov
Subject: RE: [cti] RE: [Non-DoD Source] RE: [cti] RE: Versioning Background Docs
I certainly understand concerns about deterministic IDs breaking workflows and not working in a number of potential use cases. It might make sense to simply allow
IDs to follow the UUID v4 and UUID v5 specs. That way organizations that want to use deterministic IDs can, while those that don't have no need to. Ultimately because of how the UUID spec works out both will have the same length, and an outside observer
will only notice a single character change between the two.
From a parsing standpoint handling something like xxxxxxxx-xxxx-4xxx-xxxx-xxxxxxxxxxxx instead of xxxxxxxx-xxxx-5xxx-xxxx-xxxxxxxxxxxx is pretty trivial as both will accomplish the same thing.
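As a small illustration of the point above, Python's standard `uuid` module can produce both kinds of ID. The namespace and the name string below are made up for the example; a real deployment would have to agree on its own.

```python
import uuid

# Hypothetical namespace for the example only.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")

# Version 4: random.
random_id = uuid.uuid4()
# Version 5: name-based (SHA-1), so the same input always gives the same ID.
deterministic_id = uuid.uuid5(NAMESPACE, "threat-actor:example")

# Both have the same textual shape; the version digit is the first
# character of the third group.
print(random_id, random_id.version)
print(deterministic_id, deterministic_id.version)

# Version 5 is reproducible from the same namespace and name.
same_again = uuid.uuid5(NAMESPACE, "threat-actor:example")
```

A parser that accepts the generic 8-4-4-4-12 hex layout handles both without special cases.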
Jeffrey Mates, Civ DC3/DCCI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Computer Scientist
Defense Cyber Crime Institute
jeffrey.mates@dc3.mil
410-694-4335
-----Original Message-----
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Jordan, Bret
Sent: Monday, March 14, 2016 12:09 PM
To: Mark Davidson
Cc: Mates, Jeffrey CIV DC3/DCCI; Jason Keirstead; Taylor, Marlon; cti@lists.oasis-open.org; marlon.taylor@us-cert.gov
Subject: Re: [cti] RE: [Non-DoD Source] RE: [cti] RE: Versioning Background Docs
And for further clarification and to support Trey's statements. This TC talked about deterministic IDs at great length and it was decided that we would not go down that path. With Mark, I believe we have strong consensus to stick with the current ID patterns
we have. If this is not the case, then we will need to take this to a ballot. Things like IDs are fundamental and we need to figure these out before we do anything else. Thus the reason we had this discussion a few months ago.
Deterministic IDs may offer interesting use cases but also run the risk of breaking a lot of the workflows that we are now building.
Thanks,
Bret
Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."
On Mar 14, 2016, at 10:03, Mark Davidson <mdavidson@soltra.com> wrote:
Jeff,
Can you help me understand your perspective? In STIX 1.x, versioning was handled using the timestamp field (and would seem to align with your post, unless I’m mis-reading it) but I’m not sure I’ve seen any discussion about using timestamp for versioning
in 2.0. Are you proposing that we use timestamps for versioning in 2.0, or am I misunderstanding your comment?
Thank you.
-Mark
On 3/14/16, 11:52 AM, "Mates, Jeffrey CIV DC3/DCCI" <cti@lists.oasis-open.org on behalf of Jeffrey.Mates@dc3.mil> wrote:
My understanding is that in general versioning should be handled using the
CTI Core "created_at" attribute which exists on both objects and
relationships. If this changes any object with a deterministic hash would
also have its GUID change. As such different versions of an object would
respect each other's unique GUIDs thus protecting referential integrity.
Even without a deterministic hash this would still be possible by simply
generating a new GUID every time a new version of an object or relationship
is produced.
Jeffrey Mates, Civ DC3/DCCI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Computer Scientist
Defense Cyber Crime Institute
jeffrey.mates@dc3.mil
410-694-4335
-----Original Message-----
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Monday, March 14, 2016 11:27 AM
To: Taylor, Marlon
Cc: cti@lists.oasis-open.org; Mates, Jeffrey CIV DC3/DCCI; marlon.taylor@us-cert.gov
Subject: [Non-DoD Source] RE: [cti] RE: Versioning Background Docs
Are you saying that versions will only exist on relationship objects? How
will that help me figure out if a given threat actor's description is the
most recent?
-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com
Without data, all you are is just another person with an opinion - Unknown
From: "Taylor, Marlon" <Marlon.Taylor@hq.dhs.gov>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "Mates, Jeffrey CIV DC3/DCCI" <Jeffrey.Mates@dc3.mil>,
"cti@lists.oasis-open.org" <cti@lists.oasis-open.org>,
"marlon.taylor@us-cert.gov" <marlon.taylor@us-cert.gov>
Date: 03/14/2016 12:07 PM
Subject: RE: [cti] RE: Versioning Background Docs
________________________________
Correct. Hashing won't provide that capability.
Relationships will provide what you're looking for.
-Marlon
________________________________
From: Jason Keirstead
Sent: Monday, March 14, 2016 10:56:04 AM
To: Taylor, Marlon
Cc: Mates, Jeffrey CIV DC3/DCCI; cti@lists.oasis-open.org; marlon.taylor@us-cert.gov
Subject: RE: [cti] RE: Versioning Background Docs
I apologize for my confusion, but I don't really understand what is being
discussed in this thread.
Are people talking about IDs or Versions? What does hashing have to do with
versioning?
I (hope?) people are not advocating to simply hash the contents of the
object and use that as a version? That is not workable. A version has to be
continually incrementing. I need to be able to look at a version and know if
it is the latest version or if it is stale. You can't do that with hashes.
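A short sketch of why a digest alone cannot answer the "is this the latest?" question. The byte strings and the `version` counter are illustrative assumptions, not part of any proposal in this thread.

```python
import hashlib

digest_v1 = hashlib.sha256(b"description: initial").hexdigest()
digest_v2 = hashlib.sha256(b"description: updated").hexdigest()

# The digests differ, so a change is detectable...
changed = digest_v1 != digest_v2

# ...but nothing in the digests themselves says which was published first.
# A counter (or timestamp) carried with the object orders versions directly.
versions = [
    {"version": 2, "digest": digest_v2},
    {"version": 1, "digest": digest_v1},
]
latest = max(versions, key=lambda v: v["version"])
```

With only `digest_v1` and `digest_v2` in hand, a consumer can tell the two apart but needs the extra ordering field to pick the current one.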
-
Jason Keirstead
STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com
Without data, all you are is just another person with an opinion - Unknown
From: "Taylor, Marlon" <Marlon.Taylor@hq.dhs.gov>
To: "Mates, Jeffrey CIV DC3/DCCI" <Jeffrey.Mates@dc3.mil>,
"cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Cc: "marlon.taylor@us-cert.gov" <marlon.taylor@us-cert.gov>
Date: 03/14/2016 11:42 AM
Subject: RE: [cti] RE: Versioning Background Docs
Sent by: <cti@lists.oasis-open.org>
________________________________
Hi All,
Jeff and I spoke offline and we are in agreement with the hash-based
approach. Some takeaways:
- cleared up "shallowness" of shallow objects
- conveyed the idea of relationships which contain arrays of ids (he calls
them link aggregators)
As we finalize objects across the TC we can go into object-specific required
fields. Ex: should every Indicator have an observable?
Keep up the feedback.
-Marlon
________________________________