Subject: Re: [cti] The Adaptive Object-Model Architectural Style

Comments inline

From: <cti@lists.oasis-open.org> on behalf of "Jordan, Bret" <bret.jordan@bluecoat.com>
Date: Friday, November 13, 2015 at 1:17 PM
To: John Wunder <jwunder@mitre.org>
Cc: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] The Adaptive Object-Model Architectural Style

John this is really well said.  

I feel like we listened to every possible user requirement out there for STIX 1.0 and we tried to create a data-model that could solve every possible use case and corner case regardless of how small.  The one thing we sorely forgot to do is figure out what can developers actually implement in code or what are product managers willing to implement in code.  

[sean]This is a very inaccurate characterization of reality that attempts to drive people toward a particular conclusion based on misrepresentatively broad sweeping generalizations without any specific facts.
The development of STIX to date has been the result of a very diverse community of players representing a wide range of relevant perspectives. 
Even with this diversity there were countless use cases and specific details that were decided as out of scope for STIX. A truly full complete cyber threat information model to cover all the use cases that have been mentioned in the STIX community to date would be vastly more complex than what you see in STIX 1.X.
What did make it into STIX 1.x are capabilities that were specifically identified and asked for by stakeholders (including a wide range of vendors writing code) carrying out threat-informed cyber defense activities within various common contexts. These capabilities were based on things these stakeholders were doing currently in automated form but needed better consistency and interoperability, were doing currently in manual form but desired automated support, or were the challenge points in the activities they were doing currently that needed a solution to enable them to overcome those challenges and make leaps of progress. This included a significant amount of common high-level use cases needing to integrate threat information across separate tactical areas of activity (e.g. Integrating digital forensics, malware analysis and attribution analysis within incident response).
Oddball, corner-case contexts were not included and oddball, corner-case structures and fields were not included.
If there are specific things you feel should be out of scope then be specific and we can talk about them, understand why they were put there, understand who may want them and then decide as a group whether or not they should stay there. This is the only way that intelligent decisions can be reached.

Let's make STIX 2.0 something that meets 70-80% of the use cases and can actually be implemented in code by the majority of software development shops.  Yes, I am talking about a STIX Lite.  People can still use STIX 1.x if they want everything.  Over time we can add more and more features to the STIX 2.0 branch as software products that use CTI advance and users can do more and more with it.  

[sean]I do not agree that this is a rational or feasible approach. I think it is radical and not justified once we drive to specifics.
First, let me say that a large proportion of the work to develop the current model was not simply to add specific “features” but rather to ensure that the overall model architecture hung together well for all of the aggregate “features”. It is not as simple as it sounds to just have 2.0 be a small subset of STIX capabilities and then "add more and more features” to it going forward. STIX 1.x has gone through numerous architectural refactorings along its path of development. Some of these were due to things we learned along the way and still know but most of them were due to the need of the architecture to support the new “features” that were added along the way. Your proposed approach would be forcing these refactorings (typically breaking backward compatibility) to occur all over again unnecessarily when we have already fought our way through them. And in the end much of the “complexity” in the current model is there because it is necessary to hang the various “features” of threat information together and as such will end up in your proposed new architecture anyway. I am all for identifying and resolving specific areas where things may be unnecessarily complex but we should not falsely assume that all complexity is unnecessary.

Rather than take this ill-advised radical approach that ignores our progress to date from a community far more diverse and in-depth than that in play in these discussions currently, I would argue that a much more appropriate and effective approach would be to deal with specifics rather than generalities. Dealing in generalities leads to ambiguity and differing interpretations, leads to a lack of clarity on exactly what is being considered or decided, leads to a lack of clarity for stakeholders where specific issues are in play that require them to focus and speak their opinions, leads to decisions being made without all the relevant context being understood, etc. 
Being specific addresses all of these shortcomings.

We currently have a list of specific issues identified with the current model (abstract sightings, data marking application approach, separate relationships, require Ids, etc.) that make assertions of unnecessary complexity in a way that can support clear discussion, understanding, crafting of proposals, weighing of pros and cons and decisions to be made in a way that we all know what we would be gaining or losing and lets stakeholders know where they need to speak up and be champions for perspectives they feel are important.  If there are specific areas of complexity that you think are not covered by currently identified issues, please add issues for those.

I strongly propose that we behave like a formal standards body and deliberatively work our way through our specific identified issues to result in informed decisions and concrete well understood solutions.

Let's start with JSON + JSON Schema and go from there.  I would love to have to migrate to a binary solution or something that supports RDF in the future because we have SO MUCH demand and there is SO MUCH sharing that we really need to do something.

1) Let's not put the cart before the horse
2) Let's fail fast, and not ride the horse to the glue factory 
3) Let's start small and build massive adoption.  
4) Let's make things so easy for development shops to implement that there is no reason for them not to



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

On Nov 13, 2015, at 08:09, Wunder, John A. <jwunder@mitre.org> wrote:

So I’ve been waiting for a good time to outline this and I guess here is as good a place as any. I’m sure people will disagree, but I’m going to say it anyway :)

Personally I think of these things as four levels:

- User requirements
- Implementations
- Instantiation of the data model (XML, JSON, database schemas, an object model in code, etc)
- Data model

User requirements get supported in running software. Running software uses instantiations of the data model to work with data in support of those user requirements. The data model and specification define the instantiations of the data and describe how to work with them in a standard way.

The important bit here is that there’s always running software between the user and the data model. That software is (likely) a tool that a vendor or open source project supports that contains custom code to work specifically with threat intel. It might be a more generic tool like Palantir or whatever people do RDF stuff with these days. But there’s always something.

This has a couple implications:

- Not all user requirements get met in the data model. It’s perfectly valid to decide not to support something in the data model if we think it’s fine that implementations do it in many different ways. For example, de-duplication: do we need a standard approach or should we let tools decide how to do de-duplication themselves? It’s a user requirement, but that doesn’t mean we need to address it in the specs.

- Some user requirements need to be translated before they get to the data model. For example, versioning: users have lots of needs for versioning. Systems also have requirements for versioning. What we put in the specs needs to consider both of these.

- This is the important part: some user requirements are beyond what software can do today. I would love it if my iphone would get 8 days of battery life. I could write that into some specification. That doesn’t mean it’s going to happen. In CTI, we (rightfully) have our eyes towards this end state where you can do all sorts of awesome things with your threat intel, but just putting it in the data model doesn’t automatically make that happen. We’re still exploring this domain and software can only do so much. So if the people writing software are telling us that the user requirements are too advanced (for now), maybe that means we should hold off on putting it in the data model until it’s something that we can actually implement? In my mind this is where a lot of the complexity in STIX comes from: we identified user requirements to do all these awesome things and so we put them in the data model, but we never considered how or whether software could really implement them. The perfect example here is data markings: users wanted to mark things at the field level, most software isn’t ready for that yet, and so we end up with data markings that are effectively broken in STIX 1.2. This is why many standards bodies have requirements for running code: otherwise the temptation is too great to define specification requirements that are not implementable and you end up with a great spec that nobody will use.
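As a rough illustration of the data markings point above, here is a minimal Python sketch of why field-level marking asks more of software than object-level marking. The dictionary layout, the function names, and the ALLOWED set are hypothetical and are not the STIX 1.2 marking format; TLP labels are used only as familiar example values.

```python
# Hypothetical set of markings the recipient is cleared to receive.
ALLOWED = {"TLP:WHITE", "TLP:GREEN"}


def share_object_level(obj, allowed=ALLOWED):
    """Object-level marking: one decision covers the whole object."""
    return dict(obj["data"]) if obj["marking"] in allowed else None


def share_field_level(obj, allowed=ALLOWED):
    """Field-level marking: one decision per field.

    Fields without their own marking fall back to the object's marking.
    """
    shared = {}
    for name, value in obj["data"].items():
        marking = obj["field_markings"].get(name, obj["marking"])
        if marking in allowed:
            shared[name] = value
    return shared


# Hypothetical record: the object is TLP:GREEN, but one field is TLP:RED.
report = {
    "marking": "TLP:GREEN",
    "field_markings": {"victim": "TLP:RED"},
    "data": {"title": "Phishing campaign", "victim": "Example Corp"},
}

print(share_object_level(report))  # shares every field, including "victim"
print(share_field_level(report))   # withholds the "victim" field
```

A tool that only understands the object-level marking would share the victim name here, which is the kind of gap that appears when the data model is ahead of what software implements.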

Sorry for the long rant. Been waiting to get that off my chest for a while (as you can probably tell).


On Nov 13, 2015, at 9:17 AM, Jerome Athias <athiasjerome@GMAIL.COM> wrote:

sorry for the others if off-topic.

Remember that software is good only if it satisfies its users (meets,
or exceeds, their requirements).
You can write 'perfect/optimized' code, but if the users are not
satisfied, it's bad software.

"If you can't explain it simply, you don't understand it well
enough.", Albert Einstein

Challenges are exciting, but sometimes difficult. It's about
motivation and satisfaction.

No programming language is better than another (just like OSes);
you simply select the one that is best for your needs.

I did a conceptual map for the 'biggest Ruby project of the internet'
(Metasploit Framework), it's just a picture, but represents 100 pages
of documentation.
I think we could optimize (as with a maturity model) our approach to
resolving problems.

2015-11-13 17:02 GMT+03:00 John Anderson <janderson@soltra.com>:
The list returns my mail, so probably you'll be the only one to get my reply.

Funny, I missed that quote from the document. And it's spot on. As an architect myself, I have built several  "elegant" architectures, only to find that the guys who actually had to use it just. never. quite. got it. (sigh)

My best architectures have emerged when I've written test code first. ("Test-first" really does work.) I've learned that writing code--while applying KISS, DRY and YAGNI--saves me from entering the architecture stratosphere. That's why I ask the architects to express their creations in code, and not only in UML.

I'm pretty vocal about Python, because it's by far the simplest popular language out there today. But this principle applies in any language: If the implementation is hard to explain, it's a bad idea. (Another quote from the Zen of Python.) Our standard has a lot that's hard to explain, esp. to newcomers. How can we simplify, so that it's almost a no-brainer to adopt?

Again, thanks for the article, and the conversation. I really do appreciate your point-of-view,

From: Jerome Athias <athiasjerome@gmail.com>
Sent: Friday, November 13, 2015 8:45 AM
To: John Anderson
Cc: cti@lists.oasis-open.org
Subject: Re: [cti] The Adaptive Object-Model Architectural Style

Thanks for the feedback.
Kindly note that I'm not strongly defending this approach for the CTI
TC (at least for now).
Since you're using quotes:
"Architects that develop these types of systems are usually very proud
of them and claim that they are some of the best systems they have
ever developed. However, developers that have to use, extend or
maintain them, usually complain that they are hard to understand and
are not convinced that they are as great as the architect claims."

I hope this helps our developers understand that what sometimes
feels difficult is not difficult by design. It is difficult because
we are dealing with a complex domain, one where abstraction,
conceptual approaches, and ontologies have benefits.

Hopefully we can obtain consensus on a good balanced adapted approach.

2015-11-13 16:24 GMT+03:00 John Anderson <janderson@soltra.com>:
Thanks for the link. I really enjoy those kinds of research papers.

On Page 20, the section "Maintaining the Model" [1] states pretty clearly that this type of architecture is very unwieldy, from an end-user perspective; consequently, it requires a ton of tooling development.

The advantage of such a model is that it's extensible and easily changed. But I'm not convinced that extensibility is really our friend. In my (greatly limited) experience, the extensibility of STIX and CybOX has made them that much harder to use and understand. I'm left wishing for "one obvious way to do things." [2]

If I were given the choice between (1) a very simple data model that's not extensible, but clear and easy to approach and (2) a generic, extensible data model whose extra layers of indirection make it hard to find the actual data, I'd gladly choose the first.
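
To make that contrast concrete, here is a minimal Python sketch of the two options. All class and field names are hypothetical; they are not taken from STIX, CybOX, or the paper, and the generic model is only a loose approximation of the adaptive object-model idea.

```python
from dataclasses import dataclass, field


# (1) Simple, non-extensible model: the data is exactly where you expect it.
@dataclass
class SimpleIndicator:
    title: str
    ip_address: str


# (2) Generic, extensible model: types and properties are themselves data,
# so reading a value means going through extra layers of indirection.
@dataclass
class EntityType:
    name: str
    property_names: set = field(default_factory=set)

    def add_property(self, name):
        self.property_names.add(name)


@dataclass
class Entity:
    entity_type: EntityType
    values: dict = field(default_factory=dict)

    def set_value(self, name, value):
        # The "schema" is checked at run time against the type description.
        if name not in self.entity_type.property_names:
            raise KeyError(f"{self.entity_type.name} has no property {name!r}")
        self.values[name] = value

    def get_value(self, name):
        return self.values[name]


simple = SimpleIndicator(title="Bad host", ip_address="192.0.2.1")

indicator_type = EntityType("Indicator")
indicator_type.add_property("title")
indicator_type.add_property("ip_address")
generic = Entity(indicator_type)
generic.set_value("title", "Bad host")
generic.set_value("ip_address", "192.0.2.1")

print(simple.ip_address)                # direct attribute access
print(generic.get_value("ip_address"))  # lookup through the type layer
```

The second form can grow new types and properties without code changes, but a newcomer has to learn the type layer before finding the actual data.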

Keeping it simple,

[1] The full wording from "Maintaining the Model":
The observation model is able to store all the metadata using a well-established
mapping to relational databases, but it was not straightforward
for a developer or analyst to put this data into the database. They would
have to learn how the objects were saved in the database as well as the
proper semantics for describing the business rules. A common solution to
this is to develop editors and programming tools to assist users with using
these black-box components [18]. This is part of the evolutionary process of
Adaptive Object-Models as they are in a sense, “Black-Box” frameworks,
and as they mature, they need editors and other support tools to aid in
describing and maintaining the business rules.

[2] From "The Zen of Python": https://www.python.org/dev/peps/pep-0020/

From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Jerome Athias <athiasjerome@gmail.com>
Sent: Friday, November 13, 2015 5:20 AM
To: cti@lists.oasis-open.org
Subject: [cti] The Adaptive Object-Model Architectural Style


Realizing that the community members have different backgrounds,
experience, expectations and uses of CTI in general, from a high-level
(abstracted/conceptual/ontology oriented) point of view, through a
day-to-day use (experienced) point of view, to a technical
(implementation/code) point of view...
I found this diagram (and document) interesting, easy to read, and
potentially applicable to our current effort.
So I just wanted to share.


