Towards joyful coding

cti-users message

Subject: Towards joyful coding

From: John Anderson <janderson@soltra.com>

To: "cti-users@lists.oasis-open.org" <cti-users@lists.oasis-open.org>

Date: Fri, 16 Oct 2015 14:37:22 +0000

Happy Friday! Here's something to ponder over the weekend. (This took me a while to write, so I hope you'll take some time to ponder, too, before responding. Thanks!)

Part 1. STIX/CybOX XML

I've been using STIX/CybOX for a couple of months now. It has often been a struggle to locate the correct elements for expressing ideas. The STIX website has been very helpful, but sometimes I cannot find the relevant XSD or Python-STIX/CybOX code, even with liberal use of "grep". And sometimes I find something that seems relevant, but its location in the data model would make it awkward to use, like AttackerToolType (I still can't figure out where it plugs in).

I understand how XML promises extensibility. And because of our desire to embrace and use other standards, rather than recreating them, we have a construct like this:

<stix:Indicator xsi:type='indicator:IndicatorType'>

It seems odd to have to define the type of the element when it's already an "Indicator" element. That feels like having to write HTML like this:

Thankfully, the Python-STIX library handles this transparently, so all I have to code is:

indicator = Indicator()

(Which raises a side question about how the Python-STIX library would support Indicators with alternative xsi:type values. But let's not go down that rabbit trail right now, thanks.)

Part 2. Python-STIX and Python-CybOX

And that brings us to the Python libraries. I am grateful to MITRE for producing these libraries, because they are easier than writing raw XML. Thank you, MITRE guys!

However, as a long-time Pythonista, I find myself struggling too much with the libraries. In large part, this is because the libraries are greatly influenced by the XML data model; when/if the XSDs get simpler, the libraries will also get simpler. As a simple example, this code is a direct result of the XML complexity:

from stix.indicator import Indicator
from stix.data_marking import Marking, MarkingSpecification
indicator = Indicator()
indicator.handling = Marking()
indicator.handling.add_marking(MarkingSpecification())

It gets more complicated if you want to mark specific elements, because then you're coding XPaths in your Python.

There are other factors that affect usability, but the one I have to work around the most is the namespace/alias handling. I discovered that namespaces are set on a global level in both Python-STIX and Python-CybOX, and in earlier versions setting one did not necessarily set the other. (Thankfully, this is being corrected with the mixbox in later versions.) In an otherwise object-oriented library, having a global setting like this is quite jarring. (And not thread safe.)
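To make the concern concrete, here is a minimal sketch contrasting a module-level namespace setting with a per-object one. None of these names come from Python-STIX, Python-CybOX, or mixbox; set_global_namespace, make_id_global, and Producer are hypothetical stand-ins used only to illustrate the shape of the problem.

```python
# Hypothetical stand-in for a library-wide namespace setting (NOT the real
# Python-STIX API): a single module-level value shared by every caller.
_GLOBAL_NAMESPACE = {"alias": "example"}


def set_global_namespace(alias):
    """Change the alias for every caller in the process."""
    _GLOBAL_NAMESPACE["alias"] = alias


def make_id_global(name):
    """Build an ID from whatever alias happens to be set right now."""
    return "{}:{}".format(_GLOBAL_NAMESPACE["alias"], name)


class Producer:
    """Hypothetical alternative: each producer carries its own alias."""

    def __init__(self, alias):
        self.alias = alias

    def make_id(self, name):
        return "{}:{}".format(self.alias, name)


# With the global setting, the last writer wins for everybody.
set_global_namespace("acme")
set_global_namespace("other")  # e.g. another component, or another thread
global_id = make_id_global("indicator-1")

# With per-object state, two producers cannot disturb each other.
acme_id = Producer("acme").make_id("indicator-1")
other_id = Producer("other").make_id("indicator-1")
```

In a threaded program the two set_global_namespace calls could interleave with ID generation in any order, which is why the global variant is hard to reason about; the per-object variant has no shared mutable state to race on.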

Part 3. Wishful thinking

When I was first developing web clients with Python's httplib, it felt much like developing code with STIX/CybOX XML and libraries. Then, along came requests, and programming web clients became a pleasure. Now, the Python httplib documentation recommends requests. (I experienced a similar joy in moving from lxml to BeautifulSoup.)

What I'm looking for is a Python library (and data model) that is a joy to use for CTI. I'd like my code to look something like this:

from cti_joy import cti # very simple import statements, please

cti.Spammer("b0b").emails_from("b0b@bad.example.com").containing_link("http://bad.example.com/bait.php").phishing_for_login_credentials("facebook.com")

That's an example using the fluent style of programming, but something similar (and simpler?) could be done in a more imperative style.
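A minimal sketch of how such an interface could be built, shown in both styles. The cti_joy module does not exist; the Spammer class below is a hypothetical stand-in that only records what it is told, and its method names are taken from the wish-list line above.

```python
class Spammer:
    """Hypothetical fluent builder: every method returns self so calls chain."""

    def __init__(self, name):
        self.name = name
        self.facts = []  # (relation, value) pairs, in the order they were added

    def _add(self, relation, value):
        self.facts.append((relation, value))
        return self  # returning self is what makes chaining possible

    def emails_from(self, address):
        return self._add("emails_from", address)

    def containing_link(self, url):
        return self._add("containing_link", url)

    def phishing_for_login_credentials(self, site):
        return self._add("phishing_for_login_credentials", site)


# Fluent style: one chained expression.
fluent = (
    Spammer("b0b")
    .emails_from("b0b@bad.example.com")
    .containing_link("http://bad.example.com/bait.php")
    .phishing_for_login_credentials("facebook.com")
)

# Imperative style: the same object built one statement at a time.
imperative = Spammer("b0b")
imperative.emails_from("b0b@bad.example.com")
imperative.containing_link("http://bad.example.com/bait.php")
imperative.phishing_for_login_credentials("facebook.com")
```

Both variants end up with the same list of facts; the only difference is whether the caller chains the calls or issues them separately.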

As for the simpler data model, I have the sneaking suspicion that what I want is RDF. Something like this:

indicator:Spammer123 spam:emails_from b0b@bad.example.com

indicator:Spammer123 spam:contains_link http://bad.example.com/bait.php

indicator:Spammer123 phishing:login_credentials facebook.com

If I understand RDF correctly, it can be as strongly typed and validated as XML, and it's just as extensible. (RDF experts: please feel free to correct/enhance this understanding.)
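As a rough illustration of the triple shape only (not of RDF typing or validation), the three statements above can be held as plain Python tuples. This sketch deliberately avoids any RDF library; the prefixes are the ones from the message and are not a published vocabulary, and the objects helper and the second link are made up for the example.

```python
# Each statement is a (subject, predicate, object) triple.
triples = {
    ("indicator:Spammer123", "spam:emails_from", "b0b@bad.example.com"),
    ("indicator:Spammer123", "spam:contains_link", "http://bad.example.com/bait.php"),
    ("indicator:Spammer123", "phishing:login_credentials", "facebook.com"),
}


def objects(store, subject, predicate):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for s, p, o in store if s == subject and p == predicate)


senders = objects(triples, "indicator:Spammer123", "spam:emails_from")

# Extending the data needs no structural change: just add another triple.
# (The second URL is invented purely for illustration.)
triples.add(
    ("indicator:Spammer123", "spam:contains_link", "http://bad.example.com/bait2.php")
)
links = objects(triples, "indicator:Spammer123", "spam:contains_link")
```

A real RDF toolkit would add proper IRIs, literals, and schema-level checks on top of this; the point here is only that the data model reduces to a flat set of three-part statements.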

Part 4. How to get from here to there?

If you're still reading (thank you!), then you may be wondering, "Is this guy trying to undermine STIX/CybOX?" Not at all. Interoperability is important, and there are many systems out there that produce/consume the XML just fine. I don't want to disturb the standards at the machine level, because that's where we as a community hope to reap big wins against the Bad Guys.

I would, however, welcome a new wrapper library and/or a different data format that can be automatically and losslessly translated to/from the XML. Kind of like what requests did for HTTP processing, or what markdown has done for HTML. I think we as a community could reap big wins by being able to write code more easily.
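One way to read "losslessly" is as a round-trip property: converting the friendly form to XML and back should give the original again. The sketch below checks that property for a toy record using only the standard library. The element and attribute names (indicator, fact, relation) are invented for the example and are not STIX/CybOX; to_xml and from_xml are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def to_xml(record):
    """Serialize a simple record to a toy XML form (not STIX)."""
    root = ET.Element("indicator", {"id": record["id"]})
    for relation, value in record["facts"]:
        fact = ET.SubElement(root, "fact", {"relation": relation})
        fact.text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the toy XML form back into the same record structure."""
    root = ET.fromstring(text)
    return {
        "id": root.get("id"),
        "facts": [(f.get("relation"), f.text) for f in root.findall("fact")],
    }


record = {
    "id": "Spammer123",
    "facts": [
        ("emails_from", "b0b@bad.example.com"),
        ("contains_link", "http://bad.example.com/bait.php"),
    ],
}

xml_text = to_xml(record)
round_trip_ok = from_xml(xml_text) == record  # the lossless-translation check
```

A wrapper for the real schemas would need the same check run against a large corpus of existing documents, in both directions, before it could claim to be lossless.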

Beyond that, I don't have a great answer. I'm hoping maybe you do. Or, at least, we can ponder it together.

Have a great weekend, and I look forward to hearing your thoughts next week. Thanks!

John Anderson
