Subject: Towards joyful coding
Happy Friday! Here's something to ponder over the weekend. (This took me a while to write, so I hope you'll take some time to ponder, too, before responding. Thanks!)
Part 1. STIX/CybOX XML
I've been using STIX/CybOX for a couple months now. It has often been a struggle to locate the correct elements for expressing ideas. The STIX website has been very helpful, but sometimes I cannot find the relevant XSD or Python-STIX/CybOX code, even with liberal use of "grep". And sometimes I find something that seems relevant, but its location in the data model would make it awkward to use, like AttackerToolType (I still can't figure out where it plugs in).
I understand how XML promises extensibility. And because of our desire to embrace and use other standards, rather than recreating them, we have a construct like this:
<stix:Indicator xsi:type='indicator:IndicatorType'>
It seems odd to have to define the type of the element when it's already an "Indicator" element. That feels like having to write HTML like this:
<html xsi:type="html:HtmlType">
<head xsi:type="html:HtmlHeaderType">
<title xsi:type="html:HtmlHeaderTitleType">
indicator = Indicator()
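To make the redundancy concrete, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. The namespace URIs are modelled on what I believe STIX 1.x uses, so treat them as assumptions. The point is only that the element name and the xsi:type attribute say nearly the same thing twice, and a consumer has to read both.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions for illustration (modelled on STIX 1.x).
XML = """
<stix:Indicator
    xmlns:stix="http://stix.mitre.org/stix-1"
    xmlns:indicator="http://stix.mitre.org/Indicator-2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="indicator:IndicatorType"/>
"""

# ElementTree exposes namespaced attributes in "{namespace}name" form.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

elem = ET.fromstring(XML)

# Tags also come back as "{namespace}localname"; keep just the local part.
local_name = elem.tag.split("}", 1)[1]
declared_type = elem.get(XSI_TYPE)

print(local_name)
print(declared_type)
```

Note that the xsi:type value comes back as a plain string; resolving its "indicator:" prefix to a schema type is left to the consumer.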
Part 2. Python-STIX and Python-CybOX
And that brings us to the Python libraries. I am grateful to MITRE for producing these libraries, because they are easier than writing raw XML. Thank you, MITRE guys!
However, as a long-time Pythonista, I find myself struggling too much with the libraries. In large part, this is because the libraries are greatly influenced by the XML data model; when/if the XSDs get simpler, the libraries will also get simpler. As a simple example, this code is a direct result of the XML complexity:
It gets more complicated if you want to mark specific elements, because then you're coding XPaths in your Python.
There are other factors that affect usability, but the one I have to work around the most is the namespace/alias handling. I discovered that namespaces are set on a global level in both Python-STIX and Python-CybOX, and in earlier versions setting one did not necessarily set the other. (Thankfully, this is being corrected with mixbox in later versions.) In an otherwise object-oriented library, having a global setting like this is quite jarring. (And not thread safe.)
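The hazard with a process-wide setting can be sketched in a few lines. Nothing below is Python-STIX or Python-CybOX code: set_namespace_global, make_id_global, Exporter, and the alias values are hypothetical names made up for illustration. The sketch only contrasts a module-level setting, which every caller shares, with a setting held per object.

```python
# Hypothetical illustration; not the Python-STIX/CybOX API.

_GLOBAL_NAMESPACE = {"alias": "example"}  # one setting shared by every caller


def set_namespace_global(alias):
    """Mimic a library-wide setter: changes the alias for everyone."""
    _GLOBAL_NAMESPACE["alias"] = alias


def make_id_global(name):
    """Build an ID from whatever alias happens to be set right now."""
    return "{}:{}".format(_GLOBAL_NAMESPACE["alias"], name)


class Exporter:
    """Object-scoped alternative: each instance carries its own alias."""

    def __init__(self, alias):
        self.alias = alias

    def make_id(self, name):
        return "{}:{}".format(self.alias, name)


# Two producers in one process step on each other with the global setter.
set_namespace_global("acme")
set_namespace_global("initech")  # e.g. set by another thread or module
global_id = make_id_global("indicator-1")  # uses "initech", not "acme"

# With per-object state, each producer keeps its own alias.
acme_id = Exporter("acme").make_id("indicator-1")
initech_id = Exporter("initech").make_id("indicator-1")
```

In a threaded program the second set_namespace_global call could land between another thread's setter call and its make_id_global call, which is the thread-safety problem in miniature.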
Part 3. Wishful thinking
When I was first developing web clients with Python's httplib, it felt much like developing code with STIX/CybOX XML and libraries. Then, along came requests, and programming web clients became a pleasure. Now, the Python httplib documentation recommends requests. (I experienced a similar joy in moving from lxml to BeautifulSoup.)
What I'm looking for is a Python library (and data model) that is a joy to use for CTI. I'd like my code to look something like this:
from cti_joy import cti # very simple import statements, please
cti.Spammer("b0b").emails_from("b0b@bad.example.com").containing_link("http://bad.example.com/bait.php").phishing_for_login_credentials("facebook.com")
That's an example using the fluent style of programming, but something similar (and simpler?) could be done in a more imperative style.
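For what it's worth, the fluent style is cheap to prototype. The sketch below is purely hypothetical: cti_joy does not exist, and the Spammer class, its method names, and the dictionary it builds are invented here to show the mechanics (each method records a fact and returns self). It also shows the imperative variant producing the same result.

```python
# Hypothetical sketch; no such library exists.

class Spammer:
    def __init__(self, name):
        self.facts = {"name": name}

    def emails_from(self, address):
        self.facts["emails_from"] = address
        return self  # returning self is what makes chaining possible

    def containing_link(self, url):
        self.facts["containing_link"] = url
        return self

    def phishing_for_login_credentials(self, site):
        self.facts["phishing_for_login_credentials"] = site
        return self


# Fluent style: one chained expression.
fluent = (
    Spammer("b0b")
    .emails_from("b0b@bad.example.com")
    .containing_link("http://bad.example.com/bait.php")
    .phishing_for_login_credentials("facebook.com")
)

# Imperative style: the same calls, one statement each.
imperative = Spammer("b0b")
imperative.emails_from("b0b@bad.example.com")
imperative.containing_link("http://bad.example.com/bait.php")
imperative.phishing_for_login_credentials("facebook.com")
```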
indicator:Spammer123 spam:emails_from b0b@bad.example.com
indicator:Spammer123 spam:contains_link http://bad.example.com/bait.php
indicator:Spammer123 phishing:login_credentials facebook.com
If I understand RDF correctly, it can be as strongly typed and validated as XML, and it's just as extensible. (RDF experts: please feel free to correct/enhance this understanding.)
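As a rough illustration of why the triple form appeals to me, the three statements above can be held as plain (subject, predicate, object) tuples and queried with a one-line filter. This uses no RDF library at all, so it shows only the shape of the data, not RDF typing or validation; the prefixes are the made-up ones from the example, and the objects helper is a hypothetical name.

```python
# Plain-Python stand-in for a triple store; not a real RDF implementation.
triples = [
    ("indicator:Spammer123", "spam:emails_from", "b0b@bad.example.com"),
    ("indicator:Spammer123", "spam:contains_link", "http://bad.example.com/bait.php"),
    ("indicator:Spammer123", "phishing:login_credentials", "facebook.com"),
]


def objects(subject, predicate):
    """Return every object asserted for a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


links = objects("indicator:Spammer123", "spam:contains_link")
```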
Part 4. How to get from here to there?
If you're still reading (thank you!), then you may be wondering, "Is this guy trying to undermine STIX/CybOX?" Not at all. Interoperability is important, and there are many systems out there that produce/consume the XML just fine. I don't want to disturb the standards at the machine level, because that's where we as a community hope to reap big wins against the Bad Guys.
I would, however, welcome a new wrapper library and/or a different data format that can be automatically and losslessly translated to/from the XML. Kind of like what requests did for HTTP processing, or what markdown has done for HTML. I think we as a community could reap big wins by being able to write code more easily.
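One way to keep a friendlier format honest about "lossless" is a round-trip check: convert to XML, convert back, and confirm nothing changed. The sketch below uses a made-up three-field record and made-up element names (spammer, emails_from, and so on), not STIX/CybOX; only the round-trip check itself is the point.

```python
import xml.etree.ElementTree as ET

FIELDS = ("name", "emails_from", "containing_link")  # hypothetical fields


def to_xml(record):
    """Serialize a flat dict to a small XML document (invented vocabulary)."""
    root = ET.Element("spammer")
    for field in FIELDS:
        ET.SubElement(root, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the XML produced by to_xml back into a dict."""
    root = ET.fromstring(text)
    return {field: root.findtext(field) for field in FIELDS}


record = {
    "name": "b0b",
    "emails_from": "b0b@bad.example.com",
    "containing_link": "http://bad.example.com/bait.php",
}

xml_text = to_xml(record)
round_tripped = from_xml(xml_text)
```

A real translator would need the same check run against the full schema, including the cases a simple format cannot express, which is where I expect the hard part to be.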
Beyond that, I don't have a great answer. I'm hoping maybe you do. Or, at least, we can ponder it together.
Have a great weekend, and I look forward to hearing your thoughts next week. Thanks!
John Anderson