
Subject: Re: [cti] Database Subcommittee


Hey, Bret -


Top-posting, as that seems to be the list norm...


Regarding your points,


1) Wholly in agreement. This notion of a language pushing you to express things in _one_ way eventually convinced this die-hard Perl-monger to switch to Python. 😊


4) Humans tend toward flowery modes of expression, but if machines are to parse STIX and actually _do_ something with it without resorting to complex NLP-based solutions, we'd better set our sights on identifying and rooting out redundant complexities. It was towards this end that I proposed the 'human-readable' bit back when the Report Object was being outlined. While I no longer believe that this is the correct solution, it does highlight a significant emerging problem.


5) Regarding extension points, we might take a page from the OpenIOC playbook. When OpenIOC 1.1 was being prepared, Mandiant got a lot of requests from the community to support lots of taxonomic bits that were perceived as missing from the standard. Rather than clutter up the namespace, they decided to create a sort of XML scratch space at the top-level called parameters. If you required extension points, you could incorporate these into the parameters section under your own namespace. Other systems can safely ignore anything under parameters.


To give an example, I recently spoke to someone in financial services who was trying to suss out where in STIX to encode information regarding money mules (used in fraud/money laundering operations). One option would be to stick the data into a nested dict, serialize that to JSON, and drop that into a STIX Description field. But this is messy, and one tends to expect free-text fields to be human readable.


Alternatively, you could create (say) an FS-ISAC namespace under a STIX analogue to OpenIOC's Parameters block, drop your proposed Money Mule data structures in there, and exchange that structured information with other financial institutions without impacting the ability of other consumers to parse that STIX representation. Over time, the community could (taking this example) see that this Money Mule representation is widely used and decide to canonicalize it as part of the STIX standard.
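
To make the scratch-space idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the document layout, the "parameters" key, the "fs-isac" namespace, and the money-mule fields are hypothetical and are not real STIX or OpenIOC syntax. It only shows the behaviour described above: a consumer reads the namespaces it supports and safely ignores the rest.

```python
import json

# Hypothetical document layout; not real STIX or OpenIOC syntax.
report = {
    "title": "Fraud campaign summary",
    "description": "Free text meant for human readers.",
    "parameters": {
        # Each key is a namespace owned by whoever defined the extension.
        "fs-isac": {"money_mules": [{"alias": "mule-01", "role": "cash-out"}]},
        "some-other-vendor": {"internal_score": 7},
    },
}


def read_extensions(doc, supported):
    """Return only the extension blocks whose namespace the consumer supports."""
    params = doc.get("parameters", {})
    return {ns: data for ns, data in params.items() if ns in supported}


def core_fields(doc):
    """Return the document without its extension scratch space."""
    return {key: value for key, value in doc.items() if key != "parameters"}


# Simulate an exchange: serialize on one side, parse on the other.
received = json.loads(json.dumps(report))

# A financial institution that understands the (hypothetical) fs-isac namespace.
fs_view = read_extensions(received, {"fs-isac"})

# A generic consumer that supports no extensions still gets the core fields.
generic_view = read_extensions(received, set())
generic_core = core_fields(received)
```

The free-text description stays human readable, and the structured data lives under its own namespace rather than being serialized into a text field.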


[0]: https://github.com/mandiant/OpenIOC_1.1#the-parameters-section



Cheers,
Trey
--
Trey Darley
Senior Security Engineer
Soltra | An FS-ISAC & DTCC Company
www.soltra.com



From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Jordan, Bret <bret.jordan@bluecoat.com>
Sent: Saturday, June 20, 2015 22:55
To: Patrick Maroney
Cc: Jerome Athias; Alex Pinto; cti@lists.oasis-open.org
Subject: Re: [cti] Database Subcommittee
 
Just to be clear, I believe the languages should be very expressive, and when I say simple, that does not mean lack of expressiveness.  What I want is:

1) Ease of use ("simple").  Meaning, there should be only one way of doing things and it should be super intuitive to do so.  We should not have to figure out and have discussions about how to map this basic thing or how to do that basic thing.  It should just make sense.  And there should be only one way of doing it.

2) Common things should be super easy and clean, we should not have a lot of extra markup to support them.  Take my +1 example from the past.  The fact that I have 2000% markup for just a "+1 I have seen this too" is problematic when we need to scale.

3) I want a solution that is performant and super efficient.  I am not worried about the design for today; 10,000 indicators a day is not what I am thinking about.  I am thinking more of the 100 million to 1 billion a day situation, when STIX and TAXII are running on everyone's handhelds.  This is also going to impact the backend database systems and how we correlate and merge data.

4) The language should work very well in code. The goal of this project is for Machine-2-Machine learning, so therefore it needs to work well without any human interaction and it should be implementable in all forms of languages and delivery platforms.

5) We need to standardize on extension points and things we bring into the language so that we all do the same thing. Infinite extensibility is bad and very problematic. People worry about introducing some other binding language and that we will be fracturing the market. Well, I have bad news: we already have this problem today with extension points.  If you go through and implement a series of extension points (even OASIS CIQ) that I do not support, guess what?  I cannot use any of your data.  Let's figure out what we are going to do and standardize on it.

I want a solution that is easy and simple to use.  This is the only way we get out of the early adopter phase and move across the chasm into mainstream adoption.

Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 62A6 5999 0F7D 0D61 4C66 D59C 2DB5 111D 63BC A303
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

On Jun 19, 2015, at 16:03, Patrick Maroney <Pmaroney@Specere.org> wrote:

Don't disagree with the objectives in terms of identifying and addressing "Impediments to Adoption" (e.g., "I've got 10 Billion Indicators on my Soltra-Edge Gateway, with 10,000s more arriving daily, now what do I do with it?").

However, I do want to emphasize what I think is a key point (and we circle around this point often when we engage in discourse on the merits, pro and con, of the inherent complexity/flexibility of expression in CTI*):

Some of the foundational objectives of CTI were the definition and common adoption of a rich, highly expressive, flexible, and extensible [Language] for the [<== Inter-Exchange ==>] of Cyber Intelligence.  This approach provides the framework that we need to transform Cyber Intelligence from "My" Data Model to "Your" Data Model, while maintaining as much fidelity and context as is practical for a given Use Case.

Now, as a natural (and intentional) outcome, as more CTI producers, consumers, and "processors" (e.g., Vendor Products, Shared Community Tools and Frameworks) increasingly adopt common schematic representations and/or common Taxonomies, the once ephemeral Inter-exchange vehicle/package becomes increasingly persistent as more and more producers, consumers, and "processors" can reliably process CTI in this (currently two-dimensional) "STIX" form.  This is a powerful outcome.

However, this should not distract from the core objectives of expressiveness for any "thing" in the Cyber-Domain, whether it's the sharing of a single Indicator or the potential for new concepts like real-time, dynamically evolving shared Analytics Models and Training Data Sets (i.e., expressed in PMML and conveyed through STIX/TAXII).  These concepts require multi-dimensional semantic and temporal data models that need to be expressed with the same precision as XML.

So [+1] for community developed shared tooling, reference implementations, etc. using any available data format (Relational, NoSQL, Property Graphs, RDF... I still chew up a lot of raw CSV/TSV Table Data), but I argue for a focus on expressiveness in the language rather than the focus on, and perhaps constraint of, the language that some have argued for as a means to reduce complexity for their current Use Cases and/or World Views.

Everyone is "right", including the "Less Filling" camp, so I'm just voting for the "Tastes Great" side.

BTW: That's probably my third use of what may be American Colloquialisms to express underlying concepts in less than 24 Hours, so I'm self-declaring a "foul" on myself.



Patrick Maroney
Office:  (856)983-0001
Cell:    (609)841-5104


* CTI:   (...it's really nice to be able to stop typing STIX/CybOX/TAXII  ;-).

From: <cti@lists.oasis-open.org> on behalf of "Jordan, Bret" <bret.jordan@bluecoat.com>
Date: Friday, June 19, 2015 at 2:58 PM
To: Jerome Athias <athiasjerome@gmail.com>
Cc: Alex Pinto <alexcp@mlsecproject.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Database Subcommittee

And this is a case in point of why I think you would be a great Co-Chair, or Chair if Eric cannot do it, for this work.


Thanks,

Bret




On Jun 19, 2015, at 12:42, Jerome Athias <athiasjerome@GMAIL.COM> wrote:

Users basically have an issue managing and sharing information
because this information is stored in various ways and
representations (because of different and non-interoperable
software).
A common language was needed (just like we use English here, because
we don't all speak Chinese, Spanish, French or Russian, etc.).

While we want humans to share information between each other, by using
machines, these humans (users) need a way* to talk to machines
(computers).
If this language is seen/perceived as too complex (complicated, large
or 'grammatically' difficult to learn before being able to make "valid
sentences with the available words and rules of the language") by the
users; we need to assist them*.
(Computer programming is there to assist.)
More than just a "database subcommittee", this subcommittee could
support Software engineering.
https://en.wikipedia.org/wiki/Software_engineering

Relational databases represent, IMHO, a significantly interesting idea
to be explored to support and potentially enhance-extend the current
specifications, and facilitate understanding and Software development
around and using the STIX family of domain-specific languages (DSL).

I think that relational database schemas based and designed on the
data format specifications -could- facilitate the use of 4GL tools in
order to build or generate, more easily and quickly, Graphical User
Interfaces* intended to greatly simplify the 'complexity' of the
OASIS-CTI languages.
Being a mature development approach (explored for decades), this could
provide benefits such as a larger pool of skilled and available
resources (developers).

PS: Furthermore, the possible resultant application programming
interfaces (APIs), could be used for M2M communications too.
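
As a rough illustration of what a schema "designed on the data format
specifications" might look like, here is a minimal sketch using Python's
built-in sqlite3 module. The table names, columns, and identifiers are
assumptions made for this example only; they are not taken from the STIX
or CybOX specifications and are not a proposed mapping.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified schema: one table per object type and
# a link table for the many-to-many relationship between them.
conn.executescript("""
CREATE TABLE indicator (
    id    TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE observable (
    id    TEXT PRIMARY KEY,
    kind  TEXT NOT NULL,
    value TEXT NOT NULL
);
CREATE TABLE indicator_observable (
    indicator_id  TEXT NOT NULL REFERENCES indicator(id),
    observable_id TEXT NOT NULL REFERENCES observable(id),
    PRIMARY KEY (indicator_id, observable_id)
);
""")

# Sample rows; 192.0.2.10 is from a range reserved for documentation.
conn.execute("INSERT INTO indicator VALUES (?, ?)",
             ("indicator-1", "Suspicious IP"))
conn.execute("INSERT INTO observable VALUES (?, ?, ?)",
             ("observable-1", "ipv4-addr", "192.0.2.10"))
conn.execute("INSERT INTO indicator_observable VALUES (?, ?)",
             ("indicator-1", "observable-1"))


def observables_for(connection, indicator_id):
    """Return (kind, value) pairs linked to the given indicator."""
    rows = connection.execute(
        "SELECT o.kind, o.value FROM observable AS o "
        "JOIN indicator_observable AS io ON io.observable_id = o.id "
        "WHERE io.indicator_id = ? ORDER BY o.id",
        (indicator_id,),
    )
    return rows.fetchall()
```

A function like observables_for is the kind of small query API that a GUI
or an M2M interface could be built on top of.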


2015-06-19 19:50 GMT+03:00 Jordan, Bret <bret.jordan@bluecoat.com>:
Great points..


Thanks,

Bret




On Jun 19, 2015, at 10:43, Alex Pinto <alexcp@mlsecproject.org> wrote:

Ok, I think I get it now. By focusing on the “database” part of it, I became
a bit confused. Since the standards describe a data definition format, the
trivial and obvious solution would be to translate that to a relational
database verbatim. I am glad I am not an investor in these startups that are
struggling to do something like that. ;)

However, the larger implementation problem is a good one to address, in my
opinion. What I have seen people struggling with is how to efficiently
translate the data they have INTO the STIX data format. Say I have an IP
address indicator from the “Tap-dancing Penguin” threat actor. What is the
correct, unambiguous path to translate that to the equivalent STIX object?
Do I need to create an Observable? Only an Indicator? Can Threat Actors be
linked directly to Indicators, or do you need to have an Observable to do
that?

However, I STILL think that having something along the lines of these
“suggested recipes” of normal use-cases should be a part of each
subcommittee. Because if there is more than one way to generate the same
piece of information in the STIX format, we would have failed to describe an
actual interoperable standard.

Don’t get me wrong. I LOVE this idea of making it more developer-friendly,
but I want to make sure we are focusing on the right things here.

Cheers,
Alex
--
Alex Pinto
Niddel
http://niddel.com
https://mlsecproject.org

On Jun 19, 2015, at 6:26 PM, Jordan, Bret <bret.jordan@bluecoat.com> wrote:

The subcommittee would create work products, documentation, and best
practices for using STIX, TAXII and CYBOX.  As I talk with start-ups and
other implementors / integrators, I hear a common theme.  "How do we
actually store this data and what are the best practices for doing so?"
This working group, in my mind, would address those issues and report back
to the TC with recommended best practices, examples, and documentation on how
to build the databases to actually make use of STIX, TAXII, and CYBOX.

You could even put in scope the query functions that should exist for each
language and how best to do those.  It would be nice to have a working group
focused on this effort.  And IMHO, I think this would help get a lot of
people new to STIX and TAXII up and running more quickly.


Thanks,

Bret




On Jun 19, 2015, at 08:55, alexcp@mlsecproject.org wrote:

I need some more time to structure a more complete response right now
(trying to catch flights out of Berlin) but I am really struggling to
understand how this can possibly be in the scope of the standard.

Could you please elaborate how the actual database format would be relevant
for the standard discussion?



On Fri, Jun 19, 2015 at 4:28 PM, Jordan, Bret <bret.jordan@bluecoat.com>
wrote:

And I would nominate Jerome to Co-Chair this with Eric Burger.


Thanks,

Bret




On Jun 19, 2015, at 02:14, Jerome Athias <athiasjerome@GMAIL.COM> wrote:

+1

2015-06-19 6:11 GMT+03:00 Jordan, Bret <bret.jordan@bluecoat.com>:

About 9 months ago or so we tossed around the idea of setting up a
Subcommittee / Working group to look into database requirements and build
prototype examples for storing STIX and/or TAXII data.  I would like to
propose that we do that here at OASIS and I would nominate Eric Burger to
Chair this committee.  He is after all a professor of computer science that
teaches database theory...  I think we would be very lucky to have him run
this group.


Thanks,

Bret








This e-mail message and any files transmitted with it contain legally
privileged, proprietary information, and/or confidential information,
therefore, the recipient is hereby notified that any unauthorized
dissemination, distribution or copying is strictly prohibited. If you have
received this e-mail message inappropriately or accidentally, please notify
the sender and delete it from your computer immediately.









