cti-users message

Subject: Re: [cti-users] [cti-stix] [cti-users] MTI Binding


Right, we want to move to RDF so the meaning is encoded and people are free to exchange the CTI content on the wire in any of the various RDF-based exchange formats (JSON-LD, RDF/XML, Turtle, etc.). That supports the same content across multiple formats, all with the meaning encoded.
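
As a rough illustration of "same content, multiple formats", here is a minimal Python sketch using only the standard library. The `http://example.org/cti#` namespace and the terms `indicates` and `usedBy` are invented for the example and are not STIX vocabulary; a real deployment would more likely use an RDF library than hand-written serializers.

```python
import json

# Hypothetical namespace and terms, invented for illustration only.
EX = "http://example.org/cti#"

# One set of statements, held as (subject, predicate, object) IRI triples.
triples = [
    (EX + "indicator-1", EX + "indicates", EX + "malware-7"),
    (EX + "malware-7", EX + "usedBy", EX + "actor-3"),
]


def to_ntriples(ts):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in ts)


def to_jsonld(ts):
    """Serialize the same triples as JSON-LD in expanded form."""
    nodes = {}
    for s, p, o in ts:
        node = nodes.setdefault(s, {"@id": s})
        node.setdefault(p, []).append({"@id": o})
    return json.dumps(list(nodes.values()), indent=2)


def from_jsonld(text):
    """Read the expanded JSON-LD produced above back into triples."""
    out = []
    for node in json.loads(text):
        for key, values in node.items():
            if key == "@id":
                continue
            for value in values:
                out.append((node["@id"], key, value["@id"]))
    return out


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(to_jsonld(triples))
```

Both outputs carry the same statements; reading the JSON-LD back with `from_jsonld` yields the original triples, which is the property the format choice is meant to preserve.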

On Mon, Oct 5, 2015 at 3:14 PM, Jordan, Bret <bret.jordan@bluecoat.com> wrote:
So it seems like you are using RDF as a means to make a case for STIX in the first place.  Yes, RDF would be interesting if people wanted to share information and there was no standard way of sharing it, thus you need to attach meaning to the data.  

This is what STIX does.  It makes a standard way for sharing CTI where the meaning is defined in the specification.  So all of the idioms and elements in STIX have meaning.  We are not trying to build a protocol to share arbitrary CTI, but rather, build a structure so that everyone's CTI can be encoded in STIX so that everyone can understand it. 


Thanks,

Bret



Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg." 

On Oct 5, 2015, at 12:56, Shawn Riley <shawn.p.riley@GMAIL.COM> wrote:

For the human user/analyst/scientist working with the CTI data, exchanging it as RDF has a significant advantage over XML or JSON because the meaning of the data is encoded in the RDF, so instead of just reading the data the technology can understand what the data means. If the technology can understand what the data means, it can put the pieces of data together like a jigsaw puzzle grand master, essentially organizing and “connecting the dots” for the human users. This supports technology-enabled modernization of analytic tradecraft for individual users/analysts/scientists as well as improvement of overall organizational cyber defense analytic tradecraft. If the meaning of the data is encoded with it, then the data is easier for scientists, analysts, and increasingly non-expert users to understand and act on.
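
To make "the technology can understand the meaning" a little more concrete, here is a small Python sketch of one kind of inference that encoded meaning enables: a subclass hierarchy lets software answer "which samples are malware?" even when a sample was only labeled with a more specific class. The `ex:` terms are hypothetical, the prefixes are left abbreviated for readability, and the traversal is a simplified stand-in for what an RDFS reasoner would do.

```python
# Hypothetical vocabulary; "ex:" terms are invented for this example.
facts = {
    ("ex:Trojan", "rdfs:subClassOf", "ex:Malware"),
    ("ex:Malware", "rdfs:subClassOf", "ex:Tool"),
    ("ex:sample-7", "rdf:type", "ex:Trojan"),
    ("ex:sample-9", "rdf:type", "ex:Malware"),
}


def superclasses(cls, facts):
    """Return cls plus every class it transitively specializes."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in facts:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, facts):
    """Return subjects whose asserted type is cls or any subclass of cls."""
    return {
        s
        for s, p, o in facts
        if p == "rdf:type" and cls in superclasses(o, facts)
    }


if __name__ == "__main__":
    # sample-7 was only labeled a Trojan, yet it is found as Malware.
    print(sorted(instances_of("ex:Malware", facts)))
```

With plain XML or JSON, the knowledge that a Trojan is a kind of malware would live in the analyst's head or in application code; here it travels with the data.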


Below is some of the typical cybersecurity data and information users/analysts/scientists have to organize into some type of body of knowledge so they understand their cybersecurity ecosystem. If the technology can’t understand the meaning of the data then it’s the humans who have to understand it and “connect the dots”.


Configuration/Anomaly Reporting - Infrastructure Information - Risk Posture - Anomalies


Knowledge of Threat Actors - Threat Actor Infrastructure - Threat Actor Personas - Collected Threat Actor Indicators - Threat Actor Attribution - Trend Analysis - Victim Information


Incident Awareness - Incident Information - Incident Data - Infrastructure Impact and Effects - Investigations/cases - Alerting Indicators - Victim Information


Indications and Warnings - Events and Alerts - Tipping and Cueing - Warnings - Impact assessments - Potential Indicators


Vulnerability Knowledge - Vulnerabilities - Exploits - Potential Victim Information


Mitigation Strategies - Coordinated Action Plans - Courses of Action - Understanding of Achievable Mitigation Effects


Mitigation Actions and Responses - Computer Network Defense Situational Awareness - Action Tasking and Status - Effectiveness Reporting - After Action Reporting and Lessons Learned


Sadly, the analytic tradecraft needed to understand CTI and wider cybersecurity data, operationalize the CTI data, connect the dots between malicious activities, etc. varies widely: some organizations have more than 15 years of CTI analytic tradecraft experience, while others are just starting out and have basically no analytic tradecraft.


The Semantic eScience research paper I shared with the list discussed a Semantic eScience technology stack to collect all that cybersecurity data and information in order to apply object-based production using RDF. This research investigated how to advance and modernize analytic tradecraft with science and big data technology.


Object-based production allows the technology stack to systematically organize a body of cybersecurity knowledge for the operational ecosystem so it can enable the human cyber defenders to be more efficient and effective in the intellectual and practical activity encompassing the systematic study of the structure and behavior of the cybersecurity in the operational ecosystem.


I thought perhaps understanding a use case for STIX RDF data might help others to see another point of view. 


Shawn


On Mon, Oct 5, 2015 at 2:13 PM, Jason Keirstead <Jason.Keirstead@ca.ibm.com> wrote:

I think you're making a big assumption that most systems used to share CTI will be internet facing and/or publicly accessible, which isn't true.

The odds that CTI System A (originator of STIX Name space "CompanyA.com") can communicate directly with CTI System B (originator of STIX Name space "CompanyB.com") are actually quite low.

This is why the data has to be copied, otherwise there is no way at all to build a knowledge graph.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown



From: "Bush, Jonathan" <jbush@dtcc.com>
To: "'Jordan, Bret'" <bret.jordan@bluecoat.com>
Cc: "Sean D. Barnum" <sbarnum@mitre.org>, Jane Ginn <jane.ginn@gmail.com>, "Wunder, John A." <jwunder@mitre.org>, "cti-users@lists.oasis-open.org" <cti-users@lists.oasis-open.org>, "cti-stix@lists.oasis-open.org" <cti-stix@lists.oasis-open.org>
Date: 2015/10/05 02:48 PM
Subject: [cti-users] RE: [cti-stix] [cti-users] MTI Binding
Sent by: <cti-users@lists.oasis-open.org>





The use-case I had in mind was that instead of tools transporting data around the internet, copying it from one place to another, we could just have the data link to other referenced data that exists somewhere else in the CTI “ecosystem”. Why move all that data around when I can just point to it? And… if that leads me to a place that then points to other data locations, before long I will have a “net” of data that I can use to perform complex analytical analysis to answer questions that would otherwise be very difficult.
(I suppose I’m defining the semantic web here)

… at least that was where my head was at with it.
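
The "point to it instead of copying it" idea can be sketched in a few lines of Python. Everything here is hypothetical: the URLs, the `refs` field, and the in-memory `PUBLISHED` dictionary that stands in for documents hosted at different locations (no network access takes place). The crawl also records references it cannot resolve, which is the situation Jason describes when one system cannot reach another.

```python
from collections import deque

# In-memory stand-in for documents published at different locations.
# The URLs and the "refs" field are hypothetical.
PUBLISHED = {
    "https://a.example/indicator/1": {
        "type": "indicator",
        "refs": ["https://b.example/malware/7"],
    },
    "https://b.example/malware/7": {
        "type": "malware",
        "refs": ["https://c.example/actor/3"],
    },
    # https://c.example/actor/3 is deliberately absent (unreachable).
}


def crawl(start):
    """Follow references breadth-first; return (edges, unresolved URLs)."""
    edges, unresolved = [], []
    seen, queue = {start}, deque([start])
    while queue:
        url = queue.popleft()
        doc = PUBLISHED.get(url)
        if doc is None:
            # The link exists, but the target cannot be fetched.
            unresolved.append(url)
            continue
        for ref in doc["refs"]:
            edges.append((url, ref))
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return edges, unresolved


if __name__ == "__main__":
    edges, unresolved = crawl("https://a.example/indicator/1")
    print(edges)
    print(unresolved)
```

The `edges` list is the "net" of data; the `unresolved` list shows where the net stops unless the referenced content is copied to somewhere reachable.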

From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jordan, Bret
Sent: Monday, October 05, 2015 1:42 PM
To: Bush, Jonathan
Cc: Sean D. Barnum; Jane Ginn; Wunder, John A.; cti-users@lists.oasis-open.org; cti-stix@lists.oasis-open.org
Subject: Re: [cti-stix] [cti-users] MTI Binding

I have been reading a lot about JSON-LD, and I get how and why it might be interesting in a website context when you are sharing unknown data back and forth. Meaning there is no standard for the data you are sharing. Think user profile between Google, Twitter, Facebook etc. But, unless I am mistaken, the purpose of STIX is to define a standard for CTI so that we all share the same data.

Can someone explain why JSON-LD is needed in the CTI context? I just do not see why anyone who is building an application to use CTI would care, since all of the data that will be shared between them is KNOWN and in a standard, well-known form, aka STIX... Please help me understand this use case.


Thanks,

Bret



      On Oct 5, 2015, at 11:20, Bush, Jonathan <jbush@dtcc.com> wrote:

      I would agree that some of the technologies involved with the Semantic web scare me a little bit (very complex, many seem pretty academic), but at least if we go with a structure that sets us up for this sort of “linked” data thinking now, we leave that door open for the future.

      From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jordan, Bret
      Sent: Monday, October 05, 2015 1:11 PM
      To: Bush, Jonathan
      Cc: Sean D. Barnum; Jane Ginn; Wunder, John A.; cti-users@lists.oasis-open.org; cti-stix@lists.oasis-open.org
      Subject: Re: [cti-stix] [cti-users] MTI Binding

      http://manu.sporny.org/2014/json-ld-origins-2/

      Thanks,

      Bret



          On Oct 5, 2015, at 10:34, Bush, Jonathan <jbush@dtcc.com> wrote:

          Great information here, thank you Sean.

          It sounds like we are talking about
          Use-Cases -> UML (or OWL/RDF) -> JSON-LD (context portion mapping model to JSON constructs to use) -> JSON (actual instances of content)

          I could get behind that.

          From: Barnum, Sean D. [mailto:sbarnum@mitre.org]
          Sent: Monday, October 05, 2015 10:14 AM
          To: Bush, Jonathan; 'Jane Ginn'; Wunder, John A.
          Cc: cti-users@lists.oasis-open.org; cti-stix@lists.oasis-open.org
          Subject: Re: [cti-stix] Re: [cti-users] MTI Binding

          I don’t think I would concur with Jane’s characterization that this represents a “significant shift in the level at which we are approaching the problem set”. Rather, I think it fits very well into how we have been approaching the problem but may represent an acceleration along our path. We began the STIX efforts leveraging XSD to specify the language because we as a community did not yet know or agree on what CTI was and XSD provided a good structured mechanism everyone could deal with to collaboratively define and experiment in an explicit way. It was recognized since the beginning that this was only a temporary approach and that we would eventually need a richer more semantic specification with a derived abstraction stack like I described in my earlier post (ontology/data-model, binding specs, implementations, instance data). We have been working along this path as a community gradually separating out and establishing these abstractions for quite a while now.

          As we have evolved the technology along the planned path we have tried to be mindful not only of the raw technical capabilities of given technologies to support targeted use cases and represent necessary information but also of the community’s ability and willingness to understand, accept and adopt them. I and several others in the community have long posited that a full semantic model (likely using something like OWL/RDF) would likely be the best long term form for the ontology/data-model layer as that is exactly what such technologies are designed for and would provide excellent flexibility but also excellent consistency and potential for automated transformation. To date this has been viewed as a goal to work towards but not something to be pushed too soon as the community may not be familiar enough with it yet. This led to the interim step of leveraging UML to specify the normative model for the language.

          The UML-based specifications we have today, while not fully semantic, provide us the practical ability to fully instantiate the desired abstraction stack for STIX. I continue to hold the opinion (as do several others in the community like Pat, Paul, Shawn, etc.) that we should continue to evolve towards a full explicit semantic form of specification for the ontology/data-model but it is still unclear how fast that evolution should occur. The plan discussed several times for STIX 2.0 was to stick with the UML + text docs form but begin to work in some semantic modeling snippets as part of discussions on a few of the refactoring issues (e.g. separating out relationships) that are semantic in their roots and likely include a few OWL-style diagrams in the spec text docs. This could then be a good introduction to these approaches for the community and an initial basis for future evolution to fully semantic models.

          While Jonathan is correct that most XML-based implementations would require some major retooling to support a JSON-based serialization form, I don’t think that JSON-LD inherently brings any further burden than that. To be clear, instance STIX data using a JSON-LD approach would still be “pure JSON”. It is just that the structure of that JSON would be aligned to the ontology/data-model higher in the abstraction stack. To use a simple analogy, think of “pure JSON” not as a human language like English or French but rather as something far lower level such as human cursive handwriting. It can be used to convey all sorts of different semantics and structure (English, French, etc.). The LD/context portions of JSON-LD are what allow someone reading the cursive to recognize whether they are reading English or French and to understand the actual meaning of what is being written. This mapping of meaning from the low-level serialization format to the higher-level ontology/data-model is what the two middle layers of the abstraction stack are all about. These layers are required to be there no matter which approach we take. Without these layers, expressed content, regardless of whether it is JSON, XML, protobuf or whatever, would not be interpretable or interoperable. JSON-LD provides one option for defining these layers for a targeted “pure JSON” end serialization for instance data (which is what I believe Bret and several others really want). Another option would be to write the JSON binding spec as some other form of rules (including potentially human language) and then specify the implementation using something like JSON Schema. JSON-LD simply provides an explicit structured way to tackle these two layers.
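
The role of the context can be shown with a short Python sketch. The context below, its IRIs, and the `title`/`confidence` keys are hypothetical and not taken from any STIX specification, and `expand` is a deliberately simplified stand-in for the JSON-LD expansion algorithm (it handles only flat documents with simple term-to-IRI mappings).

```python
import json

# Hypothetical context: maps short JSON keys to ontology IRIs.
CONTEXT = {
    "title": "http://example.org/cti#title",
    "confidence": "http://example.org/cti#confidence",
}

# The instance data is ordinary JSON; nothing about it is special.
instance = {
    "@id": "http://example.org/indicator/1",
    "title": "Beacon to known C2",
    "confidence": "High",
    "note": "local only",
}


def expand(doc, context):
    """Replace keys the context defines with their full IRIs.

    Keys starting with "@" are treated as JSON-LD keywords and kept,
    except "@context" itself. Keys the context does not define are
    dropped, as JSON-LD expansion does for unmapped terms when no
    default vocabulary is set.
    """
    out = {}
    for key, value in doc.items():
        if key.startswith("@"):
            if key != "@context":
                out[key] = value
        elif key in context:
            out[context[key]] = value
    return out


if __name__ == "__main__":
    print(json.dumps(expand(instance, CONTEXT), indent=2))
```

A consumer that ignores the context still sees plain JSON with short keys; a consumer that applies it can tie each key to a term in the ontology/data-model, which is the mapping the two middle layers provide.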


          Does that make sense?
          Again, anyone knowledgeable on these topics should feel free to point out where they believe my characterizations are incorrect or unclear.

          Sean

          From: "cti-stix@lists.oasis-open.org" on behalf of "Bush, Jonathan"
          Date: Saturday, October 3, 2015 at 8:15 AM
          To: 'Jane Ginn', John Wunder
          Cc: "cti-users@lists.oasis-open.org", "cti-stix@lists.oasis-open.org"
          Subject: RE: [cti-stix] Re: [cti-users] MTI Binding

          I would agree. JSON-LD could be an incredibly powerful way to represent intelligence data, but it represents a fundamental shift that will require a major retooling for most implementations to really take advantage of it. The good news is that tools (such as Soltra products to be all about “me” for a second) could ease into that implementation by thinning the implementation down to pure JSON at first (I believe, someone correct me if I’m wrong here). The real question is, will we as implementers get to the point where we really jump all in and represent data using the “LD” portion of the concept?

          Again, looks promising (after all, if Google and Facebook are using it to represent complex data, why shouldn’t we be paying attention), but do we all know what we would be buying in to?

          From: cti-stix@lists.oasis-open.org [mailto:cti-stix@lists.oasis-open.org] On Behalf Of Jane Ginn
          Sent: Friday, October 02, 2015 8:45 PM
          To: Wunder, John A.
          Cc: cti-users@lists.oasis-open.org; cti-stix@lists.oasis-open.org
          Subject: [cti-stix] Re: [cti-users] MTI Binding

          Hi All:
          While reading through this thread it occurred to me that the JSON-LD suggestion represents a significant shift in the level at which we are approaching the problem set. Cory has long been arguing for us to shift our focus to a semantic model that can serve as a language-agnostic approach to solving the CTI sharing problem. Bret has been pushing for JSON as a tool to help us achieve more widespread adoption. We currently have bindings in XML and Python... but no MTI for moving forward with STIX 2.0.
          JSON-LD appears to address several of our issues at a higher level of abstraction.
          I'm also intrigued by the potential, from the POV of STIX consumers, for PMML to be deployed seamlessly to use wire-speed data on attacks for predictive modelling... or at least for deploying the myriad of tools for predictive modelling. I expect this is an area of white space in the market that will be picked up by a vendor and developed as an enterprise solution. We just need to get the front end right for the integration.
          Jane Ginn
          Cyber Threat Intelligence Network


          DTCC DISCLAIMER: This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error, please notify us immediately and delete the email and any attachments from your system. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email.

