Subject: Re: [cti] Database Subcommittee / conceptual/logical model subcommittee
My personal opinion is that there should not necessarily be separate subcommittees for abstract semantics or for database design.
It seems that any particular guidance or other results would need to be specific to a given language (STIX or CybOX) rather than general, and as such would best be handled as work items within each language SC.
To me, as a general rule, more SCs will lead to more complexity in communication and coordination and a far greater risk of overlap and conflict between SCs. Let’s create them where they are necessary for a specific scope of work that does not conflict with the scope of existing SCs.
sean
From: "Jane Ginn - jg@ctin.us" <jg@ctin.us>
Date: Wednesday, June 24, 2015 at 2:05 PM
To: "Barnum, Sean D." <sbarnum@mitre.org>, Jerome Athias <athiasjerome@gmail.com>, "cory-c@modeldriven.com" <cory-c@modeldriven.com>
Cc: "Eric.Burger@georgetown.edu" <Eric.Burger@georgetown.edu>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Database Subcommittee / conceptual/logical model subcommittee

All:

Building on Cory's suggestion... Jerome's observations... and Sean's note about using OWL or RDFS....

Would it make sense to establish a Sub-Committee that combines some of the issues associated with database design that have been discussed previously (RDBMS vs. NoSQL) with this need for clarification at the abstract level (conceptual & logical)?

If so.... would the scope of such a Sub-Committee also cover implementation and tooling issues as was earlier suggested by Patrick?

Further, what would be the tangible outputs, and how would they map to the STIX/TAXII/ & CYBOX Sub-Committees?

Jane Ginn, MSIA, MRP

-------- Original Message --------
From: "Barnum, Sean D." <sbarnum@mitre.org>
Sent: Wednesday, June 24, 2015 10:41 AM
To: Jerome Athias <athiasjerome@gmail.com>, Cory Casanave <cory-c@modeldriven.com>
Subject: Re: [cti] Database Subcommittee / conceptual/logical model subcommittee
CC: Eric Burger <Eric.Burger@georgetown.edu>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>

I just wanted to add a note of clarification here for the intent/scope of STIX and CybOX to date.
STIX and CybOX are intended to be Languages for expressing cyber threat information and cyber observable information respectively.
As such, they are more than simple data models or schemas. They also involve the conceptual model for their scope.
To date, the work of this community has been emergent and exploratory: it has sought not only to formalize expressive representations for cyber threat information but also to work out, collaboratively and iteratively, what that even meant. This led to some necessary choices to work from the bottom up.
This is why the language has initially been developed, refined and defined in the form of XML schema. The schematic level of abstraction gave us something concrete to discuss, a way to model specific technical details, and a basis for experimenting with real-world data and implementations in order to iterate and improve. XML schema was chosen not because it is some magical answer that everyone everywhere should use but rather because it is ubiquitous, is supported by a mature body of tooling and synergistic standards (XPath, XPointer, XQuery, etc.), and provides a powerful formal schema language to explicitly constrain syntax while enabling necessary flexibility. All of these things were needed to model and evolve a representation of an emergent knowledge space among a very diverse set of players.
This approach served us well and got us to where we are today, but it has always been recognized that specifying the language at this level of abstraction has significant downsides. First, it is difficult to define semantics and high-level concepts effectively at this level. Second, choosing any particular technical implementation (XML, JSON, etc.) inherently introduces technology-specific characteristics that really are not part of the more generalized language.
In recognition of this, it has always been the plan to move the specification of the languages to a more general form once an appropriate level of maturity and stability had been reached (very similar to the plan to move to a formal standards body at the appropriate time). The first steps toward this were put into motion several months back when work began on an implementation-independent specification for STIX and a separate but related one for CybOX. It was decided that, based on community needs and maturity, the appropriate first step in generalization would be to capture language structure and syntax in the form of a UML model accompanied by a set of textual specifications that explain and characterize the UML model in a more human-consumable form.
The draft set of these specifications for STIX 1.1.1 is currently available in the STIXProject on GitHub, and the versions updated to STIX 1.2 should be completed within the next couple of weeks. This will be the primary normative contribution to the CTI TC. There is also a UML model available for CybOX, but a full set of accompanying textual specs similar to those for STIX will not be created before the transition to the CTI TC, so that work will likely fall to the CybOX SC.
While UML models are formal and are abstracted from particular syntactic implementations (XML, JSON, etc.), they are not, in all honesty, really built to convey high-level conceptual models or explicit semantics of knowledge. They can be somewhat twisted to serve this purpose (as we have done in the implementation-independent specs), but the fact that they were designed to serve a systems engineering rather than a knowledge engineering purpose leads to some shortcomings. The inability of UML models to effectively convey high-level conceptual models and explicit knowledge semantics in a formal fashion is one of the key reasons the textual specification documents are required in addition to the UML. They not only provide more human-consumable characterizations of what is in the UML but are also needed to explain semantics that cannot effectively be expressed in the UML. The upside is that some of these semantics can now be made explicit in the documents, but only in an informal form that is still open to human interpretation.
What is ultimately needed for the language specs is a way to formally express the full range of language semantics and structure.
I have personally asserted for a long while, and I know many in the community agree, that the long-term solution for specifications of the languages is to define and express them using mechanisms purpose-built to define languages like this, that is, semantic forms of specification such as OWL and RDFS. These forms, while less familiar to many (part of the reason we decided to work from the bottom up), provide a way to clearly, explicitly, unambiguously and formally specify the high-level conceptual model for the languages, directly map it to any number of more detailed conceptual models, and then directly map it to specific syntactic/schematic representations (logical models).
Many members of the community have been eager to begin working at this level, but it was deemed important to first complete the abstraction work to the UML/textual specification level to serve as an XML-bias-free basis for initial semantic modeling. I propose that some of the CTI TC's early work should be focused on these activities. In fact, I would fairly strongly assert that many of the refactoring issues on the table for STIX 2.0 (e.g., abstraction of several embedded structures (relationships, sources, assets, victims, etc.) to separate constructs) will require semantic modeling in order to fully understand and get right. I think the semantic discussions and modeling as part of these activities could serve as some great initial steps towards more formal specifications for the languages. Such specifications would serve not only better integration for each language across abstraction levels (conceptual to logical) but also better alignment and integration with related information representations within the cyber security sphere (MAEC, CAPEC, CVRF, OVAL, OpenIOC, etc.) and outside the cyber security sphere.
So, that was a long contextual way of saying that I strongly agree with the need to understand and specify these languages across the abstraction spectrum (conceptual to lexical) but strongly feel that this should/must be done within the context of each language (i.e., within the STIX and CybOX SCs, with cross-coordination via the TC) rather than as a separate activity.
Sean
On 6/24/15, 11:39 AM, "Jerome Athias" <athiasjerome@gmail.com> wrote: