OASIS Mailing List Archives

 



cti-cybox message



Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring


>I need to read up more on JSON-LD myself, but I think Jason is largely correct, in that we’ll have a data model and a serialization of it that includes the corresponding schemas (likely JSON). 


Yes, this is the plan.

sean

From: <cti-cybox@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
Date: Tuesday, November 3, 2015 at 7:11 AM
To: Jason Keirstead <Jason.Keirstead@ca.ibm.com>, John Anderson <janderson@soltra.com>
Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring

I need to read up more on JSON-LD myself, but I think Jason is largely correct, in that we’ll have a data model and a serialization of it that includes the corresponding schemas (likely JSON). 

Going back to the original discussion, I think the broad questions are:

  1. Do we need a controlled vocabulary around hashing algorithms?
  2. How should non-standard/esoteric hashes be captured?

My thoughts are:

  1. No. Controlled vocabularies make the most sense when there is no expectation that what they’re capturing is complete and/or there is a large need for it to be customized by content producers. I don’t think this is the case with cryptographic hashing algorithms; they’re largely stable and standardized.
  2. I’m a fan of key value pairs for their simplicity, so I think having either a separate type or separate field for the custom hash name (as in my proposal below) is how I would approach it:

{
  "file": {
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "type": "md5"
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "custom_type": "superhash" # A "custom" hash type.
      }
    ]
  }
}
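
As a rough illustration of the "type" versus "custom_type" split above, here is a minimal Python sketch of how a consumer might check each entry. The enumeration KNOWN_HASH_TYPES and the function check_hash_entry are hypothetical names, not part of any CybOX specification, and the rule that exactly one of the two fields must be present is an assumption made for this sketch.

```python
# Hypothetical enumeration; the thread only names md5, sha1 and sha256 as examples.
KNOWN_HASH_TYPES = {"md5", "sha1", "sha256"}


def check_hash_entry(entry):
    """Return a list of problems found in one entry of the "hashes" array."""
    problems = []
    if "hash" not in entry:
        problems.append("missing 'hash'")
    has_type = "type" in entry
    has_custom = "custom_type" in entry
    # Assumption for this sketch: exactly one of the two fields is allowed.
    if has_type == has_custom:
        problems.append("exactly one of 'type' or 'custom_type' is required")
    elif has_type and entry["type"] not in KNOWN_HASH_TYPES:
        problems.append("unknown 'type': %r" % entry["type"])
    return problems


hashes = [
    {"hash": "3773a88f65a5e780c8dff9cdc3a056f3", "type": "md5"},
    {"hash": "f49125dac3:352bb35ffrca2:a123dc4599245", "custom_type": "superhash"},
]
results = [check_hash_entry(h) for h in hashes]
```

Both entries from the example pass, while an entry that puts "superhash" in the enumerated "type" field would be flagged.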

Regards,
Ivan

From: Jason Keirstead
Date: Monday, November 2, 2015 at 1:04 PM
To: John Anderson
Cc: "cti-cybox@lists.oasis-open.org", Ivan Kirillov
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring

This is not the same thing.... see JSON-Schema for how to validate against a JSON schema (http://json-schema.org/example2.html).

Namely, you define your schema in a different JSON document. That document can be used to validate any other document. Type information in the content messages is not necessary for validation against a schema; in fact, it's superfluous, as the schema defines the type.
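
To make the point concrete, here is a minimal Python sketch in which the schema lives in its own JSON document and the message carries no type annotations. The schema content (SCHEMA_DOC) is hypothetical, and validate is a hand-rolled check covering only the "type", "required", "properties" and "enum" keywords; it is not a JSON Schema implementation, and a full validator should be used in practice.

```python
import json

# Hypothetical schema, kept in its own JSON document, separate from any message.
SCHEMA_DOC = """
{
  "type": "object",
  "required": ["hash", "type"],
  "properties": {
    "hash": {"type": "string"},
    "type": {"type": "string", "enum": ["md5", "sha1", "sha256"]}
  }
}
"""


def validate(instance, schema):
    """Check an instance against a very small subset of JSON Schema keywords."""
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return ["expected an object"]
    if schema.get("type") == "string" and not isinstance(instance, str):
        return ["expected a string"]
    errors = []
    if "enum" in schema and instance not in schema["enum"]:
        errors.append("value %r not in enum" % (instance,))
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append("missing required property %r" % name)
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema))
    return errors


schema = json.loads(SCHEMA_DOC)
message = json.loads('{"hash": "3773a88f65a5e780c8dff9cdc3a056f3", "type": "md5"}')
errors_ok = validate(message, schema)
errors_bad = validate({"hash": "abc", "type": "superhash"}, schema)
```

The message itself contains only data; everything about expected types comes from the separate schema document.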

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown



From: John Anderson <janderson@soltra.com>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, "Kirillov, Ivan A." <ikirillov@mitre.org>
Date: 2015/11/02 01:54 PM
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>





"All a developer would do with that @type information at the top level and file level is throw it away, so it is extra bytes that are not required in the message." That sounds like throwing away all the XML namespace info, too. Sure, you don't need it...if you're not trying to validate against a published schema.

Is that what you mean?





From: cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Sent: Monday, November 2, 2015 12:50 PM
To: John Anderson
Cc: cti-cybox@lists.oasis-open.org; Kirillov, Ivan A.
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring

I've been using JSON extensively for a very long time, but I don't know anything about JSON-LD, and will not pretend to.

All I am saying is, having superfluous information in the message should be strongly discouraged; we need to try to be as terse as possible. A big reason for the move to JSON in the first place was to reduce the message overhead associated with XML.

All a developer would do with that @type information at the top level and file level is throw it away, so it is extra bytes that are not required in the message. Inside the "hash" level obviously it is required and has meaning, and should be present there.

The root concept is - if the type of an attribute is defined in the specification then there is no reason to have it as part of the message.

-
Jason Keirstead



From: John Anderson <janderson@soltra.com>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, "Kirillov, Ivan A." <ikirillov@mitre.org>
Date: 2015/11/02 01:45 PM
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>





So, that's the question: How does JSON-LD extend vocabularies, and how does that affect the JSON representation? How would you express the idea of a custom algorithm hash, Jason?




From: cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Sent: Monday, November 2, 2015 12:36 PM
To: John Anderson
Cc: cti-cybox@lists.oasis-open.org; Kirillov, Ivan A.
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring

I would think the schema will be defined as part of the Cybox 3.0 specification itself, will it not?

The schema cannot change once defined without re-versioning the standard. When someone parses that "file" attribute, they will always expect the exact same data structures beneath it.
-
Jason Keirstead




From: John Anderson <janderson@soltra.com>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "Kirillov, Ivan A." <ikirillov@mitre.org>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 2015/11/02 01:28 PM
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring





Thanks, Jason. I think I understand. Are you saying that the "@context" will define the schema, and therefore the schema for all sub-items as well?

If so, then "superhash" would be an extension to the hash "algorithm" (renamed) vocabulary, courtesy of the "mycybox++" context.

That would simplify the JSON to this:


{
  "@context": "http://cybox.example.com/mycybox++",
  "@type": "Observable",
  "file": {
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "algorithm": "md5" # default type defined in CybOX
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "algorithm": "superhash" # new type from my cybox++
      }
    ]
  }
}


Two observations:
1. A context (aka "schema") would be able to extend the vocabulary.
2. Users who want to use an extended vocabulary would have to create (and share!) a new context, if they want others to understand their objects.
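
The two observations can be modeled with a small Python sketch. It does not perform any JSON-LD processing; it only illustrates the idea of a shared context adding names to a base vocabulary. BASE_ALGORITHMS, CONTEXT_EXTENSIONS and both functions are hypothetical, and the base list is an assumed stand-in for whatever CybOX would define.

```python
# Assumed stand-in for the default algorithm vocabulary defined by CybOX.
BASE_ALGORITHMS = frozenset({"md5", "sha1", "sha256"})

# Hypothetical registry of shared contexts: context URL -> names it adds.
CONTEXT_EXTENSIONS = {
    "http://cybox.example.com/mycybox++": frozenset({"superhash"}),
}


def allowed_algorithms(context_url):
    """Base vocabulary plus whatever the (known) context contributes."""
    return BASE_ALGORITHMS | CONTEXT_EXTENSIONS.get(context_url, frozenset())


def unknown_algorithms(document):
    """List algorithm names the consumer cannot interpret for this document."""
    allowed = allowed_algorithms(document.get("@context"))
    return [
        h["algorithm"]
        for h in document["file"]["hashes"]
        if h["algorithm"] not in allowed
    ]


hashes = [
    {"hash": "3773a88f65a5e780c8dff9cdc3a056f3", "algorithm": "md5"},
    {"hash": "f49125dac3:352bb35ffrca2:a123dc4599245", "algorithm": "superhash"},
]
with_context = {
    "@context": "http://cybox.example.com/mycybox++",
    "file": {"hashes": hashes},
}
without_context = {"file": {"hashes": hashes}}
```

A consumer that knows the shared context understands "superhash"; one that has never seen that context does not, which is the sharing requirement in the second observation.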

How is this different from our current situation with custom vocabularies in XML?
JSA



From: cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Sent: Monday, November 2, 2015 11:51 AM
To: John Anderson
Cc: Kirillov, Ivan A.; cti-cybox@lists.oasis-open.org
Subject: Re: [cti-cybox] Re: CybOX 3.0: HashType Refactoring

Most of those @type sections seem totally superfluous to me.

I.e., I know the object associated with the "file" attribute will be a File type. I do not need you to tell me this.

-
Jason Keirstead




From: John Anderson <janderson@soltra.com>
To: "Kirillov, Ivan A." <ikirillov@mitre.org>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 2015/11/02 12:07 PM
Subject: [cti-cybox] Re: CybOX 3.0: HashType Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>





Ivan,
Could some ideas from JSON-LD help us here?

Disclaimer: I'm not sure JSON-LD allows embedding objects like this or extending a context, like I've done.

Also, there's a "@vocab" thing in JSON-LD. But once we start using vocabularies, we're heading down the road toward Ontologically-Correct Disunity (OCD).

{
  "@context": "http://cybox.example.com/mycybox++",
  "@type": "Observable",
  "file": {
    "@type": "File",
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "@type": "md5" # default type defined in CybOX
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "@type": "superhash" # new type from my cybox++
      }
    ]
  }
}

JSA



From: Kirillov, Ivan A. <ikirillov@mitre.org>
Sent: Monday, November 2, 2015 10:54 AM
To: John Anderson; cti-cybox@lists.oasis-open.org
Subject: Re: CybOX 3.0: HashType Refactoring

It makes sense, and I can definitely see the parallels to the IP Address refactoring :)

My main concern is that if the “type” field is intended to capture a set of default hash types and also support custom values, then it will likely need to use a controlled vocabulary, which gets us back to the original HashType implementation and its corresponding complexity:


{
  "file": {
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "type": {"vocabulary": "HashNameVocab-1.0", "value": "md5"}
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "type": "superhash" # A "custom" hash type.
      }
    ]
  }
}


A possible middle ground is to have the “type” field set to a hard-coded enumeration (with values of “md5”, “sha1”, “sha256” etc.), and have a separate “custom_type” field for custom hash values. This negates the need for a controlled vocabulary driven approach, and thus would still be simpler. I think “custom_type” or “type” would always have to be specified though, as you can’t reliably infer the type of hash from a particular value (although you can make educated guesses – if the value is 16 bytes in length, odds are it’s MD5):


{
  "file": {
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "type": "md5"
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "custom_type": "superhash" # A "custom" hash type.
      }
    ]
  }
}
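
The "educated guess" mentioned above can be sketched in Python as follows. A 16-byte digest is 32 hexadecimal characters, so the lookup is keyed on hex length. The function name guess_hash_type and the three-entry table are hypothetical, and the result is only a guess: other algorithms produce digests of the same length (MD4 is also 128 bits, for example), which is why an explicit "type" or "custom_type" is still needed.

```python
import string

# Hex-digit lengths of the digests for the three algorithms named in the thread.
HEX_LENGTH_TO_TYPE = {32: "md5", 40: "sha1", 64: "sha256"}


def guess_hash_type(value):
    """Return a likely hash type for a hex digest, or None if no guess applies."""
    # Anything that is not plain hexadecimal (such as the "superhash" value
    # with its colons) gets no guess at all.
    if not value or any(c not in string.hexdigits for c in value):
        return None
    return HEX_LENGTH_TO_TYPE.get(len(value))


md5_guess = guess_hash_type("3773a88f65a5e780c8dff9cdc3a056f3")
custom_guess = guess_hash_type("f49125dac3:352bb35ffrca2:a123dc4599245")
```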


What do you think?

Regards,
Ivan


From: John Anderson
Date: Monday, November 2, 2015 at 10:19 AM
To: Ivan Kirillov, "cti-cybox@lists.oasis-open.org"
Subject: Re: CybOX 3.0: HashType Refactoring

This Hash refactoring seems to parallel the IP Address refactoring. Would it make sense to treat hashes the same way we treat IP Addresses?
By applying that idea to the example on the page, we get something like this:


{
  "file": {
    "hashes": [
      {
        "hash": "3773a88f65a5e780c8dff9cdc3a056f3",
        "type": "md5"
      },
      {
        "hash": "f49125dac3:352bb35ffrca2:a123dc4599245",
        "type": "superhash" # A "custom" hash type.
      },
      {
        "hash": "12343773a88f65a5e780c8dff9cdc3a0"
        # Default is "md5", if it's not specified.
      }
    ]
  }
}
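
A consumer of this form would have to fill in the default itself. A minimal Python sketch, assuming the "md5" default from the example above (DEFAULT_HASH_TYPE and with_default_type are hypothetical names):

```python
DEFAULT_HASH_TYPE = "md5"  # default proposed in the example above


def with_default_type(entry):
    """Return a copy of a hash entry with "type" filled in when it is absent."""
    normalized = dict(entry)  # copy, so the parsed message is left untouched
    normalized.setdefault("type", DEFAULT_HASH_TYPE)
    return normalized


hashes = [
    {"hash": "3773a88f65a5e780c8dff9cdc3a056f3", "type": "md5"},
    {"hash": "f49125dac3:352bb35ffrca2:a123dc4599245", "type": "superhash"},
    {"hash": "12343773a88f65a5e780c8dff9cdc3a0"},
]
normalized = [with_default_type(h) for h in hashes]
```

After normalization every entry carries an explicit "type", so downstream code does not need to know about the default.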


Whadayathink?
JSA



From:
cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Kirillov, Ivan A. <ikirillov@mitre.org>
Sent: Monday, November 2, 2015 10:07 AM
To: cti-cybox@lists.oasis-open.org
Subject: [cti-cybox] CybOX 3.0: HashType Refactoring

All,

As I mentioned on last week’s call, we’ve got another proposal related to CybOX 3.0 to get your feedback on:
https://github.com/CybOXProject/schemas/wiki/CybOX-3.0:-HashType-Refactoring


This one is around refactoring the way hashes (especially common ones like MD5 and SHA1) are currently captured. Accordingly, we’d love to get your general thoughts on the proposal as well as on the related questions:
  1. Does it make sense to have two disparate types for capturing hashes in CybOX, one for more common hashes and one for esoteric/custom hashes?
  2. As far as the list of hashes in the new HashesType – are there any that are missing? Are there any that should be pruned?
  3. Are there any fields that should be added to the new CustomHashType?

Regards,
Ivan and Trey










