Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring


I meant ‘file_type’ as a strawman. I agree there is probably a better name for the concept I was suggesting;  primarily I was trying to discuss is_<something>=True/False vs. an explicit statement of what it is.

-Mark

From: "Barnum, Sean D." <sbarnum@mitre.org>
Date: Tuesday, December 15, 2015 at 10:39 AM
To: Mark Davidson <mdavidson@soltra.com>, "Kirillov, Ivan A." <ikirillov@mitre.org>, John Anderson <janderson@soltra.com>, Jerome Athias <athiasjerome@gmail.com>, Paul Patrick <ppatrick@isightpartners.com>
Cc: Jason Keirstead <Jason.Keirstead@ca.ibm.com>, "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

I am all for being as explicit as we can, though I worry about using a field called “file_type”, both because the term “type” is so overloaded and because “file_type” is very likely to get confused with MIMEType or content_type.

sean

From: <cti-cybox@lists.oasis-open.org> on behalf of Mark Davidson <mdavidson@soltra.com>
Date: Tuesday, December 15, 2015 at 10:35 AM
To: "Kirillov, Ivan A." <ikirillov@mitre.org>, John Anderson <janderson@soltra.com>, Jerome Athias <athiasjerome@gmail.com>, "ppatrick@isightpartners.com" <ppatrick@isightpartners.com>
Cc: Jason Keirstead <Jason.Keirstead@ca.ibm.com>, "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

Totally open ended question: could a file object ever, now or in the future, represent anything other than a “File” or a “Directory”? If so, I would want to consider something like file_type=“(file|directory)” because then the values of “file_type” can be extended for these additional values in the future. I also feel like this notation would be _slightly_ more explicit.

That said, I am 100% supportive of the proposed structure; my comment above is a possible refinement of the structure.
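
The difference between an is_<something> boolean and an explicit value can be sketched in Python. This is only an illustration under stated assumptions: the FileKind enum, its members, and the kind_from_boolean helper are hypothetical names and are not part of the proposal.

```python
from enum import Enum


class FileKind(Enum):
    """Hypothetical value set for a 'file_type'-style property."""
    FILE = "file"
    DIRECTORY = "directory"
    # A further kind could be added here later without introducing
    # another is_<something> boolean property.


def kind_from_boolean(is_directory: bool) -> FileKind:
    """Map the boolean style onto the explicit-value style."""
    return FileKind.DIRECTORY if is_directory else FileKind.FILE


print(kind_from_boolean(True).value)
print(FileKind("file").name)
```

With the boolean style, each new kind needs a new property; with the explicit style, only the set of allowed values grows.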

Thank you.
-Mark

From: <cti-cybox@lists.oasis-open.org> on behalf of "Kirillov, Ivan A." <ikirillov@mitre.org>
Date: Tuesday, December 15, 2015 at 10:23 AM
To: John Anderson <janderson@soltra.com>, Jerome Athias <athiasjerome@gmail.com>, Paul Patrick <ppatrick@isightpartners.com>
Cc: Jason Keirstead <Jason.Keirstead@ca.ibm.com>, "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

+1 for MIMEType as well – I think this would be semantically less ambiguous than “content type”, and so it would be my preference. This would likely be a property that we would add into the default “file metadata” extension; I’ll update the proposal accordingly. There are likely other properties that would fit in here as well – things like entropy. Is there a sense in the community as far as other common file metadata related properties we should be including?

As far as characterizing directories, as mentioned in the writeup below, the current plan is to allow for this through the use of the file_path field without the file_name field. E.g., the following would be a directory:

{ 
  "file_system_properties":{"file_path": {"delimiter":"\\", 
                                          "components":["C:","windows"]}}
}

This goes along with the notion, as Mark pointed out, that files and directories are treated the same in many languages and also operating systems. However, Paul has a good point that this is less explicit than having a separate directory object. We’ve thought about this in the past and the discussion has always come back to the fact that directories are usually analogous to files in most places, just not in Windows. Therefore, perhaps what we can do is: 
  1. Add extensions for directory-specific properties (likely just for Windows)
  2. To make it more explicit that you’re characterizing a directory, add an “is_directory” boolean property
{ 
  "file_system_properties":{"is_directory": true,
                            "file_path": {"delimiter":"\\", 
                                          "components":["C:","windows"]}}
}
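
A consumer could combine the two conventions discussed here roughly as follows. This is a minimal sketch, not an agreed rule: the is_directory helper is hypothetical, and it assumes that file_name, file_path and is_directory all live under "file_system_properties" and that an explicit flag takes precedence over the implicit "file_path without file_name" reading.

```python
def is_directory(file_object: dict) -> bool:
    """Return True if the object appears to characterize a directory.

    Assumed precedence: an explicit is_directory flag wins; otherwise a
    file_path with no file_name is read as a directory.
    """
    props = file_object.get("file_system_properties", {})
    if "is_directory" in props:
        return bool(props["is_directory"])
    return "file_path" in props and "file_name" not in props


implicit_dir = {
    "file_system_properties": {
        "file_path": {"delimiter": "\\", "components": ["C:", "windows"]}
    }
}
explicit_dir = {
    "file_system_properties": {
        "is_directory": True,
        "file_path": {"delimiter": "\\", "components": ["C:", "windows"]},
    }
}

print(is_directory(implicit_dir), is_directory(explicit_dir))
```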
What do you think?

Regards,
Ivan
From: John Anderson <janderson@soltra.com>
Date: Tuesday, December 15, 2015 at 7:09 AM
To: Jerome Athias <athiasjerome@gmail.com>, Paul Patrick <ppatrick@isightpartners.com>
Cc: Jason Keirstead <Jason.Keirstead@ca.ibm.com>, Ivan Kirillov <ikirillov@mitre.org>, Bret Jordan <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

+1 for having an attribute that holds a MIME Type value. (And maybe "mimetype" is the right attribute name.)


Random use-case: An executable that has a ".txt" extension is still executable on Linux, if the right bits are set. If the MIME type is known, then that might make it easier for automated systems to pay attention.
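
To illustrate the use case, here is a small Python sketch comparing an extension-based guess with a check of the leading bytes. The file name and the content bytes are made up for the example; only the four-byte ELF magic number is a real constant, and looks_like_elf is a hypothetical helper, not a full content-type detector.

```python
import mimetypes

ELF_MAGIC = b"\x7fELF"  # first four bytes of an ELF executable


def looks_like_elf(content: bytes) -> bool:
    """Check the leading bytes only; this is not a full file-format parser."""
    return content.startswith(ELF_MAGIC)


file_name = "notes.txt"                 # hypothetical, misleading name
content = ELF_MAGIC + b"\x00" * 12      # fabricated bytes for illustration

# mimetypes.guess_type looks only at the name, never at the content.
guessed_from_name, _ = mimetypes.guess_type(file_name)

print(guessed_from_name)
print(looks_like_elf(content))
```

An explicit MIME type property would let a producer record the second answer rather than leaving consumers with the first.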


JSA


From: Jerome Athias <athiasjerome@gmail.com>
Sent: Tuesday, December 15, 2015 8:28 AM
To: Paul Patrick
Cc: Jason Keirstead; Kirillov, Ivan A.; Jordan, Bret; cti-cybox@lists.oasis-open.org; John Anderson; Terry MacDonald
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
 
MIMEType is used in Malware Metadata Exchange Format (MMDEF), which is used in MAEC
Ref. https://standards.ieee.org/develop/indconn/icsg/mmdef.html


2015-12-15 16:19 GMT+03:00 Paul Patrick <ppatrick@isightpartners.com>:
Interesting thought.  When thinking about this from a malware analysis point of view, having a content_type attribute would actually be very useful.


Paul Patrick


From: <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Date: Tuesday, December 15, 2015 at 8:00 AM
To: Ivan Kirillov <ikirillov@mitre.org>
Cc: "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, John Anderson <janderson@soltra.com>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

I am wondering if people other than myself see a value in including either a content_type or mime_type attribute.

It is not safe (nor always possible) to assume the content type of a file based on its three-letter extension.

By including content_type, we would allow query mechanisms operating over cybox to make a pattern that matches any zip file.
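
As a rough sketch of such a query in Python: the records, the "mime_type" property name, and the match_mime_type helper below are all hypothetical and only stand in for whatever the final property and query mechanism turn out to be.

```python
# Hypothetical observed file objects; the second has a misleading extension.
observed = [
    {"file_name": "report.zip", "mime_type": "application/zip"},
    {"file_name": "invoice.pdf", "mime_type": "application/zip"},
    {"file_name": "readme.txt", "mime_type": "text/plain"},
]


def match_mime_type(objects, mime_type):
    """Return the objects whose mime_type equals the requested value."""
    return [obj for obj in objects if obj.get("mime_type") == mime_type]


zip_files = match_mime_type(observed, "application/zip")
print([obj["file_name"] for obj in zip_files])
```

The match is a plain equality test on one property, and it catches the zip file that an extension-based pattern would miss.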

Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown


From: "Kirillov, Ivan A." <ikirillov@mitre.org>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "Jordan, Bret" <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, "John Anderson" <janderson@soltra.com>, Terry MacDonald <terry@soltra.com>
Date: 12/14/2015 06:26 PM
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>





We’ve updated the proposal to take into account the new file_name field:

Field | Type | Multiplicity | Description
file_name | string | 0-1 | The name of the file, including its extension (if known) but excluding its path.
file_path | FilePath | 0-1 | The path to the file on the file system, excluding its name and extension. If this field is included without the file_name field, the file object instance specifies a directory.

Does this seem reasonable?
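
For illustration, a producer might populate the two fields from a full path along these lines. This is a sketch only: split_full_path is a hypothetical helper, and it assumes the last delimited component is always the file name, which would not hold for a path that names a directory.

```python
def split_full_path(full_path: str, delimiter: str = "\\") -> dict:
    """Split a full path into a file_name string and a FilePath-style dict.

    Assumes the final component is a file name (not a directory).
    """
    components = full_path.split(delimiter)
    return {
        "file_name": components[-1],
        "file_path": {"delimiter": delimiter, "components": components[:-1]},
    }


obj = split_full_path(r"C:\Windows\explorer.exe")
print(obj["file_name"])
print(obj["file_path"]["components"])
```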

Thanks,
Ivan

From: <cti-cybox@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
Date: Friday, November 20, 2015 at 7:38 AM
To: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Cc: Bret Jordan <bret.jordan@bluecoat.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>, John Anderson <janderson@soltra.com>, Terry MacDonald <terry@soltra.com>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

Thanks Jason and John! Trey and I will chat about this and update the proposal accordingly.

Regards,
Ivan

From: Jason Keirstead
Date: Friday, November 20, 2015 at 9:01 AM
To: Ivan Kirillov
Cc: Bret Jordan, "cti-cybox@lists.oasis-open.org", John Anderson, Terry MacDonald
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

I agree with this and IMO it's a good approach.

-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com

Without data, all you are is just another person with an opinion - Unknown


From: "Kirillov, Ivan A." <ikirillov@mitre.org>
To: "Jordan, Bret" <bret.jordan@bluecoat.com>, Terry MacDonald <terry@soltra.com>
Cc: Jason Keirstead/CanEast/IBM@IBMCA, John Anderson <janderson@soltra.com>, "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 11/20/2015 08:49 AM
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring





Definitely agree that searching by hash/fuzzy hash is a common and useful practice, and the latter is why SSDEEP is one of the standard values in the HashTypeEnum.

Going back to Jason’s point, tokenizing a file name/path string means that you need to employ some extra logic to understand where the file name actually resides, so that’s well taken. I think we were going back and forth as to whether we should have a separate field for file name vs. file path, and wanted to avoid the headaches associated with the current object :)

However, I can see the utility in patterning and querying around separate path vs. file name fields, especially in terms of having clear semantics. For instance, a pattern such as:

file.file_name.equals("f00bar.dll")

is clearly written against only the file name, whereas the following has the potential of matching against a directory with that name:

file.file_path.contains("f00bar.dll")

Anyhow, if we do go down this road, this is probably the most I’d want to split the file name/path related fields. Also, unlike the current File Object, file_path would ONLY hold the path element, and would not be permitted to encompass the file name as well. Does this seem reasonable?
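
The difference between the two patterns can be shown with a small Python sketch. The records are made up, and for brevity file_path is represented here as a plain string rather than the FilePath structure from the proposal; both are assumptions of the example.

```python
# Hypothetical records: the second is a file inside a directory that
# happens to be named "f00bar.dll".
records = [
    {"file_name": "f00bar.dll", "file_path": r"C:\Windows\System32"},
    {"file_name": "readme.txt", "file_path": r"C:\Temp\f00bar.dll"},
]

# Equivalent of file.file_name.equals("f00bar.dll")
name_hits = [r for r in records if r["file_name"] == "f00bar.dll"]

# Equivalent of file.file_path.contains("f00bar.dll")
path_hits = [r for r in records if "f00bar.dll" in r["file_path"]]

print([r["file_name"] for r in name_hits])
print([r["file_name"] for r in path_hits])
```

Only the first query is guaranteed to be about the file's own name; the second returns readme.txt because of its parent directory.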

Regards,
Ivan

From: <cti-cybox@lists.oasis-open.org> on behalf of Bret Jordan
Date: Thursday, November 19, 2015 at 5:06 PM
To: Terry MacDonald
Cc: Jason Keirstead, John Anderson, "cti-cybox@lists.oasis-open.org", Ivan Kirillov
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

In our experience, searching by file hash and by fuzzy hash is really valuable. It is also nice when you can say things like "show me fuzzy hashes that are 80% similar".



Thanks,


Bret




Bret Jordan CISSP

Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."
          On Nov 19, 2015, at 14:57, Terry MacDonald <terry@soltra.com> wrote:

          As most malware randomizes names upon each drop, we’ll most likely be searching for it with an MD5/SHA1/SHA256 hash, or even a ‘fuzzy hash’ of some kind (e.g. ssdeep or any other Context Triggered Piecewise Hashing program) rather than File Name. Fuzzy hashes are increasingly important as more malware supports polymorphism.


          IMHO File Name will be used but won’t be as useful as hashes.
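
To make the point about randomized names concrete, here is a minimal Python sketch using the standard hashlib module. The two dropped files and their contents are invented for the example, and only exact cryptographic hashes are shown; fuzzy hashing (ssdeep/CTPH) is not covered because it needs a third-party library.

```python
import hashlib


def file_hashes(content: bytes) -> dict:
    """Compute the exact-match hashes mentioned above for some file content."""
    return {
        "MD5": hashlib.md5(content).hexdigest(),
        "SHA1": hashlib.sha1(content).hexdigest(),
        "SHA256": hashlib.sha256(content).hexdigest(),
    }


# Hypothetical: the same payload dropped under two random names.
drop_a = {"file_name": "a8f3k.exe", "content": b"same payload bytes"}
drop_b = {"file_name": "zq91x.exe", "content": b"same payload bytes"}

names_match = drop_a["file_name"] == drop_b["file_name"]
hashes_match = file_hashes(drop_a["content"]) == file_hashes(drop_b["content"])
print(names_match, hashes_match)
```

A single changed byte would break the exact hashes as well, which is where the similarity-based fuzzy hashes come in.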


          CTPH original paper for those interested.

          http://dfrws.org/2006/proceedings/12-Kornblum.pdf

          Cheers


          Terry MacDonald

          Senior STIX Subject Matter Expert

          SOLTRA
          | An FS-ISAC and DTCC Company
          +61 (407) 203 206 |
          terry@soltra.com


          From: cti-cybox@lists.oasis-open.org [mailto:cti-cybox@lists.oasis-open.org] On Behalf Of Jason Keirstead
          Sent: Friday, 20 November 2015 7:07 AM
          To: John Anderson <janderson@soltra.com>
          Cc: cti-cybox@lists.oasis-open.org; Kirillov, Ivan A. <ikirillov@mitre.org>
          Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

          :)

          So here is my main point - and maybe I am out on a limb but I don't think I am - when most people search for an IOC using a file object, they will usually be doing it by file name, *not* by absolute path, because an IOC is likely able to manifest itself at many different paths - not to mention the various top-level portions of a path that would vary from reporter to reporter. This makes searching by path far less likely than searching by name. So if we accept that assumption, then we should make it possible to do such a query without resorting to globbing or regex (both of which are expensive).
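
One way to picture a name query that needs neither globbing nor regex is a simple lookup table keyed on a separate file_name field. The sketch below is an assumption-laden illustration: the records, the build_name_index helper, and the decision to lowercase names are all hypothetical choices, not part of the proposal.

```python
from collections import defaultdict


def build_name_index(objects):
    """Index file objects by lowercased file_name for direct lookup."""
    index = defaultdict(list)
    for obj in objects:
        name = obj.get("file_name")
        if name is not None:
            index[name.lower()].append(obj)
    return index


# Hypothetical sightings of the same file name at different paths.
sightings = [
    {"file_name": "explorer.exe", "file_path": r"C:\Windows"},
    {"file_name": "Explorer.exe", "file_path": r"C:\MyUberBox\Infected"},
    {"file_name": "notepad.exe", "file_path": r"C:\Windows"},
]

index = build_name_index(sightings)
hits = index["explorer.exe"]
print(len(hits))
```

Once the index is built, each name query is a dictionary lookup rather than a scan of every path string.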

          -
          Jason Keirstead
          Product Architect, Security Intelligence, IBM Security Systems

          www.ibm.com/security | www.securityintelligence.com

          Without data, all you are is just another person with an opinion - Unknown


          From: John Anderson <janderson@soltra.com>
          To: Jason Keirstead/CanEast/IBM@IBMCA, "Kirillov, Ivan A." <ikirillov@mitre.org>
          Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
          Date: 11/19/2015 03:55 PM
          Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring






          File globs are friendlier than regex.

          Some examples:
          https://github.com/cyberdelia/django-pipeline/issues/208




          From: cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
          Sent: Thursday, November 19, 2015 2:53 PM
          To: Kirillov, Ivan A.
          Cc: cti-cybox@lists.oasis-open.org
          Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

          - regex searching is extremely expensive over large amounts of data, so we should try to avoid the need for software to do it during design if possible.

          - I was more referring to this optional part of the proposal
                          To make it easier to deal with file names on different operating systems, we believe that it may make sense to have a special type that breaks up the file name/path into a list of delimited components:
                          FileName
                          Field | Datatype | Description
                          delimiter | string | The delimiter used in the file name/path string.
                          components | list | A list of strings that represent the components of the file name/path string, when split using the delimiter specified in the 'delimiter' field. A value of 'null' at the end of the list specifies a directory.
          If on one system my file is at C:\Windows\explorer.exe and on another it is C:\MyUberBox\Infected\explorer.exe, then on one box the file name is in field "2" and the other in field "3".. this makes it hard to filter on just a file name...
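
The position problem can be reproduced in a few lines of Python, using the two example paths from this message. The name_position helper is hypothetical, and positions are counted from zero.

```python
DELIMITER = "\\"


def name_position(full_path: str, file_name: str) -> int:
    """Return the zero-based index of file_name in the split components."""
    return full_path.split(DELIMITER).index(file_name)


paths = [
    r"C:\Windows\explorer.exe",
    r"C:\MyUberBox\Infected\explorer.exe",
]

positions = [name_position(p, "explorer.exe") for p in paths]
print(positions)
```

Because the index differs from path to path, a filter cannot address the file name by a fixed position in the components list.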



          -
          Jason Keirstead
          Product Architect, Security Intelligence, IBM Security Systems

          www.ibm.com/security | www.securityintelligence.com

          Without data, all you are is just another person with an opinion - Unknown


          From: "Kirillov, Ivan A." <ikirillov@mitre.org>
          To: Jason Keirstead/CanEast/IBM@IBMCA
          Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
          Date: 11/19/2015 03:45 PM
          Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
          Sent by: <cti-cybox@lists.oasis-open.org>





          That’s a fair point, Jason – my only counter-argument is that most queries such as these can easily be expressed with a regular expression.

          E.g., for "find all observables that <match other params> and are explorer.exe" you’d have:

          file_name.regex = "explorer\.exe$"
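
For reference, the same pattern can be tried with Python's re module. The candidate strings are invented for the example, and matching against full path strings is an assumption about how a single consolidated field would be queried.

```python
import re

# Same pattern as above: a literal "explorer.exe" at the end of the string.
pattern = re.compile(r"explorer\.exe$")

candidates = [
    r"C:\Windows\explorer.exe",
    r"C:\MyUberBox\Infected\explorer.exe",
    r"C:\Windows\explorer.exe.bak",   # does not end with the name
    r"C:\Windows\explorerXexe",       # the escaped dot must be a literal "."
]

matches = [c for c in candidates if pattern.search(c)]
print(matches)
```

Note that nothing anchors the start of the file name, so this pattern as written would also match a name such as myexplorer.exe.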

          As far as John’s point about file extensions, I’d completely agree that they’re largely superfluous today. It’s also worth noting that our concept of “extensions” has to do with extending the File Object with context/domain-specific data and NOT with regards to “.dll”, “.exe” and so forth. Perhaps we need another name for it :)

          Regards,
          Ivan


          From: Jason Keirstead
          Date: Thursday, November 19, 2015 at 2:37 PM
          To: Ivan Kirillov
          Cc: "cti-cybox@lists.oasis-open.org"
          Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring

          My only comment - and I have not decided where I sit on the fence - is that if you remove "file extension" and "file name" properties, and consolidate them all into one value called "path", this will make filtering and QUERY more difficult against your data.

          IE

          "find all observables that <match other params> and are DLL" or
          "find all observables that <match other params> and are explorer.exe"




          -
          Jason Keirstead
          Product Architect, Security Intelligence, IBM Security Systems

          www.ibm.com/security | www.securityintelligence.com

          Without data, all you are is just another person with an opinion - Unknown


          From: "Kirillov, Ivan A." <ikirillov@mitre.org>
          To: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
          Date: 11/19/2015 01:20 PM
          Subject: [cti-cybox] CybOX 3.0: File Object Refactoring
          Sent by: <cti-cybox@lists.oasis-open.org>





          All,

          As Trey mentioned in his previous email, we’ve been thinking about how to refactor and fix the issues associated with the File Object (and its subclasses). Accordingly, we’ve put together a page that outlines the existing issues and our ideas on addressing them:
          https://github.com/CybOXProject/schemas/wiki/CybOX-3.0:-File-Object-Refactoring

          We’ll be discussing this during today’s call, but we’d love to get your input here (and/or on Slack) as well – generally on your feelings with regards to these changes, but also on:
          • Are there any other issues with the File Object and its subclasses that we’re missing?
          • Does the concept of domain-specific/context-specific extension points make sense?
              · Are there any other default extensions that we should be adding?
              · Are there any other properties for the default extensions that we should be adding?
          Also, we’d like to highlight that we’re still thinking through some of the implications of this approach (how to manage/version/update extensions, etc.), so consider this a living document.

          Regards,
          Ivan and Trey


          ---------------------------------------------------------------------
          To unsubscribe from this mail list, you must leave the OASIS TC that
          generates this mail. Follow this link to all your TCs in OASIS at:

          https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php