In some programming languages, files and directories are the same kind of object (File), and an “is_directory()” style call tells them apart.
In others (e.g., Python), the object is a “path”, and a path can refer to either a directory or a file.
Personally, I’d find either scenario workable. Depending on which choices are made, that could mean an “is_directory” or “is_file” attribute, or maybe path_type=“(file|directory)”. Or maybe something else; I’m not sure.
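For illustration, here is a minimal Python sketch of the "path" style described above, using the standard pathlib module. The path_type() helper and its return strings are hypothetical; they only mirror the path_type=“(file|directory)” idea floated here and are not part of any proposal.

```python
import tempfile
from pathlib import Path


def path_type(p: Path) -> str:
    """Classify a path the way a hypothetical path_type field might."""
    if p.is_dir():
        return "directory"
    if p.is_file():
        return "file"
    return "unknown"  # the path does not exist or is something else


# Build a throwaway directory containing one file, then classify both.
with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    regular_file = directory / "example.txt"
    regular_file.write_text("hello")
    dir_kind = path_type(directory)
    file_kind = path_type(regular_file)

print(dir_kind, file_kind)
```

The same Path object type covers both cases, so the distinction only shows up when the file system is asked. A serialized object has no file system to ask, which is why an explicit attribute may be needed there.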
To offer an opinion on this point directly:
> One concern I have with this is something defined as a File object that contains only path and no file name being considered a Directory vs. having a Directory object.
I think we should be explicit in terms of what fields mean, both in presence and absence. If absence of a filename means the File object characterizes a directory, there should IMO be a sentence calling it out in the specification (though I think an explicit property would be preferable). If a File object can never represent a directory, that should be made clear in some way or other.
Thank you.
-Mark
Ivan,
This seems more than reasonable for representing files in the simplest case.
Is it proper to infer from these field definitions that there would be no file_extension field, since the extension is included in the value of file_name, if there is one?
One concern I have with this is something defined as a File object that contains only path and no file name being considered a Directory vs. having a Directory object.
Paul
We’ve updated the proposal to take into account the new file_name field:
Field | Type | Multiplicity | Description
file_name | string | 0-1 | The name of the file, including its extension (if known) but excluding its path.
file_path | FilePath | 0-1 | The path to the file on the file system, excluding its name and extension. If this field is included without the file_name field, the file object instance specifies a directory.
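As a rough sketch of how the rule in the table could be read by an implementation (not the proposal's actual schema): the FileObject class and its is_directory property below are hypothetical, and the FilePath type is modeled as a plain string purely for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileObject:
    # Both fields are optional, matching the 0-1 multiplicity in the table.
    file_name: Optional[str] = None
    file_path: Optional[str] = None  # assumption: FilePath reduced to a string

    @property
    def is_directory(self) -> bool:
        """True only when a path is present and no file name is given."""
        return self.file_path is not None and self.file_name is None


system_dir = FileObject(file_path="C:\\Windows")
explorer = FileObject(file_name="explorer.exe", file_path="C:\\Windows")
name_only = FileObject(file_name="explorer.exe")

print(system_dir.is_directory, explorer.is_directory, name_only.is_directory)
```

Note that the directory meaning here is inferred from an absent field, which is exactly the implicit behavior questioned elsewhere in this thread.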
Does this seem reasonable?
Thanks,
Ivan
Thanks Jason and John! Trey and I will chat about this and update the proposal accordingly.
Regards,
Ivan
From: Jason Keirstead
Date: Friday, November 20, 2015 at 9:01 AM
To: Ivan Kirillov
Cc: Bret Jordan, "cti-cybox@lists.oasis-open.org", John Anderson, Terry MacDonald
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
I agree with this and IMO it's a good approach.
-
Jason Keirstead
Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security | www.securityintelligence.com
Without data, all you are is just another person with an opinion - Unknown
From: "Kirillov, Ivan A." <ikirillov@mitre.org>
To: "Jordan, Bret" <bret.jordan@bluecoat.com>, Terry MacDonald <terry@soltra.com>
Cc: Jason Keirstead/CanEast/IBM@IBMCA, John Anderson <janderson@soltra.com>, "cti-cybox@lists.oasis-open.org"
<cti-cybox@lists.oasis-open.org>
Date: 11/20/2015 08:49 AM
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
Definitely agree that searching by hash/fuzzy hash is a common and useful practice, and the latter is why SSDEEP is one of the standard values in the HashTypeEnum.
Going back to Jason’s point, tokenizing a file name/path string means that you need to employ some extra logic to understand where the file name actually resides, so that’s well taken. I think we were going back and forth as to whether we should have a separate field for file name vs. file path, and wanted to avoid the headaches associated with the current object :)
However, I can see the utility in patterning and querying around separate path vs. file name fields, especially in terms of having clear semantics. For instance, a pattern such as:
file.file_name.equals("f00bar.dll")
is clearly written against only the file name, whereas the following has the potential of matching against a directory with that name:
file.file_path.contains("f00bar.dll")
Anyhow, if we do go down this road, this is probably the most I’d want to split the file name/path related fields. Also, unlike the current File Object, file_path would ONLY hold the path element, and would not be permitted to encompass the file name as well. Does this seem reasonable?
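To make the semantic difference between the two patterns concrete, here is a small Python sketch. The observation records and the name_equals() and path_contains() helpers are made up for illustration; they are not CybOX pattern syntax.

```python
# Two hypothetical observations: a file named f00bar.dll, and an
# unrelated file sitting inside a directory that is named f00bar.dll.
observations = [
    {"file_name": "f00bar.dll", "file_path": "C:\\Windows\\System32"},
    {"file_name": "notes.txt", "file_path": "C:\\Temp\\f00bar.dll"},
]


def name_equals(obs: dict, value: str) -> bool:
    """Exact match against the file name only."""
    return obs.get("file_name") == value


def path_contains(obs: dict, value: str) -> bool:
    """Substring match against the path only."""
    return value in (obs.get("file_path") or "")


by_name = [o for o in observations if name_equals(o, "f00bar.dll")]
by_path = [o for o in observations if path_contains(o, "f00bar.dll")]

print([o["file_name"] for o in by_name])  # ['f00bar.dll']
print([o["file_name"] for o in by_path])  # ['notes.txt']
```

With separate fields, the name query cannot accidentally pick up the directory, while the path query only ever sees directory components.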
Regards,
Ivan
From: <cti-cybox@lists.oasis-open.org> on behalf of Bret Jordan
Date: Thursday, November 19, 2015 at 5:06 PM
To: Terry MacDonald
Cc: Jason Keirstead, John Anderson, "cti-cybox@lists.oasis-open.org", Ivan Kirillov
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
In our experience, searching by file hash and by fuzzy hash is really valuable. It is also nice when you can say things like "show me fuzzy hashes that are 80% similar."
Thanks,
Bret
Bret Jordan CISSP
Director of Security Architecture and Standards | Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."
On Nov 19, 2015, at 14:57, Terry MacDonald <terry@soltra.com> wrote:
As most malware randomizes names upon each drop, we’ll most likely be searching for it with an MD5/SHA1/SHA256 hash, or even a ‘fuzzy hash’ of some kind (e.g. ssdeep or any other Context Triggered Piecewise Hashing program) rather than File Name. Fuzzy hashes are increasingly important as more malware supports polymorphism.
IMHO File Name will be used but won’t be as useful as hashes.
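A minimal Python sketch of why exact hashes survive randomized names, using only the standard hashlib module. The payload bytes, file names, and the digests() helper are invented for the example. Fuzzy hashing such as ssdeep needs a third-party library and is deliberately not shown here.

```python
import hashlib


def digests(content: bytes) -> dict:
    """Compute the three exact hashes mentioned above for some content."""
    return {
        "MD5": hashlib.md5(content).hexdigest(),
        "SHA1": hashlib.sha1(content).hexdigest(),
        "SHA256": hashlib.sha256(content).hexdigest(),
    }


# The same (made-up) payload dropped twice under different random names.
payload = b"example payload bytes"
drop_a = {"file_name": "a8f3k2.exe", "hashes": digests(payload)}
drop_b = {"file_name": "zz91qx.exe", "hashes": digests(payload)}

# The names differ, but the content hashes are identical.
same_sample = drop_a["hashes"]["SHA256"] == drop_b["hashes"]["SHA256"]
print(drop_a["file_name"] != drop_b["file_name"], same_sample)
```

An exact hash changes completely if even one byte of the content changes, which is the gap that fuzzy hashes are meant to cover for polymorphic samples.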
CTPH original paper for those interested.
http://dfrws.org/2006/proceedings/12-Kornblum.pdf
Cheers
Terry MacDonald
Senior STIX Subject Matter Expert
SOLTRA | An FS-ISAC and DTCC Company
+61 (407) 203 206 | terry@soltra.com
From: cti-cybox@lists.oasis-open.org [mailto:cti-cybox@lists.oasis-open.org] On Behalf Of Jason Keirstead
Sent: Friday, 20 November 2015 7:07 AM
To: John Anderson <janderson@soltra.com>
Cc: cti-cybox@lists.oasis-open.org; Kirillov, Ivan A. <ikirillov@mitre.org>
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
:)
So here is my main point - and maybe I am out on a limb but I don't think I am - when most people search for an IOC using a file object, they will usually be doing it by file name, *not* by absolute path. An IOC is likely to manifest itself at many different paths, and the top-level portions of a path will vary from reporter to reporter. This makes searching by path far less likely than searching by name. So if we accept that assumption, then we should make it possible to do such a query without resorting to globbing or regex (both of which are expensive).
-
Jason Keirstead
From: John Anderson <janderson@soltra.com>
To: Jason Keirstead/CanEast/IBM@IBMCA, "Kirillov, Ivan A." <ikirillov@mitre.org>
Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 11/19/2015 03:55 PM
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
File globs are friendlier than regex.
Some examples: https://github.com/cyberdelia/django-pipeline/issues/208
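As a quick illustration of glob-style matching on file names, here is a sketch using Python's standard fnmatch module. The file names and patterns are made up, and nothing in the proposal says globs would be the chosen pattern language.

```python
from fnmatch import fnmatchcase

# Hypothetical file_name values pulled from a set of observations.
file_names = ["explorer.exe", "f00bar.dll", "notes.txt", "EXPLORER.EXE"]

# fnmatchcase is always case-sensitive, so "*.dll" matches one name here.
dll_names = [n for n in file_names if fnmatchcase(n, "*.dll")]

# Lower-casing first gives a case-insensitive match on any OS.
explorer_names = [n for n in file_names if fnmatchcase(n.lower(), "explorer.*")]

print(dll_names)       # ['f00bar.dll']
print(explorer_names)  # ['explorer.exe', 'EXPLORER.EXE']
```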
From: cti-cybox@lists.oasis-open.org <cti-cybox@lists.oasis-open.org> on behalf of Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Sent: Thursday, November 19, 2015 2:53 PM
To: Kirillov, Ivan A.
Cc: cti-cybox@lists.oasis-open.org
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
- regex searching is extremely expensive over large amounts of data, so we should try to avoid the need for software to do it during design if possible.
- I was more referring to this optional part of the proposal
To make it easier to deal with file names on different operating systems, we believe that it may make sense to have a special type that breaks up the file name/path into a list of delimited components:
FileName
Field | Datatype | Description
delimiter | string | The delimiter used in the file name/path string.
components | list | A list of strings that represent the components of the file name/path string, when split using the delimiter specified in the 'delimiter' field. A value of 'null' at the end of the list specifies a directory.
If on one system my file is at C:\Windows\explorer.exe and on another it is at C:\MyUberBox\Infected\explorer.exe, then on one box the file name is in field "2" and on the other in field "3". This makes it hard to filter on just a file name...
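The indexing problem can be reproduced with a few lines of Python. The to_components() helper is hypothetical and simply splits on the delimiter, as the 'components' field description suggests. Indexes are counted from zero, which matches the "2" and "3" above.

```python
def to_components(path: str, delimiter: str = "\\") -> list:
    """Split a path string into components using the given delimiter."""
    return path.split(delimiter)


first = to_components("C:\\Windows\\explorer.exe")
second = to_components("C:\\MyUberBox\\Infected\\explorer.exe")

# The file name lands at a different position in each list.
first_index = first.index("explorer.exe")    # 2
second_index = second.index("explorer.exe")  # 3

print(first, first_index)
print(second, second_index)
```

A query on a fixed component position therefore cannot find the file name in both records. Taking the last component would work for these two, but only as long as the trailing 'null' that marks a directory is handled separately.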
-
Jason Keirstead
From: "Kirillov, Ivan A." <ikirillov@mitre.org>
To: Jason Keirstead/CanEast/IBM@IBMCA
Cc: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 11/19/2015 03:45 PM
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>
That’s a fair point, Jason – my only counter-argument is that most queries such as these can easily be expressed with a regular expression.
E.g., for "find all observables that <match other params> and are explorer.exe" you’d have:
file_name.regex = "explorer\.exe$"
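For reference, here is how that regular expression behaves in Python's standard re module when applied to combined name/path strings. The candidate strings and the stricter second pattern are assumptions added for illustration, not part of the proposal.

```python
import re

# Hypothetical combined name/path values.
candidates = [
    "C:\\Windows\\explorer.exe",
    "C:\\MyUberBox\\Infected\\explorer.exe",
    "C:\\Temp\\explorer.exe.bak",
    "C:\\Tools\\notexplorer.exe",
]

# The pattern from the message: anchored at the end only.
loose = re.compile(r"explorer\.exe$")
loose_matches = [c for c in candidates if loose.search(c)]

# A stricter variant that also requires a path separator (or the start
# of the string) immediately before the name.
strict = re.compile(r"(^|\\)explorer\.exe$")
strict_matches = [c for c in candidates if strict.search(c)]

print(len(loose_matches), len(strict_matches))  # 3 2
```

The end-anchored pattern also matches notexplorer.exe, so a regex written against a combined string needs care that an exact match on a dedicated file name field would not.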
As for John’s point about file extensions, I’d completely agree that they’re largely superfluous today. It’s also worth noting that our concept of “extensions” has to do with extending the File Object with context/domain-specific data and NOT with “.dll”, “.exe” and so forth. Perhaps we need another name for it :)
Regards,
Ivan
From: Jason Keirstead
Date: Thursday, November 19, 2015 at 2:37 PM
To: Ivan Kirillov
Cc: "cti-cybox@lists.oasis-open.org"
Subject: Re: [cti-cybox] CybOX 3.0: File Object Refactoring
My only comment - and I have not decided where I sit on the fence - is that if you remove the "file extension" and "file name" properties, and consolidate them all into one value called "path", this will make filtering and querying against your data more difficult.
E.g.:
"find all observables that <match other params> and are DLL" or
"find all observables that <match other params> and are explorer.exe"
-
Jason Keirstead
From: "Kirillov, Ivan A." <ikirillov@mitre.org>
To: "cti-cybox@lists.oasis-open.org" <cti-cybox@lists.oasis-open.org>
Date: 11/19/2015 01:20 PM
Subject: [cti-cybox] CybOX 3.0: File Object Refactoring
Sent by: <cti-cybox@lists.oasis-open.org>
All,
As Trey mentioned in his previous email, we’ve been thinking about how to refactor and fix the issues associated with the File Object (and its subclasses). Accordingly, we’ve put together a page that outlines the existing issues and our ideas on addressing them: https://github.com/CybOXProject/schemas/wiki/CybOX-3.0:-File-Object-Refactoring
We’ll be discussing this during today’s call, but we’d love to get your input here (and/or on Slack) as well – generally on your feelings with regards to these changes, but also on:
- Are there any other issues with the File Object and its subclasses that we’re missing?
- Does the concept of domain-specific/context-specific extension points make sense?
  - Are there any other default extensions that we should be adding?
  - Are there any other properties for the default extensions that we should be adding?
Also, we’d like to highlight that we’re still thinking through some of the implications of this approach (how to manage/version/update extensions, etc.), so consider this a living document.
Regards,
Ivan and Trey