OASIS Mailing List Archives

cti-cybox message



Subject: Re: [cti-cybox] Recent CybOX Changes


On 7/28/2016 3:56 PM, Kirillov, Ivan A. wrote:
Added File Path Type to Common Object Types (8.1.4.2) so that it can be re-used for file paths as needed in the various CybOX Objects

I'm still not clear on what the value of a separate file-path-type is. The only thing that immediately comes to mind is being able to write patterns that match a specific path component, rather than relying entirely on regex to match substrings in a file path. I'm not sure that is worth the additional complexity.
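
To make the trade-off concrete, here is a minimal sketch in Python. The function names and the sample path are hypothetical and are not part of CybOX; the point is only that a substring-style regex match and a whole-component match can give different answers for the same path.

```python
import re
from pathlib import PureWindowsPath


def regex_match(path: str, name: str) -> bool:
    # Substring-style match: finds `name` anywhere in the raw path string.
    return re.search(re.escape(name), path, re.IGNORECASE) is not None


def component_match(path: str, name: str) -> bool:
    # Component-style match: `name` must equal one whole path component.
    parts = PureWindowsPath(path).parts
    return any(part.lower() == name.lower() for part in parts)


# Hypothetical sample path, used only for illustration.
path = r"C:\Users\alice\AppData\Local\Temporary\evil.exe"

# "Temp" is a substring of "Temporary" but is not a component on its own.
substring_hit = regex_match(path, "Temp")
component_hit = component_match(path, "Temp")
```

Here the regex reports a hit on "Temp" while the component match does not, which is roughly the only benefit I can see from modelling path components separately.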

Based on comments and discussions, removed Object Property Metadata section and instead added String with Encoding Type to Common Object Types (8.1.4.2). This type permits the capture of observed encodings for strings in Objects wherever appropriate (see example: https://docs.google.com/document/d/1DdS-NrVTjGJ3wvCJ7dbSlhYeiaWS6G6dOXu2F3POpUs/edit#heading=h.47ju1z5ea7t). Accordingly, updated the type definitions throughout the CybOX Objects to be an OR between a string and this new type wherever it made sense. We realize that this may complicate parsing (e.g., having to distinguish between strings and objects) and creation of CybOX data so we look forward to your feedback.

I'm concerned about the increased complexity during parsing. Combining "encoding" with "base64_value" is needed when a character sequence cannot be represented in UTF-8, and is *required* in order to determine the "natural" or "native" representation of the string. In contrast, using "encoding" with "value" is really just "ancillary" information ("this is the encoding I saw this string as, before converting it to UTF-8"). In other words, you can safely ignore the "encoding" field in the latter case if you don't care about it, but not in the former case. Combined with needing to distinguish between a string and an object when parsing, this is a lot of additional effort for *every* field where that choice is provided.
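
As a rough sketch of what every consumer would have to do per field (Python; the helper name and the exact JSON shape are my assumptions, only the "value", "base64_value" and "encoding" keys come from the discussion above):

```python
import base64


def read_string_field(field):
    """Return (text, observed_encoding) for a field that may be either a
    plain string or an object with an encoding (assumed shape)."""
    if isinstance(field, str):
        # Plain string: nothing else to look at.
        return field, None
    if isinstance(field, dict):
        encoding = field.get("encoding")
        if "value" in field:
            # "encoding" is ancillary here and can safely be ignored.
            return field["value"], encoding
        if "base64_value" in field:
            # "encoding" is required here to recover the text at all.
            if encoding is None:
                raise ValueError("base64_value requires an encoding")
            raw = base64.b64decode(field["base64_value"])
            return raw.decode(encoding), encoding
    raise ValueError("unsupported string field: %r" % (field,))
```

That is three branches plus error handling for what used to be a single string read, repeated for every property that allows the choice.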

I know this is a tough problem, and it's possible that this is the best solution. But the idea of writing code to support this does not make me excited.

Moved magic_number from File Metadata Extension to base File Object, since it is analogous to mime_type which was already on the base. Accordingly, renamed File Metadata to File Metadata Mismatch and removed redundant has_mismatch field. However, a point was raised about this particular extension, namely that it represents an assertion rather than a “fact” such as a magic number or hash. Accordingly, we need to consider the question of whether such assertions belong in CybOX or not.

As one of the people who raised this point, I'd also like to add that I've never seen magic_number well defined; in my experience it is usually inconsistently specified. There are some resources online, but I don't think we should necessarily incorporate those by reference or require people to learn about well- and less-well-known magic numbers.

Also, the mime_type for various files (as reported by the "file" utility via libmagic) is not necessarily stable between versions. Hopefully it's pretty uniform for common file types, but I'm worried that mime_type is less "fact" and more "assertion by a specific tool at a specific time".

I realize that file extension by itself is easily spoofed, and that noting when an extension doesn't match the file content is incredibly significant in the CTI domain. But I can't think of a better way to capture this information.
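
For illustration only, this is roughly what a mismatch check looks like (Python; the function name and the three-entry signature table are hypothetical, chosen just for this sketch). The result depends entirely on which signatures the tool happens to know about, which is why I think of it as an assertion rather than a fact.

```python
import os

# Tiny, tool-specific signature table (an assumption for this sketch).
SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF-",
    ".zip": b"PK\x03\x04",
}


def extension_mismatch(file_name, header):
    """Return True/False if this table can judge the file, else None."""
    ext = os.path.splitext(file_name)[1].lower()
    expected = SIGNATURES.get(ext)
    if expected is None:
        # Unknown extension: this tool cannot assert anything.
        return None
    # Mismatch when the leading bytes do not start with the signature.
    return not header.startswith(expected)
```

A different tool with a different table could return a different answer for the same file.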

Greg

