
cti-cybox message



Subject: Re: [cti-cybox] Recent CybOX Changes


Thanks for the feedback, all. 

As for John’s proposal on encoding: it looks like JSON does preserve ordering in lists, so that may be a workable solution, though I wonder if the same would be true for other serializations (not that it’s a big concern now). We also considered adding text to the dictionary fields to state that you can also specify encoding, though it seemed like it might be a lot of normative text to write for a field. We similarly thought about having the “_enc” approach be implicit and documented in the core spec rather than in the object tables, but this also seemed to us to be more difficult for an implementer to interpret, and our general approach has been to favor explicit over implicit specifications. 
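
For illustration, here is a minimal Python sketch of how a consumer might rely on list ordering under the “_enc” proposal. It assumes the key names from John’s example quoted below (`paths`, `paths_enc`); the helper name `pair_with_encodings` is hypothetical and not part of any CybOX spec.

```python
import json

# Hypothetical document shaped like the "_enc" example in this thread.
doc = json.loads(
    '{"paths": ["something", "here"], "paths_enc": ["utf8", "utf16"]}'
)

def pair_with_encodings(obj, key):
    """Pair each string in obj[key] with the encoding at the same index
    in obj[key + "_enc"]. Relies on array order surviving serialization."""
    values = obj[key]
    encodings = obj.get(key + "_enc")
    if encodings is None:
        # No encoding information was provided for this field.
        return [(value, None) for value in values]
    if len(encodings) != len(values):
        raise ValueError("'%s' and '%s_enc' differ in length" % (key, key))
    return list(zip(values, encodings))

pairs = pair_with_encodings(doc, "paths")
```

A serialization that did not guarantee list order would break the index-based pairing, which is the concern raised above.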

One other thing worth considering is that the “_enc” approach means that we need to have an a priori understanding of the fields that we want to capture observed encoding for.  Using the string-with-encoding-type anywhere there’s a potential need for capturing observed encoding makes this process much simpler, and means that there’s less of a chance that we need to add a corresponding “_enc” field in the future. 

Also, Trey, Allan, and I briefly talked about this earlier today, and we thought it might be better to remove the “OR” between string and string-with-encoding-type in those instances where we thought observed encoding could be captured, and instead just force those fields to use string-with-encoding-type. This means that you’ll always specify and parse those fields as an object, leading to fewer branches in your code. We’ve made this change in the CybOX Objects – let us know what you think (is it an improvement?). 
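
As a rough sketch of the difference for a consumer, the Python below contrasts the two parsing styles. The `value` key follows Greg’s snippet quoted below; the `encoding` key name and both function names are assumptions made only for this example.

```python
def read_string_or_object(field):
    """String OR string-with-encoding-type: the consumer must branch."""
    if isinstance(field, dict):
        return field["value"]
    return field

def read_object_only(field):
    """Always string-with-encoding-type: a single code path."""
    return field["value"]

# Hypothetical field values; "encoding" is an assumed key name.
plain = "foo.dll"
with_encoding = {"value": "foo.dll", "encoding": "utf16"}
without_encoding = {"value": "foo.dll"}

name_a = read_string_or_object(plain)
name_b = read_string_or_object(with_encoding)
name_c = read_object_only(without_encoding)
```

The trade-off is that producers who have no encoding to report still have to wrap the string in an object.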

As for file path, are there other thoughts on the preferable approach (single string vs. list of path component strings)? Trey and I are a little hesitant to change our current implementation because it seemed like we had consensus on this when we were creating and reviewing the File Object. That said, if the group feels that a single string is enough to represent file paths, we’ll be happy to go in that direction.
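
To make the comparison concrete, here is a small Python sketch of Greg’s “a file named child.txt in a directory named parent” case under both representations. This is plain Python, not CybOX patterning syntax, and the function names are hypothetical.

```python
import re

def matches_components(components):
    """Path as a list of component strings: compare the last two entries."""
    return components[-2:] == ["parent", "child.txt"]

# Path as a single string: anchor at the end and accept either delimiter.
TAIL = re.compile(r"(?:^|[\\/])parent[\\/]child\.txt$")

def matches_string(path):
    return TAIL.search(path) is not None

list_hit = matches_components(["some", "arbitrary", "parent", "child.txt"])
posix_hit = matches_string("/some/arbitrary/parent/child.txt")
windows_hit = matches_string("C:\\some\\parent\\child.txt")
near_miss = matches_string("/some/grandparent/child.txt")
```

The list form needs no knowledge of the delimiter, while the single-string form has to encode both delimiters in the regular expression.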

Regards,
Ivan

On 7/29/16, 5:00 PM, "cti-cybox@lists.oasis-open.org on behalf of Allan Thomson" <cti-cybox@lists.oasis-open.org on behalf of athomson@lookingglasscyber.com> wrote:

>Maybe I’m missing something on this conversation or it has been proposed before and dismissed as problematic but…
>
>Why not
>
>{
>    “paths”: { “utf8”, [ “something”, “here” ] }
>}
>
>so the paths attribute/object includes the encoding of the contained sub-object within itself. That way when you read the paths object you get the encoding immediately instead of having to search the rest of the JSON attributes looking for an encoding attribute.
>
>Allan
>
>
>On 7/29/16, 3:23 PM, "cti-cybox@lists.oasis-open.org on behalf of Wunder, John A." <cti-cybox@lists.oasis-open.org on behalf of jwunder@mitre.org> wrote:
>
>    I agree w/ Greg on both these items. In general I feel like the simpler the CybOX objects can be the better, even if it means not re-using some things.
>
>    On encoding specifically…it’s not clear to me why the _enc approach doesn’t work across all cases. For arrays you could just have a corresponding _enc array where each item describes the encoding of the corresponding array item. For objects with subkeys, each subkey could have a corresponding _enc value. For objects that you don’t define (i.e. you reference some spec to define the keys) you could just define how to add the _enc keys (take the property name, add _enc).
>
>    {
>      “paths”: [“something”, “here”],
>      “paths_enc”: [“utf8”, “utf16”]
>    }
>
>    Also I don’t think you need to explicitly define the _enc keys in each of your property tables. IMO you could just have a section on string encoding and say that each key of type “string” MAY have a corresponding _enc key. For lists of strings, say the same.
>
>    I do understand this is a hard problem and the _enc approach might be harder for people to grasp when it comes to arrays and objects. It seems to me though the encoding use case is the minority use case and is a more complicated capability. Thus I’m OK with that additional complexity to make the simple 80% use case much easier.
>
>    John
>
>    On 7/29/16, 3:30 PM, "cti-cybox@lists.oasis-open.org on behalf of Greg Back" <cti-cybox@lists.oasis-open.org on behalf of gback@mitre.org> wrote:
>
>        Thanks for the response, Ivan. Further discussion inline.
>
>        On 7/29/2016 12:46 PM, Kirillov, Ivan A. wrote:
>        > I think our rationale when we implemented this (it’s been a while!) was that it made it more flexible to specify paths across different operating systems and could be useful in patterning as well. We added an example to the patterning spec showing how patterns work against the current file-path-type [1]. This is a generic example against a full path, but one could see how it could be useful for matching against a specific part (e.g., the beginning or end) of a file path. That said, you could also do this with a regex, so I think we can reconsider going back to a single-string based file path.
>
>        I don't understand "more flexible to specify paths across different
>        operating systems", unless you mean you want to use patterns to match
>        just the components but not the delimiter between them. (i.e. either
>        "my/directory" or "my\directory"). I know the "use regular expressions;
>        now you have two problems" maxim, but in this case I think it's
>        appropriate, especially since the pattern matching syntax already
>        supports regular expressions. Having an example for matching a subset of
>        path components would be helpful, since for the current example, the
>        following is much simpler:
>
>        file-object:file_system_properties.file_path =
>        'C:\Windows\System32\foo.dll'
>
>        If you're trying to match a couple parent directories i.e.
>        </some/arbitrary/path/with/unknown/depth>/parent/child.txt, I'm not sure
>        it's even possible. In other words, how do you say "a file named
>        child.txt in a directory named parent"?
>
>        > Yeah, I think the additional parsing complexity is definitely the biggest problem with this approach. That said, is it really that difficult to test whether something is a string or an object when parsing a field? At this point, this does seem like the best solution that we’ve been able to put forth, because it means that content producers who don’t care/know about observed encoding (which will likely be most) will always just specify a string, while those who need observed encoding can specify the corresponding object.
>
>        Making this more flexible for producers makes it more difficult for
>        consumers (see: Postel's Law).
>
>        It's not that difficult, but it's the difference between (for each
>        property that has the option):
>
>             file_name = file_obj['file_system_properties']['name']
>
>        and:
>
>             name = file_obj['file_system_properties']['name']
>             if isinstance(name, dict):
>                file_name = name['value']
>             else:
>                file_name = name
>
>        Sure, you can wrap that in a reusable function, but it's still a level
>        of indirection.
>
>        > Also, it’s worth noting that one of the reasons we’ve been trying to fit this into the MVP release is that if we push this back to a later release it will likely require significant changes to CybOX Core, the existing Object data models, or both.
>
>        I still think I prefer the "_enc" approach. It can be added to any field
>        as needed in future versions, and code which does not care can simply
>        ignore it.
>
>        Greg
>


