OASIS Mailing List Archives

cti-cybox message



Subject: Re: [cti-cybox] Recent CybOX Changes


I agree w/ Greg on both these items. In general I feel like the simpler the CybOX objects can be the better, even if it means not re-using some things.

On encoding specifically…it’s not clear to me why the _enc approach doesn’t work across all cases. For arrays, you could just have a corresponding _enc array where each item describes the encoding of the array item at the same index. For objects with subkeys, each subkey could have its own corresponding _enc key. For objects that you don’t define (i.e. you reference some spec to define the keys), you could just define how to add the _enc keys (take the property name, add _enc).

{
  "paths": ["something", "here"],
  "paths_enc": ["utf8", "utf16"]
}

Also I don’t think you need to explicitly define the _enc keys in each of your property tables. IMO you could just have a section on string encoding and say that each key of type “string” MAY have a corresponding _enc key. For lists of strings, say the same.
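To make the rule above concrete, here is a rough Python sketch of how a consumer might look up the optional _enc companion for a string or a list of strings. The function name encodings_for and the None default for "no encoding recorded" are my own assumptions for illustration, not anything defined in the CybOX drafts.

```python
def encodings_for(obj, key, default=None):
    """Look up the optional "<key>_enc" companion of a string property.

    Hypothetical helper: returns a single encoding for a string value,
    or a list of encodings (one per item) for a list of strings.
    `default` is used when no "_enc" key is present.
    """
    value = obj[key]
    enc = obj.get(key + "_enc")
    if isinstance(value, list):
        if enc is None:
            # No encoding recorded for any item in the list.
            return [default] * len(value)
        if len(enc) != len(value):
            raise ValueError(key + "_enc must have one entry per item in " + key)
        return list(enc)
    return default if enc is None else enc


observed = {
    "paths": ["something", "here"],
    "paths_enc": ["utf8", "utf16"],
    "name": "foo.dll",  # no name_enc: the simple case
}

# Pair each path with its recorded encoding.
path_pairs = list(zip(observed["paths"], encodings_for(observed, "paths")))
```

A consumer that doesn't care about encoding never calls the helper and just reads observed["paths"] directly, which is the point of keeping the _enc keys optional.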

I do understand this is a hard problem and the _enc approach might be harder for people to grasp when it comes to arrays and objects. It seems to me though the encoding use case is the minority use case and is a more complicated capability. Thus I’m OK with that additional complexity to make the simple 80% use case much easier.

John

On 7/29/16, 3:30 PM, "cti-cybox@lists.oasis-open.org on behalf of Greg Back" <cti-cybox@lists.oasis-open.org on behalf of gback@mitre.org> wrote:

    Thanks for the response, Ivan. Further discussion inline.
    
    On 7/29/2016 12:46 PM, Kirillov, Ivan A. wrote:
    > I think our rationale when we implemented this (it’s been a while!) was that it made it more flexible to specify paths across different operating systems and could be useful in patterning as well. We added an example to the patterning spec showing how patterns work against the current file-path-type [1]. This is a generic example against a full path, but one could see how it could be useful for matching against a specific part (e.g., the beginning or end) of a file path. That said, you could also do this with a regex, so I think we can reconsider going back to a single string-based file path.
    
    I don't understand "more flexible to specify paths across different 
    operating systems", unless you mean you want to use patterns to match 
    just the components but not the delimiter between them. (i.e. either 
    "my/directory" or "my\directory"). I know the "use regular expressions; 
    now you have two problems" maxim, but in this case I think it's 
    appropriate, especially since the pattern matching syntax already 
    supports regular expressions. Having an example for matching a subset of 
    path components would be helpful, since for the current example, the 
    following is much simpler:
    
    file-object:file_system_properties.file_path = 
    'C:\Windows\System32\foo.dll'
    
    If you're trying to match a couple parent directories i.e. 
    </some/arbitrary/path/with/unknown/depth>/parent/child.txt, I'm not sure 
    it's even possible. In other words, how do you say "a file named 
    child.txt in a directory named parent"?
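For comparison, the "child.txt in a directory named parent" case is straightforward against a single-string path with a regular expression. The sketch below uses plain Python `re` rather than the CybOX patterning syntax, and the name PARENT_CHILD and the sample paths are made up for illustration.

```python
import re

# "A file named child.txt in a directory named parent", at any depth,
# accepting either "/" or "\" as the path delimiter.
PARENT_CHILD = re.compile(r"(?:^|[\\/])parent[\\/]child\.txt$")


def in_parent_dir(path):
    """Return True if `path` ends with parent/child.txt (or parent\\child.txt)."""
    return PARENT_CHILD.search(path) is not None


samples = [
    "/some/arbitrary/path/with/unknown/depth/parent/child.txt",
    "C:\\Users\\someone\\parent\\child.txt",
    "/some/grandparent/child.txt",
]
matches = [p for p in samples if in_parent_dir(p)]
```

The leading `(?:^|[\\/])` keeps a directory such as "grandparent" from matching, since "parent" must start a path component.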
    
    > Yeah, I think the additional parsing complexity is definitely the biggest problem with this approach. That said, is it really that difficult to test whether something is a string or an object when parsing a field? At this point, this does seem like the best solution that we’ve been able to put forth, because it means that content producers who don’t care/know about observed encoding (which will likely be most) will always just specify a string, while those who need observed encoding can specify the corresponding object.
    
    Making this more flexible for producers makes it more difficult for 
    consumers (see: Postel's Law).
    
    It's not that difficult, but it's the difference between (for each 
    property that has the option):
    
         file_name = file_obj['file_system_properties']['name']
    
    and:
    
         name = file_obj['file_system_properties']['name']
         # The property may be a plain string or an object wrapping it.
         if isinstance(name, dict):
             file_name = name['value']
         else:
             file_name = name
    
    Sure, you can wrap that in a reusable function, but it's still a level 
    of indirection.
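As a rough illustration of that reusable function, a consumer-side helper might look like the sketch below. The name string_value and the "observed_encoding" key in the sample data are hypothetical; only the "value" key comes from the snippet above.

```python
def string_value(prop):
    """Return the string for a property that may be a bare string or an
    object of the form {"value": ..., ...} (hypothetical object layout)."""
    if isinstance(prop, dict):
        return prop["value"]
    return prop


simple = {"file_system_properties": {"name": "foo.dll"}}
with_encoding = {
    "file_system_properties": {
        # "observed_encoding" is an assumed key name, for illustration only.
        "name": {"value": "foo.dll", "observed_encoding": "utf16"}
    }
}

# Every access to such a property has to go through the helper.
names = [
    string_value(obj["file_system_properties"]["name"])
    for obj in (simple, with_encoding)
]
```

Every consumer has to remember to route each such property through the helper, which is the extra level of indirection in question.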
    
    > Also, it’s worth noting that one of the reasons we’ve been trying to fit this into the MVP release is that if we push this back to a later release it will likely require significant changes to CybOX Core, the existing Object data models, or both.
    
    I still think I prefer the "_enc" approach. It can be added to any field 
    as needed in future versions, and code which does not care can simply 
    ignore it.
    
    Greg
    