Subject: Re: [cti-users] STIX 2.0 Pattern Expressiveness
Nick Dimiduk wrote this message on Wed, Dec 28, 2016 at 14:03 -0800:
> Thanks for the comments. I've spent some time with the antlr grammar [0],
> which has answered some questions and introduced some others.

Just so you know, the antlr grammar is not normative. If the grammar
disagrees with the spec, then the grammar is wrong and needs to be fixed...

> Part 3 Section 2 defines a bunch of data types that are not represented in
> the grammar (list, open-vocab, timestamp, binary, hex, dictionary).
> Obviously some of them could be specified as type-name aliases of others. Is
> there a plan to update the grammar with support for these types?

No... open-vocab is just a string... timestamp is defined now... binary
and hex are interchangeable (see 4.2.3:
https://docs.google.com/document/d/1suvd7z7YjNKWOwgko-vJ84jfGuxSYZjOQlw5leCswPY/edit#heading=h.hwcrgiy40ia0
), and dictionaries are already addressed by the specification...

> Has anyone discussed making the square brackets optional? For extremely
> simple patterns, this would be a nice convenience for users. I think it can
> be expressed in the grammar in an unambiguous manner.

No, not really, and now that we have AND/OR at the higher level, I'm
afraid that would cause too much confusion...

> On Fri, Dec 23, 2016 at 8:05 AM, Kirillov, Ivan A. <ikirillov@mitre.org>
> wrote:
>
> > Hi Nick,
> >
> > Thanks for the great feedback on STIX Patterning! Overall, I think we had
> > considered many of the points that you’ve raised, and pushed back on them
> > to focus on a “minimum viable product” release of STIX Patterning that
> > would be useful for the vast majority of basic patterns (i.e., those seen
> > in the wild today). However, I think many of these are great topics for a
> > future release of STIX Patterning.
> >
> > (1), (2): This isn’t something we’ve considered, though I agree that it’s
> > a useful and likely necessary capability. I think we could probably use the
> > same syntax for testing for absent dictionary keys and Object Paths. I
> > rather like the "[file:size != nil]" syntax that you’ve proposed for this
> > purpose.
> >
> > (3): This was done intentionally, as we felt that for the initial
> > patterning release it would be simpler and more consistent to have the
> > Object Path always on the LHS and literal value on the RHS. However, I
> > think it’s likely that in a future release we will allow Object Paths on
> > the RHS as well.
> >
> > (4): Good point here, I can see how the divergence between dictionary and
> > list type lookup syntax is odd. I’m wondering if we should just use square
> > bracket notation for dictionary values (as in Python) as well? E.g.,
> > “file:hashes[MD5]” instead of “file:hashes.MD5”. This would also allow us
> > to use an equivalent syntax for ANY.
> >
> > (5) I think we can consider adding support for something like ALL, SOME,
> > and EXISTS in a future release.
> >
> > (6) As you mentioned, right now we can support value constraints using AND
> > with the same property. If there are indeed constraints that we can’t
> > express using this notation (maybe for timestamps?), then I think adding
> > something like a BETWEEN or INRANGE operator makes sense.
> >
> > (7), (8): We’ve thought a bit about defining functions in the language and
> > decided to postpone them to a future release. Most of our discussion has
> > been on functions for casting back and forth between constant types (e.g.,
> > hex -> integer), but other types of functions for primitive types
> > definitely make sense. As far as user-defined functions, this isn’t
> > something we’ve discussed, and while I see the utility in them I think they
> > would also have the potential to make patterning much more complex,
> > especially for implementers/consumers. That said, I think it’s an
> > interesting idea and one worth discussing amongst our community.
> >
> > (9), (10): We’ve briefly discussed aggregator functions and I think it’s
> > certainly something we can add once we incorporate functions in general. As
> > far as arbitrary mathematical expressions, they are currently not legal,
> > though this is also something that we’ll likely add in a later release
> > (we actually had them in an early draft and decided to remove them for
> > the sake of simplicity).
> >
> > (11): This has also been discussed, and will likely be implemented in a
> > future release. One possibility we’ve floated for such a capability is to
> > add the ability to define variables and accordingly substitute them in
> > Object Paths. E.g., [{0} = “foo.dll” AND file:name = {0}] ALONGWITH
> > [win-registry-key:key MATCHES {0}].
> >
> > Regards,
> > Ivan Kirillov
> > Cyber Observables SC Co-chair
> >
> > *From: *<cti-users@lists.oasis-open.org> on behalf of Nick Dimiduk <
> > ndimiduk@gmail.com>
> > *Date: *Thursday, December 22, 2016 at 2:35 PM
> > *To: *"cti-users@lists.oasis-open.org" <cti-users@lists.oasis-open.org>
> > *Subject: *[cti-users] STIX 2.0 Pattern Expressiveness
> >
> > Hello,
> >
> > I'm new to STIX and I've been evaluating the use of STIX 2.0RC3 Patterns
> > ([0]) for some use-cases. I find it to be quite a powerful tool. However,
> > there are a couple of concepts I don't know how to express. I'm hoping the
> > community might be able to help me out -- either there's a usage I don't
> > see (missed in my reading) or there's an oversight in the language. For any
> > of the latter, I hope I'm not too late for my suggestions to be considered
> > for 2.0 timelines.
> >
> > (1) Dictionary key absence. For instance, looking for some network-traffic
> > property AND the absence of a specific HTTP header. Do I treat the
> > dictionary as a collection and use the NOT IN comparison operator? That
> > seems to violate the grammar rules, which say the LHS (left-hand side) of
> > an IN clause is the Object Path and the RHS (right-hand side) is a set of
> > constant values.
> >
> > (2) Related to (1), the same syntax question is raised for optional Object
> > Paths (properties) which are not present. For example, a path like
> > "file:size" describes an optional property of the File Object. How to check
> > for (the absence of) that field? Is there a "null/nil/None" object value
> > that can be checked for? What's the syntax for the check? What primitive
> > types support it? Some candidate syntax comes to mind: "[file:size != nil]"
> > or maybe "[file:size IS NOT null]". Where is that discussed in the spec?
> >
> > (3) Also related to (1), the comparison expression is always Object Path
> > on the LHS, literal value on the RHS. This is inflexible, and means there's
> > no way to compare two Object Paths to each other. It also means I cannot
> > check to see if Object Path A is present in Object Path B where B is a
> > collection.
> >
> > (4) Speaking of collections, collection object lookup syntax is divergent.
> > For list types (Part 4, Section 5.2), we have a 0-based index with
> > square-brackets ('[]'). We also have a convenience syntax of
> > "list_property[*]" as syntactic sugar for the logical ANY operator from
> > SQL. However, dictionary types (Part 4, Section 5.3) are referenced with
> > "dot-notation", just like any object property. This obscures the property's
> > type for a casual reader and restricts (or at least confuses) the body of
> > syntax available for expressions on dictionary elements. For example,
> > there's no ANY equivalent for matching dictionary members like we can for
> > lists.
> >
> > (5) Related to (4), there appears to be no syntax for the other
> > collection-based logical operators provided by SQL -- ALL, SOME, EXISTS. As
> > mentioned in (2), (3), there is IN syntax, but it's not available for
> > Object Path element collections, only constant set literals.
> >
> > (6) While talking about logical operators, I haven't noticed the
> > equivalent of a BETWEEN expression. One must say, for example
> > "[type:property > A AND type:property < B]". I haven't thought in depth on
> > this topic; there could be other non-numerical types for which it's less
> > obvious how to express value constraints, for which BETWEEN syntax goes
> > from "nice to have" to "required for expression" -- i.e., where the
> > semantics of greater-than, less-than do not make sense but BETWEEN does.
> >
> > (7) Is there any thought around a library of functions for primitive
> > types? For instance, a length() function that operates on strings.
> >
> > (8) The next logical question from (7) is where would UDFs or
> > implementation-specific extensions be installed into the syntax? It would
> > be very powerful to have a Function Object (including a function
> > implementation) that can be exported along with the Indicators that contain
> > Patterns that make use of that function.
> >
> > (9) Also from (7), any consideration for aggregation operators/functions?
> > "Match when the combined file size of all attachments on the Email Message
> > is greater than 5mb". Assuming the Email Message Object had a property
> > "attachment_refs" of type object-ref list that's restricted to File Object
> > types, that might look like
> > "[sum(email-message:attachment_refs[*].size) > 5 * 1024 * 1024]".
> >
> > (10) It's not clear to me if the RHS from the example in (9) is even valid
> > -- are arbitrary mathematical expressions legal for either LHS or RHS of
> > comparison expressions?
> >
> > (11) Capability for backward references. I'd like to be able to refer to
> > the value that matched a previous expression later in the pattern. For
> > instance, "match when an email contains a URL hosted on baddomain.com and
> > subsequent http traffic contains a 200 success request for that URL". One
> > approach might be to borrow group backreferences from regex syntax,
> > "[(email-message:url_refs[*].value MATCHES '*.baddomain.com/.*\.docx')]
> > FOLLOWED BY
> > [network-traffic:extensions.http-request-ext.request_value = \1 AND
> > network-traffic:extensions.x-examplecom-http-request-ext.status_code =
> > 200]". I don't quite know how this would work -- you'd want to define
> > capturing groups of arbitrary expressions and also allow for capturing
> > groups to be embedded into the RHS of a MATCHES expression. Tricky.

-- John-Mark
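
A minimal sketch of two checks discussed in this thread: testing for an absent property or dictionary key (points (1) and (2)), and expressing a range as two comparisons joined by AND (point (6)). This is plain Python over a dict, not spec syntax and not the API of any STIX library. The names MISSING, resolve, is_present and between are hypothetical, the sample object is made up, and the path handling covers only the simple "type:prop.sub" form.

```python
from typing import Any

# Hypothetical sentinel standing in for the proposed "nil" value.
MISSING = object()


def resolve(observed: dict, path: str) -> Any:
    """Resolve a 'type:prop.sub' path against a dict; MISSING if absent."""
    obj_type, _, prop_path = path.partition(":")
    if observed.get("type") != obj_type:
        return MISSING
    current: Any = observed
    for key in prop_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def is_present(observed: dict, path: str) -> bool:
    """Rough analogue of the proposed "[file:size != nil]" check."""
    return resolve(observed, path) is not MISSING


def between(observed: dict, path: str, low: Any, high: Any) -> bool:
    """Rough analogue of "[type:property > A AND type:property < B]"."""
    value = resolve(observed, path)
    if value is MISSING:
        return False
    return low < value < high


# Made-up observed object, for illustration only.
file_obj = {"type": "file", "name": "foo.dll", "hashes": {"MD5": "abc"}}
```

With this sketch, is_present(file_obj, "file:size") is false because the optional property is absent, while the dictionary lookup "file:hashes.MD5" resolves like any other property, which is the ambiguity point (4) describes.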