cti-users message


Subject: Re: [cti-users] STIX 2.0 Pattern Expressiveness


Nick Dimiduk wrote this message on Wed, Dec 28, 2016 at 14:03 -0800:
> Thanks for the comments. I've spent some time with the antlr grammar [0],
> which has answered some questions and introduced some others.

Just so you know, the antlr grammar is not normative.  If the grammar
disagrees with the spec, then the grammar is wrong and needs to be
fixed...

> Part 3 Section 2 defines a bunch of data types that are not represented in
> the grammar (list, open-vocab, timestamp, binary, hex, dictionary).
> Obviously some of them could be specified as type-name aliases of others. Is
> there a plan to update the grammar with support for these types?

No...  open-vocab is just a string...  timestamp is defined now... binary and
hex are interchangeable (see 4.2.3:
https://docs.google.com/document/d/1suvd7z7YjNKWOwgko-vJ84jfGuxSYZjOQlw5leCswPY/edit#heading=h.hwcrgiy40ia0
), and dictionaries are already addressed by the specification...
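
As a rough illustration of why binary and hex constants can be treated as interchangeable: both denote the same underlying bytes, just spelled differently. The Python sketch below assumes the draft's literal forms (b'...' wrapping base64 text and h'...' wrapping hex digits); the helper name `decode_constant` is hypothetical and not part of any STIX library.

```python
import base64


def decode_constant(literal: str) -> bytes:
    """Decode a binary (b'...') or hex (h'...') pattern constant to raw bytes.

    Assumption: b'...' wraps base64 text and h'...' wraps hex digits.
    """
    if len(literal) < 3 or literal[1] != "'" or literal[-1] != "'":
        raise ValueError(f"not a quoted constant: {literal!r}")
    prefix, body = literal[0], literal[2:-1]
    if prefix == "b":
        return base64.b64decode(body, validate=True)
    if prefix == "h":
        return bytes.fromhex(body)
    raise ValueError(f"unknown constant prefix: {prefix!r}")


# Both spellings denote the same two bytes (0xff 0xc3), so a comparison
# can treat the two constant types alike once they are decoded.
same = decode_constant("h'ffc3'") == decode_constant("b'/8M='")
```

Comparing decoded bytes rather than the literal text is the design choice here: it keeps the comparison independent of which of the two spellings the pattern author happened to use.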

> Has anyone discussed making the square brackets optional? For extremely
> simple patterns, this would be a nice convenience for users. I think it can
> be expressed in the grammar in an unambiguous manner.

No, not really, and now that we have AND/OR at the higher level, I'm
afraid that would cause too much confusion...

> On Fri, Dec 23, 2016 at 8:05 AM, Kirillov, Ivan A. <ikirillov@mitre.org>
> wrote:
> 
> > Hi Nick,
> >
> >
> >
> > Thanks for the great feedback on STIX Patterning! Overall, I think we had
> > considered many of the points that you’ve raised, and pushed back on them
> > to focus on a “minimum viable product” release of STIX Patterning that
> > would be useful for the vast majority of basic patterns (i.e., those seen
> > in the wild today). However, I think many of these are great topics for a
> > future release of STIX Patterning.
> >
> >
> >
> > (1), (2): This isn’t something we’ve considered, though I agree that it’s
> > a useful and likely necessary capability. I think we could probably use the
> > same syntax for testing for absent dictionary keys and Object Paths. I
> > rather like the "[file:size != nil]" syntax that you’ve proposed for this
> > purpose.
> >
> >
> >
> > (3): This was done intentionally, as we felt that for the initial
> > patterning release it would be simpler and more consistent to have the
> > Object Path always on the LHS and literal value on the RHS. However, I
> > think it’s likely that in a future release we will allow Object Paths on
> > the RHS as well.
> >
> >
> >
> > (4): Good point here, I can see how the divergence between dictionary and
> > list type lookup syntax is odd. I’m wondering if we should just use square
> > bracket notation for dictionary values (as in Python) as well? E.g.,
> > “file:hashes[MD5]” instead of “file:hashes.MD5”. This would also allow us
> > to use an equivalent syntax for ANY.
> >
> >
> >
> > (5) I think we can consider adding support for something like ALL, SOME,
> > and EXISTS in a future release.
> >
> >
> >
> > (6) As you mentioned, right now we can support value constraints using AND
> > with the same property. If there are indeed constraints that we can’t
> > express using this notation (maybe for timestamps?), then I think adding
> > something like a BETWEEN or INRANGE operator makes sense.
> >
> >
> >
> > (7), (8): We’ve thought a bit about defining functions in the language and
> > decided to postpone them to a future release. Most of our discussion has
> > been on functions for casting back and forth between constant types (e.g.,
> > hex -> integer), but other types of functions for primitive types
> > definitely make sense. As far as user-defined functions, this isn’t
> > something we’ve discussed, and while I see the utility in them I think they
> > would also have the potential to make patterning much more complex,
> > especially for implementers/consumers. That said, I think it’s an
> > interesting idea and one worth discussing amongst our community.
> >
> >
> >
> > (9), (10): We’ve briefly discussed aggregator functions and I think it’s
> > certainly something we can add once we incorporate functions in general. As
> > far as arbitrary mathematical expressions, they are currently not legal,
> > though this is also something that we’ll likely add in a later release as
> > well (we actually had them in an early draft and decided to remove them for
> > the sake of simplicity).
> >
> >
> >
> > (11): This has also been discussed, and will likely be implemented in a
> > future release. One possibility we’ve floated for such a capability is to
> > add the ability to define variables and accordingly substitute them in
> > Object Paths. E.g., [{0} = “foo.dll” AND file:name = {0}] ALONGWITH
> > [win-registry-key:key MATCHES {0}].
> >
> >
> >
> > Regards,
> >
> > Ivan Kirillov
> >
> > Cyber Observables SC Co-chair
> >
> >
> >
> > *From: *<cti-users@lists.oasis-open.org> on behalf of Nick Dimiduk <
> > ndimiduk@gmail.com>
> > *Date: *Thursday, December 22, 2016 at 2:35 PM
> > *To: *"cti-users@lists.oasis-open.org" <cti-users@lists.oasis-open.org>
> > *Subject: *[cti-users] STIX 2.0 Pattern Expressiveness
> >
> >
> >
> > Hello,
> >
> >
> >
> > I'm new to STIX and I've been evaluating the use of STIX 2.0RC3 Patterns
> > ([0]) for some use-cases. I find it to be quite a powerful tool. However,
> > there are a couple concepts I don't know how to express. I'm hoping the
> > community might be able to help me out -- either there's a usage I don't
> > see (missed in my reading) or there's an oversight in the language. For any
> > of the latter, I hope I'm not too late for my suggestions to be considered
> > for 2.0 timelines.
> >
> >
> >
> > (1) Dictionary key absence. For instance, looking for some network-traffic
> > property AND the absence of a specific HTTP header. Do I treat the
> > dictionary as a collection and use NOT IN comparison operator? That seems
> > to violate the grammar rules, which say the LHS (left-hand side) of an IN
> > clause is the Object Path and the RHS (right-hand side) is a set of
> > constant values.
> >
> >
> >
> > (2) Related to (1), the same syntax question is raised for optional Object
> > Paths (properties) which are not present. For example, a path like
> > "file:size" describes an optional property of the File Object. How to check
> > for (the absence of) that field? Is there a "null/nil/None" object value
> > that can be checked for? What's the syntax for the check? What primitive
> > types support it? Some candidate syntax comes to mind: "[file:size != nil]"
> > or maybe "[file:size IS NOT null]". Where is that discussed in the spec?
> >
> >
> >
> > (3) Also related to (1), the comparison expression is always Object Path
> > on the LHS, literal value on the RHS. This is inflexible, and means there's
> > no way to compare two Object Paths to each other. It also means I cannot
> > check to see if Object Path A is present in Object Path B where B is a
> > collection.
> >
> >
> >
> > (4) Speaking of collections, collection object lookup syntax is divergent.
> > For list types (Part 4, Section 5.2), we have a 0-based index with
> > square-brackets ('[]'). We also have a convenience syntax of
> > "list_property[*]" as syntactic sugar for the logical ANY operator from
> > SQL. However, dictionary types (Part 4, Section 5.3) are referenced with
> > "dot-notation", just like any object property. This obscures the property's
> > type for a casual reader and restricts (or at least confuses) the body of
> > syntax available for expressions on dictionary elements. For example,
> > there's no ANY equivalent for matching dictionary members like we can for
> > lists.
> >
> >
> >
> > (5) Related to (4), there appears to be no syntax for the other
> > collection-based logical operators provided by SQL -- ALL, SOME, EXISTS. As
> > mentioned in (2), (3), there is IN syntax, but it's not available for
> > Object Path element collections, only constant set literals.
> >
> >
> >
> > (6) While talking about logical operators, I haven't noticed the
> > equivalent of a BETWEEN expression. One must say, for example,
> > "[type:property > A AND type:property < B]". I haven't thought in depth on
> > this topic, but there could be non-numerical types for which it's less
> > obvious how to express value constraints, and for which BETWEEN syntax goes
> > from "nice to have" to "required for expression", i.e., where the
> > semantics of greater-than and less-than do not make sense but BETWEEN does.
> >
> >
> >
> > (7) Is there any thought around a library of functions for primitive
> > types? For instance, a length() function that operates on strings.
> >
> >
> >
> > (8) The next logical question from (7) is where would UDF's or
> > implementation-specific extensions be installed into the syntax? It would
> > be very powerful to have a Function Object (including a function
> > implementation) that can be exported along with the Indicators that contain
> > Patterns that make use of that function.
> >
> >
> >
> > (9) Also from (7), any consideration for aggregation operators/functions?
> > "Match when the combined file size of all attachments on the Email Message
> > is greater than 5mb". Assuming the Email Message Object had a property
> > "attachment_refs" of type object-ref list that's restricted to File Object
> > types, that might look like "[sum(email-message:attachment_refs[*].size)
> > > 5 * 1024 * 1024]".
> >
> >
> >
> > (10) It's not clear to me if the RHS from the example in (9) is even valid
> > -- are arbitrary mathematical expressions legal for either LHS or RHS of
> > comparison expressions?
> >
> >
> >
> > (11) Capability for backward references. I'd like to be able to refer to
> > the value that matched a previous expression later in the pattern. For
> > instance, "match when an email contains a URL hosted on baddomain.com and
> > subsequent http traffic contains a 200 success request for that URL". One
> > approach might be to borrow group backreferences from regex syntax,
> > "[(email-message:url_refs[*].value MATCHES '*.baddomain.com/.*\.docx')]
> > FOLLOWED BY
> > [network-traffic:extensions.http-request-ext.request_value = \1 AND
> > network-traffic:extensions.x-examplecom-http-request-ext.status_code =
> > 200]". I don't quite know how this would work -- you'd want to define
> > capturing groups of arbitrary expressions and also allow for capturing
> > groups to be embedded into the RHS of a MATCHES expression. Tricky.
> >
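
Points (1), (2), and (4) in the quoted message can be made concrete with a small sketch. The Python below is a toy evaluator, not the STIX reference implementation: the `resolve` and `is_absent` helpers and the sample object are hypothetical. It assumes the bracket notation floated above for dictionaries (file:hashes[MD5]), treats [*] as "any element" for both lists and dictionaries, and reports an absent property as an empty result, which is one way a check such as [file:size != nil] could be evaluated. Paths are given without the leading object type (e.g. "hashes[MD5]" rather than "file:hashes[MD5]").

```python
import re

# One path step: a property name, or a bracketed index/key/wildcard.
_STEP = re.compile(r"([A-Za-z0-9_-]+)|\[([^\]]+)\]")


def resolve(obj, path):
    """Return every value reachable via `path`; an empty list means absent."""
    current = [obj]
    for name, bracket in _STEP.findall(path):
        key = name or bracket
        next_values = []
        for value in current:
            if key == "*" and isinstance(value, list):
                next_values.extend(value)                # ANY list element
            elif key == "*" and isinstance(value, dict):
                next_values.extend(value.values())       # ANY dictionary member
            elif isinstance(value, list) and key.isdigit():
                if int(key) < len(value):
                    next_values.append(value[int(key)])  # 0-based list index
            elif isinstance(value, dict) and key in value:
                next_values.append(value[key])           # property or dictionary key
        current = next_values
    return current


def is_absent(obj, path):
    """Hypothetical semantics for a `path = nil` test."""
    return not resolve(obj, path)


# Hypothetical File-like object with no "size" property.
sample_file = {
    "name": "foo.dll",
    "hashes": {"MD5": "d41d8cd98f00b204e9800998ecf8427e"},
    "contains_refs": ["1", "2"],
}

md5 = resolve(sample_file, "hashes[MD5]")    # dictionary key lookup
any_hash = resolve(sample_file, "hashes[*]")  # ANY over dictionary members
size_missing = is_absent(sample_file, "size")
```

Because list indexes, dictionary keys, and the wildcard all go through the same bracket step, the ANY form works for dictionaries without a separate syntax, which is the uniformity that point (4) asks for.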

-- 
John-Mark

