I think what this really boils down to is folks currently defining Report differently than it has been defined to date.
It was actually originally requested to serve more of a dynamic aggregation function to say “hey, this set of things are related in this way” (e.g. a set of Indicators for a recently discovered malware). I had pointed out that it could also be used for more
explicit subclasses of this sort of aggregation use case, ones where you are actually asserting a more formal report (and eventually you could maybe generate your report document from the STIX Report). We then defined the 1.X Report object to serve these roles.
I am now hearing many people asserting that Report should be viewed ONLY as the latter formal, point-in-time report document and something else would be needed for the original more dynamic aggregation use cases.
If we do redefine Report this way then we will need a new object for the looser context aggregation.
Either way, Rich P is likely correct that “contains” is not a great value for the kind_of_relationship. We should think about potentially better options.
I think we are very close to reaching consensus on the Report object.
Although, as I mentioned to Sean in another email – if we use the Relationship object to express the confidence of something being related to the report – we shouldn’t
name the relationship “report contains”, since I think the id-ref list in the report will completely specify what TLOs are contained/discussed/related to the report. What you want to express is not the confidence that a TLO is contained in the report, but
that you are asserting something about that TLO’s inclusion.
+1 to both of Sean’s points. Going with #2 works, since it doesn’t preclude the ability to express confidence, etc.
In addition, it is critical that report maintain the support for ‘intents’ as we generate very different types of reports with very different
I’m not sure that the below characterization fully conveys the conversation that occurred (I don’t think anyone was arguing for #1 by
the end of the conversation and many other pros/cons were discussed) but I won’t spend the time to go into all of those details.
While there are real world use cases where you will want to relate STIX content to a Report with the ability to assert confidence, option
#2 does not preclude such relationships from being specified externally by parties who wish to do so.
Given this fact I am fine with moving forward on #2.
The one exception I will raise is that the below characterization leaves out an important element that should exist for any of #1-#4. That
is the “intents” property that is a controlled vocabulary field that allows clear and consistent characterization of what sort of report it is (e.g. Campaign Analysis Report, Malware Report, Threat Actor Report, Threat Trend Report, etc.). This capability
was always part of what was asked for with Report, is part of STIX 1.x, and was part of the
2.0 consensus we had on the Report object before this latest issue was raised, and I believe it is still quite valuable. I think we need to make sure that this field is included with Report.
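As a rough illustration of what a controlled-vocabulary “intents” field buys you, here is a small Python sketch. The vocabulary values are just the examples named above, not the actual STIX vocabulary, and the validate_intents helper and the dict layout are hypothetical.

```python
# Example values taken from this message; NOT the real STIX vocabulary.
REPORT_INTENTS = frozenset({
    "Campaign Analysis Report",
    "Malware Report",
    "Threat Actor Report",
    "Threat Trend Report",
})


def validate_intents(report):
    """Return the intents on a report that are not in the vocabulary."""
    return [i for i in report.get("intents", []) if i not in REPORT_INTENTS]


# Hypothetical reports, represented as plain dicts for illustration.
good = {"title": "Example", "intents": ["Malware Report"]}
bad = {"title": "Example", "intents": ["Malware Report", "misc"]}

print(validate_intents(good))
print(validate_intents(bad))
```

With a fixed vocabulary, a consumer can reject or flag a report whose stated type it does not recognize, which is the "clear and consistent characterization" argued for above.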
We had a discussion on Slack about the Report object yesterday and I wanted to summarize for those of you who don’t want to read a 500-message
back and forth. The people involved were myself, Bret Jordan, Jason Keirstead, Mark Davidson, Rich Piazza, John Mark-Gurney, and Sean Barnum.
The topic of discussion was how you relate objects to a report, and there were essentially four options that were discussed:
1. The report contains just a title, description, and other TLO properties. Content is placed in the report through the use of relationships
with a FROM of the report, a TO of the TLO, and a kind_of_relationship=“contains”
2. The report contains a title, description, other TLO properties, and a list of idrefs for the content that the producer says are “in”
3. A hybrid approach where the report contains the same items as #2, but the idrefs point to the relationships from #1.
4. A further hybrid where the report idrefs point to EITHER content as in #2 or relationships as in #3.
#3 and #4 are mainly compromise positions if you like #1 or #2 but are OK with something more optional/flexible.
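To make options #1 and #2 concrete, here is a rough Python sketch. The dict layout, the "idrefs", "from", and "to" keys, and the example ids are assumptions made for illustration, not anything from the draft spec; only kind_of_relationship and the “contains” value come from the options listed above.

```python
# Hypothetical ids; real STIX ids would look different.
REPORT_ID = "report-1"
INDICATOR_ID = "indicator-1"
MALWARE_ID = "malware-1"

# Option #1: the report carries no content list. Content is attached
# through separate relationship objects (FROM report, TO TLO).
report_opt1 = {"id": REPORT_ID, "title": "Example report", "description": "..."}
relationships = [
    {"from": REPORT_ID, "to": INDICATOR_ID, "kind_of_relationship": "contains"},
    {"from": REPORT_ID, "to": MALWARE_ID, "kind_of_relationship": "contains"},
    # A relationship that has nothing to do with report membership.
    {"from": INDICATOR_ID, "to": MALWARE_ID, "kind_of_relationship": "indicates"},
]

# Option #2: the report itself lists the idrefs of its content.
report_opt2 = {
    "id": REPORT_ID,
    "title": "Example report",
    "description": "...",
    "idrefs": [INDICATOR_ID, MALWARE_ID],
}


def contents_opt1(report, rels):
    """Option #1: find content by scanning relationships from the report."""
    return [
        r["to"]
        for r in rels
        if r["from"] == report["id"] and r["kind_of_relationship"] == "contains"
    ]


def contents_opt2(report):
    """Option #2: the content list is a property of the report."""
    return list(report["idrefs"])


print(contents_opt1(report_opt1, relationships))
print(contents_opt2(report_opt2))
```

Both calls return the same two ids. The difference is where the list lives: under #2 it travels with the report object, while under #1 a consumer has to gather the relationship objects as well.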
By the end of the conversation most of the group was gravitating towards #2, including myself. The reasons for this are that we felt:
- The report and the references to the content it contains can be signed as a single TLO, verifying that it is what we think it is
- Changes to what’s in the report are versioned with the report itself
- Only the original producer of the report indicates what’s in it; having other people indicate what’s in it is not an important use case
(they can just issue their own report)
- There’s not a strong desire to represent how something is in a report (relationship nature or value field) or the confidence that something
is in a report
- #3 in particular has a lot of double-booking: it’s redundant to represent twice that the information is contained in the report
On the other hand, one or more people also felt that #1 was the right approach, because they felt that:
- Signatures and versioning are just as doable in #1
- Having other people indicate things are in the report is a use case, and in any case even if it isn’t you can tell what the producer
is asserting by looking at their sources
- Reports may evolve over time and relationships enable
- Objects should “belong” to the report with some level of confidence
- It might be important to say how content is in a report via kind_of_relationship (or, if not, it’s not harmful)
- Having all references between TLOs happen via a relationship object gives one consistent way of doing things
What do you think about this? Which of those options do you prefer? Let’s try to get some consensus on this so we can push it into the
draft specs and close it.