Subject: Embedded Relationships
From: "Reller, Nathan S." <Nathan.Reller@jhuapl.edu>
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: Mon, 17 Apr 2017 17:24:30 +0000
I wanted to start a debate on the relationship modeling in the STIX 2.0 proposal. I would like to propose switching all of the relationships to embedded relationships as opposed to creating Relationship classes. I’m new to the group, so I don’t know if you have already discussed this. I wanted to bring up some of my thoughts.

I feel that the relationships should all be embedded for three primary reasons.

1. There are two ways to define relationships

You can define a new relationship by creating a Relationship with a custom type, and you can define a new relationship by adding a custom property to an SDO. It’s unclear as to which approach to choose. For example, I have a new relationship where Campaigns “foo” Malware. I could add a property to the Campaign class called “x_example_com_foo” that references a Malware or I could define a new Relationship with source of Campaign, target of Malware, and type of “foo.”

2. Performance

I believe the common use case of displaying the properties of an SDO will take a performance hit. Appendix A below includes analysis showing what I believe to be a 2x penalty (a runtime of 2n instead of n) for modeling with top-level Relationship classes.

3. Maintainability of code and JSON

I believe the code will be easier to maintain if we use embedded relationships. In Appendix B I included some class diagrams for how the system will likely be modeled. There will likely need to be a lot more classes and code for supporting Relationship subclasses.

I know this is a long email and a big proposed change, but I would appreciate some feedback on this. This was the biggest area of concern for me while reading through the spec.

-Nate

P.S. The ASCII art below requires a fixed-width font to display accurately. I hope it appears correctly for you. Otherwise I can try creating the diagrams someplace else.


---------------------------------- Appendix A ----------------------------------

This document compares the common use case of displaying information to an
analyst using two different object modeling methods. There is the graph modeling
approach proposed by the CTI TC and the embedded relationship approach that I am
proposing. The use case should be a very common scenario. There will likely be a
graphical user interface, most likely a web interface, that is used to display
information about campaigns, threat actors, malware, etc to analysts. The
analysts will want this web interface to view those objects and click on links
to related objects, like wanting to view more details of the malware used by a
campaign.

Each modeling approach (graph and embedded) has pseudocode written to
demonstrate the code for displaying information about a campaign. The expected
output is below. While the output below is simply text based the process is the
same for HTML display. The only difference is that everything will be wrapped in
HTML tags before printing. In the interest of brevity the HTML tags were
omitted.

------------- Expected Output ------------- 
Campaign: campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f
Name: Green Group attacks against France
Description: Green Group launched a phishing attack against France heads of state
First Seen: 2016-04-06T20:03:00.000Z
Last Seen: 2016-04-06T20:03:00.000Z
Threat Actors:
    threat-actor--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f
Identities Targeted:
    identity--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f
    identity--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
Vulnerabilities Targeted:
    vulnerability--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
Malware Used:
    malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
    malware--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f

Here is what the code will look like for a graph-based object model like the
one proposed in STIX 2.0.

void printCampaign(campaignId) {
    Campaign c = repo.getCampaign(campaignId)
    List<Relationship> relationships = repo.getRelationships(c)
    print "Campaign: %s\n", c.getId()
    print "Name: %s\n", c.getName()
    print "Description: %s\n", c.getDescription()
    print "First Seen: %s\n", c.getFirstSeen()
    print "Last Seen: %s\n", c.getLastSeen()

    // first pass: categorize each relationship by type and target class
    threatActors = new List<Relationship>()
    identitiesTargeted = new List<Relationship>()
    vulnerabilitiesTargeted = new List<Relationship>()
    malwareUsed = new List<Relationship>()
    for r in relationships
        if r.getType() == "attributed-to" && r.getTarget() instanceof ThreatActor
            threatActors.add(r)
        else if r.getType() == "targets" && r.getTarget() instanceof Identity
            identitiesTargeted.add(r)
        else if r.getType() == "targets" && r.getTarget() instanceof Vulnerability
            vulnerabilitiesTargeted.add(r)
        else if r.getType() == "uses" && r.getTarget() instanceof Malware
            malwareUsed.add(r)

    // second pass: print each category under its own label
    print "Threat Actors:\n"
    for r in threatActors
        print "    %s\n", r.getTarget().getId()
    print "Identities Targeted:\n"
    for r in identitiesTargeted
        print "    %s\n", r.getTarget().getId()
    print "Vulnerabilities Targeted:\n"
    for r in vulnerabilitiesTargeted
        print "    %s\n", r.getTarget().getId()
    print "Malware Used:\n"
    for r in malwareUsed
        print "    %s\n", r.getTarget().getId()
}

Notice how the code must iterate over the relationships twice. The first loop
categorizes each relationship so the interface can display the correct label.
In an HTML view the categories would likely also be used to add icons, HTML
sections that can be hidden and revealed, and other features that make the user
experience better. Then each relationship is iterated over again when it is
printed. This means that using a graph to model the relationships will have a
runtime of 2*n where n is the number of relationships.
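
The two passes described above can be sketched as runnable Java. This is only a
minimal illustration under stated assumptions: the Rel record, the class name
GraphCampaignSketch, and the helper names labelFor, categorize, and render are
hypothetical and are not part of the STIX specification or any STIX library. It
needs Java 16 or later because it uses a record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GraphCampaignSketch {
    // Hypothetical stand-in for a Relationship object: its type, plus the
    // target SDO's type and ID.
    record Rel(String type, String targetType, String targetId) {}

    // Maps a (relationship type, target type) pair to a display label, or
    // returns null when the relationship is not shown for a campaign.
    static String labelFor(Rel r) {
        if (r.type().equals("attributed-to") && r.targetType().equals("threat-actor")) return "Threat Actors";
        if (r.type().equals("targets") && r.targetType().equals("identity")) return "Identities Targeted";
        if (r.type().equals("targets") && r.targetType().equals("vulnerability")) return "Vulnerabilities Targeted";
        if (r.type().equals("uses") && r.targetType().equals("malware")) return "Malware Used";
        return null;
    }

    // First pass: bucket every relationship under its label, in a fixed label order.
    static Map<String, List<String>> categorize(List<Rel> rels) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String label : List.of("Threat Actors", "Identities Targeted",
                "Vulnerabilities Targeted", "Malware Used")) {
            groups.put(label, new ArrayList<>());
        }
        for (Rel r : rels) {
            String label = labelFor(r);
            if (label != null) {
                groups.get(label).add(r.targetId());
            }
        }
        return groups;
    }

    // Second pass: write each bucket under its heading.
    static String render(Map<String, List<String>> groups) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            out.append(e.getKey()).append(":\n");
            for (String id : e.getValue()) {
                out.append("    ").append(id).append("\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Rel> rels = List.of(
            new Rel("uses", "malware", "malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33"),
            new Rel("targets", "identity", "identity--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f"));
        System.out.print(render(categorize(rels)));
    }
}
```

Every relationship is visited once in categorize and once more in render, which
is where the 2*n figure comes from.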

The code below shows the same function with the relationships embedded in the
objects themselves.

void printCampaign(campaignId) {
    Campaign c = repo.getCampaign(campaignId)
    print "Campaign: %s\n", c.getId()
    print "Name: %s\n", c.getName()
    print "Description: %s\n", c.getDescription()
    print "First Seen: %s\n", c.getFirstSeen()
    print "Last Seen: %s\n", c.getLastSeen()

    // print relationships; each getter returns the related SDOs directly
    print "Threat Actors:\n"
    for t in c.getThreatActors()
        print "    %s\n", t.getId()
    print "Identities Targeted:\n"
    for i in c.getIdentitiesTargeted()
        print "    %s\n", i.getId()
    print "Vulnerabilities Targeted:\n"
    for v in c.getVulnerabilitiesTargeted()
        print "    %s\n", v.getId()
    print "Malware Used:\n"
    for m in c.getMalwareUsed()
        print "    %s\n", m.getId()
}

This code will have a runtime of n where n is the number of relationships. This
means that embedding the relationships will run twice as fast as the graph
approach because the relationships do not need to be categorized. This code is
easier to read as well.

One other advantage of the embedded model is that it has fewer potential
runtime errors. The graph model loops over each item in the set of relationships
and compares strings to categorize the relationship. This creates possible
errors that will only be detected at runtime. For example, the code checks
whether the type equals "uses". If the programmer had mistyped "uses" as
"yses", then no malware relationships would be displayed, and the code would
still compile and run. If the programmer instead mistyped c.getMalwareUsed(),
then a compile-time error would catch the mistake in a type-safe language like
Java.
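
The silent failure described above can be reproduced with a small Java sketch.
The Rel record, the class name RelationshipTypoSketch, and the helper
targetsOfType are hypothetical names chosen for illustration; they are not taken
from the spec. It needs Java 16 or later because it uses a record.

```java
import java.util.List;
import java.util.stream.Collectors;

class RelationshipTypoSketch {
    // Hypothetical stand-in for a Relationship: a type string and a target ID.
    record Rel(String type, String targetId) {}

    // Selects the targets of all relationships whose type matches the given string.
    static List<String> targetsOfType(List<Rel> rels, String type) {
        return rels.stream()
                .filter(r -> r.type().equals(type))
                .map(Rel::targetId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Rel> rels = List.of(
            new Rel("uses", "malware--1"),
            new Rel("uses", "malware--2"));
        // Correct spelling: both malware relationships are found.
        System.out.println(targetsOfType(rels, "uses").size()); // prints 2
        // Mistyped string: compiles and runs, but silently matches nothing.
        System.out.println(targetsOfType(rels, "yses").size()); // prints 0
    }
}
```

By contrast, a misspelled method name such as c.getMalwareUsde() would be
rejected by the Java compiler before the program could run.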




---------------------------------- Appendix B ----------------------------------

This document compares the code for STIX objects using two different
approaches. The first approach codes relationships using embedded relationships
while the second approach codes relationships using a Relationship class as
specified in the proposed STIX 2.0 specification.

Below is a UML class diagram showing some of the relationships between a
Campaign and other STIX Domain Objects (SDOs). There is the base Stix class and
all SDOs are subclasses of it. The Campaign class is shown with four of its
relationships. Not all relationships were shown in the interest of brevity.

The diagram shows there is an "attributed-to" relationship between Campaigns
and ThreatActors. The nice aspect of this approach is that a developer can
clearly and quickly see in the Campaign class which other SDOs it has
relationships with because there are attributes in the class that declare them.
For each relationship the code simply needs to add an attribute to an SDO class
and provide getters and setters for retrieving and modifying that relationship.
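
As a rough Java sketch of the attribute-plus-accessors pattern just described,
showing only two of the four relationships. The class names follow the diagram
below, but the wrapper class EmbeddedModelSketch, the constructors, and the
add methods are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EmbeddedModelSketch {
    // Base class for all SDOs; only the id property is modeled here.
    static class Stix {
        private final String id;
        Stix(String id) { this.id = id; }
        String getId() { return id; }
    }

    static class ThreatActor extends Stix {
        ThreatActor(String id) { super(id); }
    }

    static class Malware extends Stix {
        Malware(String id) { super(id); }
    }

    // Each relationship is one typed attribute plus its accessors.
    static class Campaign extends Stix {
        private final List<ThreatActor> threatActors = new ArrayList<>();
        private final List<Malware> malwareUsed = new ArrayList<>();

        Campaign(String id) { super(id); }

        List<ThreatActor> getThreatActors() { return Collections.unmodifiableList(threatActors); }
        void addThreatActor(ThreatActor actor) { threatActors.add(actor); }

        List<Malware> getMalwareUsed() { return Collections.unmodifiableList(malwareUsed); }
        void addMalwareUsed(Malware malware) { malwareUsed.add(malware); }
    }

    public static void main(String[] args) {
        Campaign c = new Campaign("campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f");
        c.addMalwareUsed(new Malware("malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33"));
        for (Malware m : c.getMalwareUsed()) {
            System.out.println(m.getId());
        }
    }
}
```

Because the lists are typed, passing a Malware to addThreatActor would not
compile.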


                                 +-----------------------+
                                 |        Stix           |
                                 +-----------------------+
                                 |id: UUID               |
                                 |creator: Identity      |
                                 |created: Timestamp     |
                                 |modified: Timestamp    |
                                 |revoked: boolean       |
                                 |                       |
                                 +-----------+-----------+
                                             ^
                                             |
                   +-------------------------+--------------------------------+
                   |                                                          |
                   |                                                          |
                   |                     attributed-to  +-----------------+   |
                   |                    +-------------> |   ThreatActor   | <-+
                   |                    |               +-----------------+   |
                   |                    |               |                 |   |
                   |                    |               +-----------------+   |
                   |                    |                                     |
+------------------+--------------------+----+ targets  +-----------------+   |
|        Campaign                            +--------> | Vulnerability   | <-+
+--------------------------------------------+          +-----------------+   |
|                                            |          |                 |   |
| threatActors: List<ThreatActor>            |          +-----------------+   |
| vulnerabilityTargets: List<Vulnerability>  |                                |
| identityTargets: List<Identity>            | targets  +-----------------+   |
| malwareUsed: List<Malware>                 +--------> |  Identity       | <-+
|                                            |          +-----------------+   |
+---------------------------------------+----+          |                 |   |
                                        |               +-----------------+   |
                                        |                                     |
                                        |      uses     +-----------------+   |
                                        +-------------> |  Malware        | <-+
                                                        +-----------------+
                                                        |                 |
                                                        +-----------------+


The next diagram illustrates how relationships will be coded using the
Relationship concept in the STIX 2.0 specification. The specification states
that relationships between SDOs should be modeled using a Relationship
class. This requires the creation of a Relationship class and subclasses.

                                         +--------------------------------+
                                  +------+   CampaignThreatActors         |
                                  |      +--------------------------------+
                                  |      |                                |
                                  |      |  source: Campaign              |
                                  |      |  target: ThreatActor           |
                                  |      |                                |
                                  |      +--------------------------------+
                                  |
                                  |
    +-------------------+         |      +--------------------------------+
    |     Relationship  | <--------------+  CampaignTargetVulnerability   |
    +-------------------+         |      +--------------------------------+
    |                   |         |      |                                |
    | source: Stix      |         |      | source: Campaign               |
    | target: Stix      |         |      | target: Vulnerability          |
    |                   |         |      |                                |
    +--------+----------+         |      +--------------------------------+
             |                    |
             |                    |      +--------------------------------+
             |                    +------+  CampaignTargetIdentity        |
             |                    |      +--------------------------------+
             |                    |      |                                |
             |                    |      | source: Campaign               |
             |                    |      | target: Identity               |
             |                    |      |                                |
             |                    |      +--------------------------------+
             |                    |
             |                    |      +--------------------------------+
             |                    +------+  CampaignMalware               |
             |                           +--------------------------------+
             |                           |                                |
             |                           | source: Campaign               |
             |                           | target: Malware                |
             |                           |                                |
             |                           +--------------------------------+
             |
             |
             |
             |
             |    has            +----------------------+
             +-----------------> |        Stix          |
                                 +----------------------+
                                 |id: UUID              |
                                 |creator: Identity     |
                                 |created: Timestamp    |
                                 |modified: Timestamp   |
                                 |revoked: boolean      |
                                 |                      |
                                 +-----------+----------+
                                             ^
                                             |
                   +-------------------------+--------------------------------+
                   |                                                          |
                   |                                                          |
                   |                                    +-----------------+   |
                   |                                    |   ThreatActor   | <-+
                   |                                    +-----------------+   |
                   |                                    |                 |   |
                   |                                    +-----------------+   |
                   |                                                          |
+------------------+-------------------------+          +-----------------+   |
|        Campaign                            |          | Vulnerability   | <-+
+--------------------------------------------+          +-----------------+   |
|                                            |          |                 |   |
|                                            |          +-----------------+   |
|                                            |                                |
|                                            |          +-----------------+   |
|                                            |          |  Identity       | <-+
|                                            |          +-----------------+   |
+--------------------------------------------+          |                 |   |
                                                        +-----------------+   |
                                                                              |
                                                        +-----------------+   |
                                                        |  Malware        | <-+
                                                        +-----------------+
                                                        |                 |
                                                        +-----------------+

Notice that for each relationship between SDOs there is a new class to model
that relationship. That means for each tuple of (source SDO, relationship
type, target SDO) there will be a new class. The proposed draft of the spec
already defines 40 relationships for a minimum viable product, and that number
is likely to grow quickly. This also adds overhead for a developer: instead of
adding a property and a couple of methods for getters and setters, they now
need to maintain an entire new class.
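
For comparison, here is a hedged Java sketch of the one-class-per-tuple layout
from the diagram above, limited to two of the four subclasses. The subclass
names come from the diagram; the wrapper class RelationshipModelSketch, the
constructors, and the type field on Relationship are assumptions made for
illustration.

```java
class RelationshipModelSketch {
    // Base class for all SDOs; only the id property is modeled here.
    static class Stix {
        private final String id;
        Stix(String id) { this.id = id; }
        String getId() { return id; }
    }

    static class Campaign extends Stix {
        Campaign(String id) { super(id); }
    }

    static class ThreatActor extends Stix {
        ThreatActor(String id) { super(id); }
    }

    static class Malware extends Stix {
        Malware(String id) { super(id); }
    }

    // Generic relationship: any SDO may be the source or the target.
    static class Relationship {
        private final String type;
        private final Stix source;
        private final Stix target;

        Relationship(String type, Stix source, Stix target) {
            this.type = type;
            this.source = source;
            this.target = target;
        }

        String getType() { return type; }
        Stix getSource() { return source; }
        Stix getTarget() { return target; }
    }

    // One subclass per (source SDO, relationship type, target SDO) tuple.
    static class CampaignThreatActors extends Relationship {
        CampaignThreatActors(Campaign source, ThreatActor target) {
            super("attributed-to", source, target);
        }
    }

    static class CampaignMalware extends Relationship {
        CampaignMalware(Campaign source, Malware target) {
            super("uses", source, target);
        }
    }

    public static void main(String[] args) {
        Relationship r = new CampaignMalware(
            new Campaign("campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f"),
            new Malware("malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33"));
        System.out.println(r.getType() + " " + r.getTarget().getId());
    }
}
```

Each further tuple would add another subclass alongside CampaignThreatActors
and CampaignMalware.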

Another downside is that when a developer reads the code for SDOs they cannot
quickly see what the relationships are between SDOs. They must examine the
packages that contain the Relationship subclasses to determine which
relationships exist between SDOs.

The same is true for the JSON output. It is slower to identify relationships
between objects when they are expressed as top-level Relationship objects,
because the SDOs must be read first and then the Relationships. If the
relationships are embedded in the JSON, then the developer can quickly see the
count and types of relationships for a given SDO. Otherwise the programmer may
have to write scripts to extract that information to help with debugging.