Subject: Embedded Relationships
I wanted to start a debate on the relationship modeling in the STIX 2.0 proposal. I would like to propose switching all of the relationships to embedded relationships, as opposed to creating Relationship classes. I'm new to the group, so I don't know whether you have already discussed this, but I wanted to bring up some of my thoughts.

I feel that the relationships should all be embedded for three primary reasons.

1. There are two ways to define relationships

   You can define a new relationship by creating a Relationship with a custom type, and you can define a new relationship by adding a custom property to an SDO. It is unclear which approach to choose. For example, say I have a new relationship where Campaigns "foo" Malware. I could add a property to the Campaign class called "x_example_com_foo" that references a Malware, or I could define a new Relationship with a source of Campaign, a target of Malware, and a type of "foo".

2. Performance

   I believe there will be a performance hit for the common use case of displaying the properties of an SDO. Appendix A below includes some analysis showing what I believe to be a 2n penalty for modeling with top-level Relationship classes.

3. Maintainability of code and JSON

   I believe the code will be easier to maintain if we use embedded relationships. Appendix B includes some class diagrams for how the system will likely be modeled. Supporting Relationship subclasses will likely require a lot more classes and code.

I know this is a long email and a big proposed change, but I would appreciate some feedback on this. This was the biggest area of concern for me while reading through the spec.

-Nate

P.S. The ASCII art below requires a fixed-width font to display accurately. I hope it appears correctly for you. Otherwise I can try creating the diagrams some place else.

----------------------------------
Appendix A
----------------------------------

This document compares the common use case of displaying information to an analyst using two different object modeling methods. There is the graph modeling approach proposed by the CTI TC and the embedded relationship approach that I am proposing.

The use case should be a very common scenario. There will likely be a graphical user interface, most likely a web interface, that is used to display information about campaigns, threat actors, malware, etc. to analysts. The analysts will want this web interface to view those objects and click on links to related objects, like wanting to view more details of the malware used by a campaign.

Each modeling approach (graph and embedded) has pseudocode written to demonstrate the code for displaying information about a campaign. The expected output is below. While the output below is simply text based, the process is the same for HTML display. The only difference is that everything will be wrapped in HTML tags before printing. In the interest of brevity the HTML tags were omitted.

-------------
Expected Output
-------------

Campaign: campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f
Name: Green Group attacks against France
Description: Green Group launched a phishing attack against France heads of state
First Seen: 2016-04-06T20:03:00.000Z
Last Seen: 2016-04-06T20:03:00.000Z
Threat Actors:
  threat_actor--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f
Identities Targeted:
  identity--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f
  identity--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
Vulnerabilities Targeted:
  vulnerability--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
Malware Used:
  malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33
  malware--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f

Here is what the code will look like for a graph-based object model like the one proposed in STIX 2.0.

    void printCampaign(campaignId) {
        Campaign c = repo.getCampaign(campaignId)
        List<Relationship> relationships = repo.getRelationships(c)

        print "Campaign: %s\n", c.getId()
        print "Name: %s\n", c.getName()
        print "Description: %s\n", c.getDescription()
        print "First Seen: %s\n", c.getFirstSeen()
        print "Last Seen: %s\n", c.getLastSeen()

        // categorize each relationship
        threatActors = new List<Relationship>()
        identitiesTargeted = new List<Relationship>()
        vulnerabilitiesTargeted = new List<Relationship>()
        malwareUsed = new List<Relationship>()
        for r in relationships
            if r.getType() == "attributed-to" && r.getTarget() instanceof ThreatActor
                threatActors.add(r)
            else if r.getType() == "targets" && r.getTarget() instanceof Identity
                identitiesTargeted.add(r)
            else if r.getType() == "targets" && r.getTarget() instanceof Vulnerability
                vulnerabilitiesTargeted.add(r)
            else if r.getType() == "uses" && r.getTarget() instanceof Malware
                malwareUsed.add(r)

        // print relationships, one categorized list per section
        print "Threat Actors:\n"
        for r in threatActors
            print "  %s\n", r.getTarget().getId()
        print "Identities Targeted:\n"
        for r in identitiesTargeted
            print "  %s\n", r.getTarget().getId()
        print "Vulnerabilities Targeted:\n"
        for r in vulnerabilitiesTargeted
            print "  %s\n", r.getTarget().getId()
        print "Malware Used:\n"
        for r in malwareUsed
            print "  %s\n", r.getTarget().getId()
    }

Notice how the code must iterate over the relationships twice. The first loop categorizes each relationship. They must be categorized so the interface can display the correct label. In an HTML view they will likely also be categorized in order to add icons, HTML sections that can be hidden and revealed, and other features that make the user experience better. Then each relationship is visited a second time when its category is printed. This means that using a graph to model the relationships will have a runtime of 2*n, where n is the number of relationships.
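
To make the two passes concrete, here is a minimal runnable Java sketch of the graph-model flow described above. It is only an illustration under stated assumptions: the Relationship record, the GraphCampaignView class, and the trick of reading the target's object type from the ID prefix are hypothetical names and shortcuts made up for this example, not anything defined by the STIX 2.0 draft. It stands in for the repository lookup with a plain list and assumes Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a top-level Relationship object (IDs only).
record Relationship(String sourceId, String type, String targetId) {
    // Assumption: IDs look like "<object-type>--<uuid>", as in the expected output.
    String targetType() {
        int i = targetId.indexOf("--");
        return i < 0 ? "" : targetId.substring(0, i);
    }
}

class GraphCampaignView {
    // Pass 1: sort every relationship into the display section it belongs to.
    static Map<String, List<String>> categorize(List<Relationship> relationships) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        sections.put("Threat Actors", new ArrayList<>());
        sections.put("Identities Targeted", new ArrayList<>());
        sections.put("Vulnerabilities Targeted", new ArrayList<>());
        sections.put("Malware Used", new ArrayList<>());
        for (Relationship r : relationships) {
            switch (r.type() + "/" + r.targetType()) {
                case "attributed-to/threat_actor" -> sections.get("Threat Actors").add(r.targetId());
                case "targets/identity" -> sections.get("Identities Targeted").add(r.targetId());
                case "targets/vulnerability" -> sections.get("Vulnerabilities Targeted").add(r.targetId());
                case "uses/malware" -> sections.get("Malware Used").add(r.targetId());
                // Unrecognized pairs are dropped silently; the compiler cannot
                // catch a misspelled type string here.
                default -> { }
            }
        }
        return sections;
    }

    // Pass 2: walk the categorized lists again to produce the labeled output.
    static String render(Map<String, List<String>> sections) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> section : sections.entrySet()) {
            out.append(section.getKey()).append(":\n");
            for (String id : section.getValue()) {
                out.append("  ").append(id).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String campaign = "campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f";
        List<Relationship> relationships = List.of(
            new Relationship(campaign, "attributed-to", "threat_actor--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f"),
            new Relationship(campaign, "targets", "identity--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f"),
            new Relationship(campaign, "targets", "vulnerability--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33"),
            new Relationship(campaign, "uses", "malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33"));
        System.out.print(render(categorize(relationships)));
    }
}
```

Each relationship is touched once in categorize and once more in render, which is the 2*n cost discussed above.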

The code below shows the same view with the relationships embedded in the objects themselves.

    void printCampaign(campaignId) {
        Campaign c = repo.getCampaign(campaignId)

        print "Campaign: %s\n", c.getId()
        print "Name: %s\n", c.getName()
        print "Description: %s\n", c.getDescription()
        print "First Seen: %s\n", c.getFirstSeen()
        print "Last Seen: %s\n", c.getLastSeen()

        // print relationships; each getter returns the related objects directly
        print "Threat Actors:\n"
        for t in c.getThreatActors()
            print "  %s\n", t.getId()
        print "Identities Targeted:\n"
        for i in c.getIdentitiesTargeted()
            print "  %s\n", i.getId()
        print "Vulnerabilities Targeted:\n"
        for v in c.getVulnerabilitiesTargeted()
            print "  %s\n", v.getId()
        print "Malware Used:\n"
        for m in c.getMalwareUsed()
            print "  %s\n", m.getId()
    }

This code will have a runtime of n, where n is the number of relationships. This means that embedding the relationships will run roughly twice as fast as the graph approach for this step, because the relationships do not need to be categorized. This code is easier to read as well.

One other advantage of the embedded model is that it contains fewer potential runtime errors. The graph model loops over each item in the set of relationships and performs a comparison to categorize the relationship. This creates possible errors that will only be detected at runtime. For example, the code has a check for type == "uses". If the programmer had mistyped "uses" as "yses", then no malware relationships would be displayed, and the code would still compile and run. If the programmer instead mistyped c.getMalwareUsed(), a compile-time error would catch the mistake, provided they were using a type-safe language like Java.

----------------------------------
Appendix B
----------------------------------

This document compares the code for STIX objects using two different approaches.
The first approach codes relationships using embedded relationships, while the second approach codes relationships using a Relationship class as specified in the proposed STIX 2.0 specification.

Below is a UML class diagram showing some of the relationships between a Campaign and other STIX Domain Objects (SDOs). There is the base Stix class, and all SDOs are subclasses of it. The Campaign class is shown with four of its relationships. Not all relationships were shown in the interest of brevity. The diagram shows there is an "attributed-to" relationship between Campaigns and ThreatActors. The nice aspect of this approach is that a developer can clearly and quickly see in the Campaign class which other SDOs it has relationships with, because there are attributes in the class that declare them. For each relationship the code simply needs to add an attribute to an SDO class and provide getters and setters for retrieving and modifying that relationship.

                                 +-----------------------+
                                 |         Stix          |
                                 +-----------------------+
                                 |id: UUID               |
                                 |creator: Identity      |
                                 |created: Timestamp     |
                                 |modified: Timestamp    |
                                 |revoked: boolean       |
                                 |                       |
                                 +-----------+-----------+
                                             ^
                                             |
                   +-------------------------+--------------------------------+
                   |                                                          |
                   |                                                          |
                   |                     attributed-to  +-----------------+   |
                   |                    +-------------> |   ThreatActor   | <-+
                   |                    |               +-----------------+   |
                   |                    |               |                 |   |
                   |                    |               +-----------------+   |
                   |                    |                                     |
+------------------+--------------------+----+ targets  +-----------------+   |
|                  Campaign                  +--------> |  Vulnerability  | <-+
+--------------------------------------------+          +-----------------+   |
|                                            |          |                 |   |
| threatActors: List<ThreatActor>            |          +-----------------+   |
| vulnerabilityTargets: List<Vulnerability>  |                                |
| identityTargets: List<Identity>            | targets  +-----------------+   |
| malwareUsed: List<Malware>                 +--------> |    Identity     | <-+
|                                            |          +-----------------+   |
+---------------------------------------+----+          |                 |   |
                                        |               +-----------------+   |
                                        |                                     |
                                        |     uses      +-----------------+   |
                                        +-------------> |     Malware     | <-+
                                                        +-----------------+
                                                        |                 |
                                                        +-----------------+

The next diagram illustrates how relationships will be coded using the Relationship concept in the STIX 2.0 specification. The specification states that relationships between SDOs should be modeled using a Relationship class. This requires the creation of a Relationship class and subclasses.

                                         +--------------------------------+
                                  +------+      CampaignThreatActors      |
                                  |      +--------------------------------+
                                  |      |                                |
                                  |      | source: Campaign               |
                                  |      | target: ThreatActor            |
                                  |      |                                |
                                  |      +--------------------------------+
                                  |
                                  |
    +-------------------+         |      +--------------------------------+
    |   Relationship    | <--------------+  CampaignTargetVulnerability   |
    +-------------------+         |      +--------------------------------+
    |                   |         |      |                                |
    | source: Stix      |         |      | source: Campaign               |
    | target: Stix      |         |      | target: Vulnerability          |
    |                   |         |      |                                |
    +--------+----------+         |      +--------------------------------+
             |                    |
             |                    |      +--------------------------------+
             |                    +------+     CampaignTargetIdentity     |
             |                    |      +--------------------------------+
             |                    |      |                                |
             |                    |      | source: Campaign               |
             |                    |      | target: Identity               |
             |                    |      |                                |
             |                    |      +--------------------------------+
             |                    |
             |                    |      +--------------------------------+
             |                    +------+        CampaignMalware         |
             |                           +--------------------------------+
             |                           |                                |
             |                           | source: Campaign               |
             |                           | target: Malware                |
             |                           |                                |
             |                           +--------------------------------+
             |
             |
             |
             |        has        +----------------------+
             +-----------------> |         Stix         |
                                 +----------------------+
                                 |id: UUID              |
                                 |creator: Identity     |
                                 |created: Timestamp    |
                                 |modified: Timestamp   |
                                 |revoked: boolean      |
                                 |                      |
                                 +-----------+----------+
                                             ^
                                             |
                   +-------------------------+--------------------------------+
                   |                                                          |
                   |                                                          |
                   |                                    +-----------------+   |
                   |                                    |   ThreatActor   | <-+
                   |                                    +-----------------+   |
                   |                                    |                 |   |
                   |                                    +-----------------+   |
                   |                                                          |
+------------------+-------------------------+          +-----------------+   |
|                  Campaign                  |          |  Vulnerability  | <-+
+--------------------------------------------+          +-----------------+   |
|                                            |          |                 |   |
|                                            |          +-----------------+   |
|                                            |                                |
|                                            |          +-----------------+   |
|                                            |          |    Identity     | <-+
|                                            |          +-----------------+   |
+--------------------------------------------+          |                 |   |
                                                        +-----------------+   |
                                                                              |
                                                        +-----------------+   |
                                                        |     Malware     | <-+
                                                        +-----------------+
                                                        |                 |
                                                        +-----------------+

Notice that for each relationship between SDOs there is a new class to model that relationship. That means for each tuple of (source SDO, relationship type, target SDO) there will be a new class. In the proposed draft of the spec there are already 40 relationships defined for a minimally viable product. That number is likely to grow quickly. It also adds more overhead for a developer, because instead of adding a property and a couple of methods for getters and setters they now need to maintain an entire new class.

Another downside is that when a developer reads the code for SDOs they cannot quickly see what the relationships are between SDOs. They must examine the packages that contain the Relationship subclasses to determine which relationships exist between SDOs.

A similar property is true for the JSON output. It is slower to identify relationships between objects when they are held in top-level Relationship objects. The SDOs are read first and then the Relationships are read. If the relationships are embedded in the JSON, then the developer can quickly analyze the count and types of relationships for a given SDO. Otherwise the programmer may have to develop some scripts to extract that information to help with debugging.
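
As a rough illustration of the embedded approach from the first diagram, here is a minimal Java sketch. It is not a proposed implementation: the record names mirror the class names in the diagram, the component names mirror the Campaign attributes (threatActors, identityTargets, vulnerabilityTargets, malwareUsed), and the EmbeddedCampaignView class and the added name component are hypothetical. The common Stix base properties other than id are left out, and Java 16 or later is assumed for records.

```java
import java.util.List;

// Hypothetical, stripped-down SDOs: only the id is modeled.
record ThreatActor(String id) {}
record Identity(String id) {}
record Vulnerability(String id) {}
record Malware(String id) {}

// Each relationship is one typed component on the source SDO,
// so the relationships are visible in the class declaration itself.
record Campaign(String id,
                String name,
                List<ThreatActor> threatActors,
                List<Identity> identityTargets,
                List<Vulnerability> vulnerabilityTargets,
                List<Malware> malwareUsed) {}

class EmbeddedCampaignView {
    // Single pass: every related object is visited once, already categorized.
    static String render(Campaign c) {
        StringBuilder out = new StringBuilder();
        out.append("Campaign: ").append(c.id()).append('\n');
        out.append("Name: ").append(c.name()).append('\n');
        out.append("Threat Actors:\n");
        for (ThreatActor t : c.threatActors()) {
            out.append("  ").append(t.id()).append('\n');
        }
        out.append("Identities Targeted:\n");
        for (Identity i : c.identityTargets()) {
            out.append("  ").append(i.id()).append('\n');
        }
        out.append("Vulnerabilities Targeted:\n");
        for (Vulnerability v : c.vulnerabilityTargets()) {
            out.append("  ").append(v.id()).append('\n');
        }
        out.append("Malware Used:\n");
        // Misspelling malwareUsed() here would be a compile-time error.
        for (Malware m : c.malwareUsed()) {
            out.append("  ").append(m.id()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Campaign c = new Campaign(
            "campaign--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
            "Green Group attacks against France",
            List.of(new ThreatActor("threat_actor--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f")),
            List.of(new Identity("identity--a32e2d2b-17d4-4cbf-938f-98ee46b3cd3f")),
            List.of(new Vulnerability("vulnerability--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33")),
            List.of(new Malware("malware--b42e2d2b-17d4-4cbf-938f-98ee46b3cd33")));
        System.out.print(render(c));
    }
}
```

Adding a new relationship in this style means adding one component to Campaign and one loop to the view, rather than a new Relationship subclass.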