
cti-stix message



Subject: Re: [cti-stix] Object ID format


Jerome,

Actually, the existing TAXII architecture makes proxying harder. I was looking forward toward a future when TAXII is just a slim varnish over plain-old-HTTP.


That said, rewriting domain names in URL-as-ID fields in XML content shouldn't be difficult at all. We just need to stick to absolute URLs.
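As a rough illustration of why absolute URLs make this easy, here is a minimal Python sketch of the host rewrite a proxy might do on a URL-as-ID. The function name rewrite_id_host and the *.example hosts are made up for illustration; nothing here comes from STIX or TAXII.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_id_host(url_id: str, old_host: str, new_host: str) -> str:
    """Swap the host of an absolute URL-as-ID, leaving path and query alone."""
    parts = urlsplit(url_id)
    if not parts.scheme or not parts.netloc:
        # Relative references cannot be rewritten safely, hence "absolute URLs only".
        raise ValueError(f"not an absolute URL: {url_id!r}")
    if parts.hostname != old_host:
        return url_id  # some other producer's ID; leave it untouched
    # Keep an explicit port if there is one (any user:password part is dropped
    # in this sketch).
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(rewrite_id_host("http://producer.example/indicator/1234",
                      "producer.example", "proxy.example"))
```

A relative ID such as /indicator/1234 raises ValueError here, which is the point of sticking to absolute URLs: the rewrite never has to guess what base the ID was relative to.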


JSA




From: Jerome Athias <athiasjerome@gmail.com>
Sent: Wednesday, January 20, 2016 10:22 PM
To: Terry MacDonald
Cc: John Anderson; cti-stix@lists.oasis-open.org
Subject: Re: [cti-stix] Object ID format
 
I share your concerns so thanks for the highlights 
I wonder what confidence level I would give to anonymous information 
I would prefer producer name as mandatory with option for a documented "anonymous" value

Regarding the proxy (or broker) idea, I would have to review the TAXII documentation/architecture 

On Thursday, 21 January 2016, Terry MacDonald <terry@soltra.com> wrote:

Hi John,

 

The wish of some producers to be anonymous is the reason there can’t be a mandatory producer’s domain name (or URL). It has to be optional. Hence section 2.

 

Cheers

 

Terry MacDonald

Senior STIX Subject Matter Expert

SOLTRA | An FS-ISAC and DTCC Company

+61 (407) 203 206 | terry@soltra.com

 

 

From: John Anderson
Sent: Thursday, 21 January 2016 1:02 PM
To: Terry MacDonald <terry@soltra.com>; cti-stix@lists.oasis-open.org
Subject: Re: Object ID format

 

Terry, [footnotes in brackets at the bottom]

What about actual URLs? (Not URIs [1].) Let's actually use http: and make our Resources resolvable via the Web. (And maybe even let Google index some of them!)

 

I love your idea about having meaningful IDs. This screams for Cool URLs [2]. Like this one: http://www.discovery.com/tv-shows/mythbusters/videos/juicing-a-tomato-with-explosives/ (I mean, who wouldn't want to see vegetables go kaboom?!) 

 

The same hide-behind-a-proxy idea of yours would still work...with the added benefit that the proxy could be a real proxy server and could actually...um...proxy the Resource. Maybe we could leverage existing technology that has already solved this problem. [3]

 

Also, you raised an interesting question: Where to go for more context about an object? With URL-as-ID, then you could simply hack bits and see what else is around there.  (Think: "If I have that previous Cool URL, then what do I get when I hack off the juicing-a-tomato-with-explosives bit?") Why can't we apply hackable URLs in our context as well? [4]

 

tl;dr - Are there any insurmountable technical obstacles to using URLs for IDs?

JSA

 

[1] https://www.w3.org/TR/uri-clarification/#contemporary - URIs, URLs, and URNs: Clarifications and Recommendations 1.0, Section 2.1 "Contemporary View"

[2] https://www.w3.org/Provider/Style/URI.html - Sir Tim Berners-Lee's advice on the subject.

[3] http://httpd.apache.org/docs/2.2/mod/mod_proxy.html - Apache...proxying content worldwide since 1995!

[4] This was somewhat of a trick question. The experienced web architect will point out that the context question isn't really solved by hackable URLs. What we really need are Link Relations. Curious? Read Fielding Chapter 5. 😊





From: cti-stix@lists.oasis-open.org <cti-stix@lists.oasis-open.org> on behalf of Terry MacDonald <terry@soltra.com>
Sent: Wednesday, January 20, 2016 6:47 PM
To: cti-stix@lists.oasis-open.org
Subject: [cti-stix] Object ID format

 

Hi list,

 

I understand that at the F2F there was a proposal to remove namespacing from the STIX ID, so that it would become [construct-type]:[UUID]. This has me a little concerned, as the ID has more of an impact than many people realize. As such, I’ve typed up a wee* little** discussion email to outline my concerns.

 

1.      It reduces the chances that object libraries will work
-------------------------------------------------------------------------

John Wunder and I had been talking about enabling organizations to share easy-to-remember 'library' IDs. It would be cool if everyone could refer to the single Lockheed Martin kill chain by the ID 'lmco.com-kill-chain', or to an attack pattern by 'mitre.org-capec-217'.

 

The problem with the [construct-type]:[UUID] ID format is that it precludes that usage. People will instead be forced to use a randomly generated UUID, and each implementation will need to remember what that is and translate the horrible UUID to a human-readable thing. SNMP OIDs anyone? (yuck)

 

A different option would be to allow IDs of the type [construct-type]:[anytext], but that then has a huge problem that there will likely be collisions due to the fact many people think similarly, and will reuse object IDs. This could in turn be fixed by introducing a namespace of some type to constrain the collisions to a small domain controlled by one organization… which leads us back to the [producer-domain-name]:[construct-type]:[anytext] mentioned in the TWIGS proposal.
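To make the two ID shapes concrete, here is a minimal Python sketch that parses both the proposed [construct-type]:[UUID] form and the namespaced [producer-domain-name]:[construct-type]:[anytext] form. The ObjectId type, the parse_object_id function, and the sample IDs are assumptions made for illustration, not part of any STIX draft.

```python
import uuid
from typing import NamedTuple, Optional


class ObjectId(NamedTuple):
    producer: Optional[str]  # None for the un-namespaced form
    construct_type: str
    local_id: str


def parse_object_id(raw: str) -> ObjectId:
    parts = raw.split(":")
    if len(parts) == 2:
        construct_type, local_id = parts
        uuid.UUID(local_id)  # raises ValueError if the second part is not a UUID
        return ObjectId(None, construct_type, local_id)
    if len(parts) == 3:
        # Namespaced form: free text is allowed because the producer's
        # domain name confines any collisions to that producer.
        return ObjectId(*parts)
    raise ValueError(f"unrecognized ID format: {raw!r}")


a = parse_object_id("lmco.com:kill-chain:default")
b = parse_object_id("mitre.org:kill-chain:default")
print(a != b)  # same free text, different namespace, so no collision
```

Without the producer part, both of those IDs would have been kill-chain:default and would have collided, which is the failure mode described above.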

 

Being pragmatic about it there are a few options we could use to enable the object library idea to work:

a.      With the suggested ID format [construct-type]:[UUID]

In this option, we just let the implementation worry about it, and go around asking MITRE, Lockheed Martin and others to produce a list of useful objects, and we provide a list of their Object IDs centrally. We will need to maintain a list of commonly used library objects, similar to the IANA list of common port numbers or the IANA list of MIME types. Implementers then just use this list to provide the common library objects within their implementations to all their users.

 

This puts more onus on the implementers to provide the same names to the users in their UI, mapping the actual Object IDs to real human-readable names, and it means the name of a library object may differ between implementations, which could be problematic for users. We can mitigate that last point by providing a ‘suggested name’ as part of the list of Object IDs.

 

b.      With the suggested ID format  [construct-type]:[UUID] and an alternate ID format of lib:[construct-type]:[anytext]

It might be useful to allow a special type of ID to be used for library objects? This could get messy very quickly.

 

c.       With the ID format [producer_domain_name]:[construct-type]:[UUID]

This is untenable for Organizations that wish to remain anonymous.

Terry's Recommendation: I believe that option a can work if we provide a list of library objects, and we provide their recommended names in a centrally controlled JSON file provided by OASIS or a nominated third party, in a similar way to what IANA does for TCP/UDP port numbers.
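A minimal sketch of what such a centrally controlled JSON file and its use might look like, in Python. The file layout, the field names id and suggested_name, and the UUIDs are all placeholders invented for illustration; no such registry exists in the thread.

```python
import json

# Hypothetical registry content; the UUIDs below are placeholders, not real IDs.
REGISTRY_JSON = """
[
  {"id": "kill-chain:11111111-1111-4111-8111-111111111111",
   "suggested_name": "lmco.com-kill-chain"},
  {"id": "attack-pattern:22222222-2222-4222-8222-222222222222",
   "suggested_name": "mitre.org-capec-217"}
]
"""


def load_registry(text: str) -> dict:
    """Map each Object ID to its suggested human-readable name."""
    return {entry["id"]: entry["suggested_name"] for entry in json.loads(text)}


def display_name(object_id: str, registry: dict) -> str:
    # Fall back to the raw ID for objects that are not in the library list.
    return registry.get(object_id, object_id)


registry = load_registry(REGISTRY_JSON)
print(display_name("kill-chain:11111111-1111-4111-8111-111111111111", registry))
```

Because every implementation reads the same suggested_name, users would see the same label for a library object regardless of which tool they use.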

 

2.      Consumers need to know who to ask for more information
---------------------------------------------------------------------------------

Sharing relationship objects enables a consumer to recognize that there is an indirect connection between two objects, and that they are related, even if the consumer does not have access to the intermediate object itself. This feature allows government CERT types to share just indicators and relationships, and to keep the threat actor that joins them together a secret (apart from the Object ID). This in turn allows the consumers to know that if they see one indicator on their network, they really need to look for the other indirectly related indicators associated with that threat actor.

 

This feature also works well for bandwidth conservation. Rather than push out a full selection of relationship objects and their matching target objects, implementers will be potentially able to only push out the relationship objects, and the consumers will be able to request more information when they want it or need it. In many cases consumers only want to know that the objects are related – they don’t care why.

 

To cope with these two scenarios, TWIGS provided the ability for the producers to only send out relationships, and for consumers to ask for more information about those objects if they wanted it. The producer could always say no, but the option was always there.

 

Having the producer’s domain name within the ID ensured that there is a way for the consumer to know who to ask for more information about the object being discussed. This is most important when a consumer receives only relationship objects (edges), and no data objects (nodes). In this scenario we need a way for the relationship object to always contain the domain name of someone who is ultimately able to provide the object to the consumer if the consumer asks for it (and only if the producer allows them to have it).

 

I am fine with the producer’s domain name information being separated out from the ID field, but it really needs to be available somewhere else right next to the object ID if we are going to provide consumers a way to ask for more information easily. This will mean we will need to include a producer_domain_name field in the relationship object for the source_ref and for the target_ref objects to ensure that this data is transmitted and is available to the consumers.
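A minimal Python sketch of a relationship object carrying the producer's domain name next to each reference. Only source_ref, target_ref and the idea of a producer_domain_name come from this email; the exact field names source_producer_domain_name and target_producer_domain_name, the who_to_ask helper, and the sample values are assumptions for illustration.

```python
from typing import Optional, TypedDict


class Relationship(TypedDict):
    id: str
    source_ref: str
    source_producer_domain_name: Optional[str]  # hypothetical field name
    target_ref: str
    target_producer_domain_name: Optional[str]  # hypothetical field name


def who_to_ask(rel: Relationship, missing_ref: str) -> Optional[str]:
    """Return the domain to ask about an object the consumer does not hold."""
    if missing_ref == rel["source_ref"]:
        return rel["source_producer_domain_name"]
    if missing_ref == rel["target_ref"]:
        return rel["target_producer_domain_name"]
    raise KeyError(f"{missing_ref!r} is not referenced by {rel['id']!r}")


rel: Relationship = {
    "id": "relationship:33333333-3333-4333-8333-333333333333",
    "source_ref": "indicator:44444444-4444-4444-8444-444444444444",
    "source_producer_domain_name": "cert.example",
    "target_ref": "threat-actor:55555555-5555-4555-8555-555555555555",
    "target_producer_domain_name": "cert.example",
}
print(who_to_ask(rel, rel["target_ref"]))
```

A consumer that received only this edge, and neither node, still learns which domain to approach for the threat-actor object.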

 

For normal operation (a producer_domain_name is set), the idea works like this:

·         The producer_domain_name field contains a string representing the domain name of the producer.

·         The consumer would perform a DNS lookup against the DNS server that is authoritative for the producer's domain name, and request a TAXII service record.

·         The returned TAXII Service record would in turn point to the location of the TAXII Server for that organization.

·         The consumer's TAXII server would then connect directly to the producer's TAXII server and request the object ID using TAXII Query.
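The steps above can be sketched in Python as follows. This is only an outline under stated assumptions: the DNS lookup and the TAXII Query are passed in as callables rather than implemented, and the service record name _taxii._tcp.<domain> is a guess made for illustration, not something this email or a TAXII specification of the time defines.

```python
from typing import Callable, Optional, Tuple

# Both callables stand in for real network code.
SrvResolver = Callable[[str], Optional[Tuple[str, int]]]   # record name -> (host, port)
ObjectFetcher = Callable[[str, int, str], Optional[dict]]  # (host, port, object ID) -> object


def request_object(object_id: str,
                   producer_domain_name: str,
                   resolve_srv: SrvResolver,
                   fetch_by_id: ObjectFetcher) -> Optional[dict]:
    # Steps 1 and 2: look up the (assumed) TAXII service record for the producer.
    target = resolve_srv(f"_taxii._tcp.{producer_domain_name}")
    if target is None:
        return None  # the producer advertises no TAXII server
    # Step 3: the record points at the producer's TAXII server.
    host, port = target
    # Step 4: ask that server for the object; it may still refuse (None).
    return fetch_by_id(host, port, object_id)


# Usage with in-memory stand-ins for DNS and TAXII.
fake_dns = {"_taxii._tcp.cert.example": ("taxii.cert.example", 443)}
fake_store = {"threat-actor:1": {"id": "threat-actor:1", "title": "example"}}
obj = request_object("threat-actor:1", "cert.example",
                     fake_dns.get,
                     lambda host, port, oid: fake_store.get(oid))
print(obj)
```

Injecting the resolver and fetcher keeps the flow testable and leaves the producer free to say no, since a refusal is just a None from the fetcher.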

I am assuming that the groups who didn’t want the producer domain name contained in the ID are the military/government intel types who wanted a way to remain anonymous. This is an important concern if we are actually going to get government -> private industry sharing to take place. As I see it, there are two options that allow requests of objects by ID while still allowing the producer to remain anonymous:

a.      If the producers wish to remain anonymous and still get sightings sent directly to them:

This is a modification of what I proposed in the ‘STIX is difficult’ document I produced for Soltra. It would require the producer_domain_name to ALWAYS be sent along with the ID, so that the producer is known. If a producer wishes to be anonymous, they would need to use a ‘proxy’ organization to ‘hide behind’. The proxy organization would replace the original producer’s domain name with their domain name, and would forward on the translated assertion, so that everyone else in the community would see the information coming from the proxy organization. If anyone replies with a sighting they can reply to the group, or directly to the proxy org. The Proxy org will then reverse the process, translating the proxied object ID back to the original object ID and will then send it on to the anonymous original producer.

This has a massive plus in that the consumers always know who to contact to request more information about an object, so that if they have only received a relationship, they can ask for the objects that the relationship refers to by directly requesting it from the producer (as they know the domain name and can do a TAXII lookup). For clarity this would use a direct TAXII query to the producer’s TAXII server to get the Object.

This has a massive downside that the producer must always use a proxy organization. Organizations have a hard enough time setting up a single TAXII server at present.
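A minimal Python sketch of the translation the proxy organization would perform under option a. The AnonymizingProxy class, the sighting_of field, and the choice to mint a fresh UUID for the proxied ID are assumptions for illustration; the email only says that the proxy swaps the domain name and later reverses the process.

```python
import uuid


class AnonymizingProxy:
    def __init__(self, proxy_domain: str):
        self.proxy_domain = proxy_domain
        self._original = {}  # proxied ID -> (original domain, original ID)

    def outbound(self, obj: dict) -> dict:
        """Rewrite an assertion so that it appears to come from the proxy."""
        construct_type = obj["id"].split(":", 1)[0]
        proxied_id = f"{construct_type}:{uuid.uuid4()}"
        self._original[proxied_id] = (obj["producer_domain_name"], obj["id"])
        return {**obj, "id": proxied_id, "producer_domain_name": self.proxy_domain}

    def inbound(self, sighting: dict):
        """Reverse the mapping; return (real producer domain, rewritten sighting)."""
        domain, original_id = self._original[sighting["sighting_of"]]
        return domain, {**sighting, "sighting_of": original_id}


proxy = AnonymizingProxy("proxy.example")
published = proxy.outbound({"id": "indicator:abc", "producer_domain_name": "secret.example"})
domain, forwarded = proxy.inbound({"sighting_of": published["id"], "count": 1})
print(published["producer_domain_name"], domain, forwarded["sighting_of"])
```

The community only ever sees proxy.example, while the mapping table held by the proxy is what lets sightings find their way back to the original producer.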

b.      If the producers wish to remain anonymous and only see sightings sent to the whole community:

In this scenario the producer_domain_name field is OPTIONAL, although when available it is sent along with the ID.  If a producer wishes to be anonymous, they would simply need to omit the producer_domain_name field when they publish their assertion. The consumers have no idea who the producer is and are unable to ask them directly for more information. Any relationships asserted to the whole community will be able to be seen by the original producer if they are a community member.

 

Thinking about it, we could still enable the consumers to ask the community/group for the ID, by creating a STIX Request (broadcast to the community) that asks for a particular Object ID. Allowing this feature will ensure that the original anonymous producer is able to view the request (along with everyone else in the community/group), and if they want to, will be able to either send the STIX Response directly to the requester (unicast/decloaks the producer) or broadcast out the STIX Response with the Object to the community (broadcast/producer remains anonymous). In either case the consumer still gets what they want (the information), yet the producer can remain anonymous if they want.

Note - Allowing STIX Request/Response to do Object ID requests/replies is partially replicating the function provided by TAXII Query by Object ID, but has an important distinction – it is indirect and broadcast-based. It is analogous to a proxy-arp. It is not something that TAXII Query would be able to do as TAXII Query only works with single client to single server communication. We would need STIX requests/response functionality if we are going to be able to allow consumers to request an Object by ID indirectly using broadcasts out to the whole sharing community/group.
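A toy Python simulation of the broadcast-based request/response idea, to show the difference between the unicast and broadcast replies. Everything here (the Member and Community classes, the message fields, the in-memory inboxes) is invented for illustration; STIX Request/Response is only a proposal in this email and has no defined wire format here.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    objects: dict = field(default_factory=dict)  # Object ID -> object held
    stay_anonymous: bool = True
    inbox: list = field(default_factory=list)


class Community:
    def __init__(self, members):
        self.members = list(members)

    def broadcast(self, message: dict) -> None:
        # Everyone in the group sees the message; no sender is attached.
        for member in self.members:
            member.inbox.append(message)

    def request_object(self, requester: Member, object_id: str) -> None:
        self.broadcast({"type": "request", "object_id": object_id,
                        "reply_to": requester.name})
        for member in self.members:
            obj = member.objects.get(object_id)
            if obj is None:
                continue
            response = {"type": "response", "object": obj}
            if member.stay_anonymous:
                self.broadcast(response)  # producer stays hidden
            else:
                # Unicast reply: the producer reveals itself to the requester.
                requester.inbox.append({**response, "from": member.name})


gov = Member("gov", objects={"threat-actor:1": {"id": "threat-actor:1"}})
bank = Member("bank")
Community([gov, bank]).request_object(bank, "threat-actor:1")
print(bank.inbox[-1])
```

In the anonymous case the response reaches the requester with no "from" field, so the consumer gets the object without learning who holds it.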

If a producer does include a producer_domain_name field, then the consumer is able to follow the normal process described above, and is able to directly request the Object via a TAXII Query by Object ID.

Option b will work irrespective of where the data is encapsulated within the object structure. It was mentioned to me that Gary Katz had suggested extracting the producer’s domain_name field into a separate ‘exports’ data structure within the message – independent of the object itself. This could work, but I would discourage it. In my opinion all data about an object should be kept with that object. It’s the concept of tight cohesion, loose coupling. It makes it far easier to ‘move’ the object around as a whole complete thing rather than having bits of the object strewn across the structure. I believe that the producer_domain_name should live right next to the ID.

 

Terry's Recommendation: I believe that option b represents the most flexible option for both anonymous and conspicuous producers.  It allows an indirect, broadcast-based request for Object by ID/response containing Object so that anonymous producers can still communicate, yet allows the standard process of direct unicast TAXII Queries to occur.

 

Cheers

 

Terry MacDonald

Senior STIX Subject Matter Expert

SOLTRA | An FS-ISAC and DTCC Company

+61 (407) 203 206 | terry@soltra.com

 

--

* Not so wee

** Not so little

 

 


