OASIS Mailing List Archives

 



cti message



Subject: Re: [cti] [EXT] Re: [cti] TAXII Pagination


I thought a bit more about this, and after talking with Allan, I think the limit on the server side will probably be variable based on the size of the content.  So we might need to tweak the text and make sure that the max_content_length works in both directions.  

For example, say a server normally limits a client to 500 objects at a time, but some of those objects are really big, like a gigabyte in size. The server may need to dynamically change the limit based on the size of the objects. So a client would always need to check the envelope to see if there are more records.
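
A minimal sketch of what such a size-aware limit could look like on the server side. The function name select_page and the max_bytes budget are hypothetical, not part of any TAXII text; the only point is that the page can end before the object limit is reached, so the client has to rely on the envelope rather than on the count it asked for.

```python
import json


def select_page(objects, max_objects=500, max_bytes=100 * 1024 * 1024):
    """Return (page, more): stop at max_objects or when the byte budget is used up."""
    page, used = [], 0
    for obj in objects:
        size = len(json.dumps(obj).encode("utf-8"))
        # Always send at least one object so the client makes progress.
        if page and (len(page) >= max_objects or used + size > max_bytes):
            break
        page.append(obj)
        used += size
    # "more" is true whenever something was left out of this page.
    return page, len(page) < len(objects)
```

With ten objects of about 121 bytes each and a 250 byte budget, the page holds two objects and more is true, even though the object limit was never reached.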

Thoughts ????

Bret


On Oct 9, 2019, at 11:43 AM, Vargas-Gonzalez, Emmanuelle <emmanuelle@mitre.org> wrote:

All,
 
As I was reading this proposed solution for TAXII Pagination, it occurred to me that a TAXII Server currently has no way of advertising its self-imposed limit for pagination requests. If it did, a client could know the limit ahead of time via the server's api_root resource. This is a different problem from the one originally raised in this thread, but it is related.
 
What I propose is adding a new property called max_limit; the details are below.
 
Property Name | Type | Description
title (required) | string | A human readable plain text name used to identify this API instance.
description (optional) | string | A human readable plain text description for this API Root.
versions (required) | list of type string | The list of TAXII versions that this API Root is compatible with. The values listed in this property MUST match the media types defined in Section 1.6.8.1 and MUST include the optional version parameter. A value of "application/taxii+json;version=2.1" MUST be included in this list to indicate conformance with this specification.
max_content_length (required) | integer | The maximum size of the request body in octets (8-bit bytes) that the server can support. The value of the max_content_length MUST be a positive integer greater than zero. This applies to requests only and is determined by the server. Requests with total body length values smaller than this value MUST NOT result in an HTTP 413 (Request Entity Too Large) response. If, for example, the server supported 100 MB of data, the value for this property would be determined by 100*1024*1024, which equals 104,857,600. This property contains useful information for the client when it POSTs requests to the Add Objects endpoint.
max_limit (required) | integer | The maximum server-imposed limit for pagination requests. The value of the max_limit MUST be a positive integer greater than zero. This only applies to pagination requests made to this API Root. For any request with a limit greater than max_limit, the server will override the requested limit with its self-imposed limit.
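
To make the intended behavior concrete, here is a rough sketch of how a server might apply the proposed max_limit. The api_root values and the effective_limit helper are illustrative assumptions only; in particular, falling back to max_limit when the client sends no limit is my guess, not something stated in the proposal.

```python
# Illustrative API Root resource carrying the proposed max_limit property.
api_root = {
    "title": "Example API Root",
    "versions": ["application/taxii+json;version=2.1"],
    "max_content_length": 100 * 1024 * 1024,  # 100 MB in octets
    "max_limit": 500,
}


def effective_limit(requested, max_limit):
    """Limit the server would apply under the proposed max_limit property."""
    if max_limit <= 0:
        raise ValueError("max_limit MUST be a positive integer greater than zero")
    if requested is None:
        # Assumption: with no client limit, the server uses its own maximum.
        return max_limit
    # A request above max_limit is overridden by the server's limit.
    return min(requested, max_limit)
```

A client that reads max_limit from the API Root can size its requests up front instead of discovering the cap from a short page.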
 
Any thoughts?
 
- Emmanuelle
 
From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> On Behalf Of Allan Thomson
Sent: Friday, October 4, 2019 3:57 PM
To: Matt Pladna <mpladna@lookingglasscyber.com>; Bret Jordan <Bret_Jordan@symantec.com>; cti@lists.oasis-open.org
Subject: [EXT] Re: [cti] TAXII Pagination 
 
+1
 
Allan Thomson
CTO (+1-408-331-6646)
 
From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org> on behalf of Matt Pladna <mpladna@lookingglasscyber.com>
Date: Friday, October 4, 2019 at 12:50 PM
To: Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] TAXII Pagination 
 
Thanks Bret,
 
I like this approach and believe it's a small, flexible change that lets a client consume data in the page sizes they want regardless of what backend the target server uses. 
 
Looking forward to feedback from others.
 
Thanks,
 
From: <cti@lists.oasis-open.org> on behalf of Bret Jordan <Bret_Jordan@symantec.com>
Date: Friday, October 4, 2019 at 15:16
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] TAXII Pagination 
 
All,
 
In TAXII 2.1 we have a pretty good pagination solution, but it suffers from a known issue when multiple records have the same date added value. We originally tried to address this by saying that the date added value MUST have microsecond-level precision. But that is not sufficient for some.  
 
As such, I have been working with Looking Glass on a potential solution that requires the least amount of changes to make this work.  After many back-and-forth versions, I think we have something that might work.  Please review. 
 
 
TAXII Pagination Proposal
 
To keep things simple, for mental visualization, we will be defining the scenarios in terms of small numbers.  But one must realize that in production, these numbers will be many orders of magnitude larger.
 
1 Fundamental Design Goals
  • Completely stateless for the server in the true RESTful sense
  • Simple way for clients to start synchronization after some point in time, without having to sync the entire collection.
    • Example: A collection may have billions of records in it going back 10 years. But a client really only cares about syncing or getting data from the past 6 months.
  • Need ability to paginate records where every record has its own date_added value
  • Need ability to paginate records where many records may have the same date_added value
 
2 Proposed Solution Summary
  • Add a single optional property called "next" (type: string) to the TAXII Envelope
  • Add a URL parameter called "next"
 
3 Scenario 
 
The collection has 200 indicator records; however, the first 100 records all have the same date_added timestamp.
 
3.1 Problem
Our current method breaks if and only if the client has a limit of less than 100 or the server artificially limits the records to less than 100. Under this condition the client will not get all of the records or will have an inconsistent experience. 
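
The failure can be reproduced with a small simulation. This is only a sketch under assumptions: the records live in a Python list, the server filters with a strict "greater than added_after" comparison, and the client resumes from the last date_added it saw. The names RECORDS, page_by_added_after and sync_all are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone

T0 = datetime(2015, 1, 1, tzinfo=timezone.utc)
# 200 records: the first 100 share one date_added, the rest are one second apart.
RECORDS = [
    {"id": i, "date_added": T0 if i < 100 else T0 + timedelta(seconds=i)}
    for i in range(200)
]


def page_by_added_after(added_after, limit):
    """Timestamp-only paging: records strictly newer than added_after."""
    matches = [r for r in RECORDS if r["date_added"] > added_after]
    return matches[:limit]


def sync_all(start, limit):
    """Page through the collection using only added_after and limit."""
    seen, added_after = [], start
    while True:
        page = page_by_added_after(added_after, limit)
        if not page:
            return seen
        seen.extend(page)
        # Resume after the last timestamp seen, as a timestamp-only client would.
        added_after = page[-1]["date_added"]
```

With a limit of 20, the second request skips the other 80 records that share the first timestamp, so the client ends up with 120 of the 200 records. With a limit of 100 or more, all 200 come back.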
 
3.2 Example Initial Request From Client
?added_after=2010-01-01T01:01:01.123456Z&limit=20
 
3.3 Server Processes Query Request
The server queries the datastore for records that match the rest of the request, using a record limit of 21 (the client-provided or server-imposed limit value + 1).
 
  1. The server checks results to see if there are 21 records returned.
    1. If NO then there are no more records that match the query and the TAXII server can send the results in a TAXII envelope to the client
      1. TAXII Envelope "more" property set to "false"
      2. TAXII Envelope "next" property is left empty
    2. If YES then there are more records and the server would respond with the following
      1. TAXII Envelope "more" property set to "true"
      2. TAXII Envelope "next" property set to a string value. For a relational database this could be the index autoID, for Elasticsearch it could be the Scroll ID, for other systems it could be a cursor ID, or it could be any string (or int represented as a string) depending on the requirements of the server and the black magic it is doing in the background. The key is that it is something that the server knows how to deal with and process, and the client only needs to send it back to the server in the next request to get more data. 
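
The steps above can be sketched roughly as follows. This assumes an in-memory list as the datastore and uses a plain offset as the "next" value, which is just one of the options mentioned; build_envelope and DATASTORE are hypothetical names. A real server would pick whatever token suits its backend.

```python
# 45 records that all share the same date_added value.
DATASTORE = [
    {"id": i, "date_added": "2015-01-01T00:00:00.000000Z"} for i in range(45)
]


def build_envelope(datastore, added_after, limit, next_token=None):
    """Fetch limit + 1 records and use the extra one only to detect another page."""
    # Assumption: the opaque "next" value is an offset into the filtered results.
    offset = int(next_token) if next_token else 0
    # Same-format ISO 8601 timestamps compare correctly as strings.
    matches = [r for r in datastore if r["date_added"] > added_after]
    rows = matches[offset:offset + limit + 1]
    envelope = {"more": len(rows) > limit, "objects": rows[:limit]}
    if envelope["more"]:
        envelope["next"] = str(offset + limit)
    return envelope
```

The server keeps no per-client state here; everything it needs to resume is in the request parameters.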
 
3.4 Example Follow On Request From Client
?added_after=2010-01-01T01:01:01.123456Z&limit=20&next=123456789
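
On the client side, the loop might look something like the sketch below. fetch_page is a stand-in for the real HTTP GET and serves fake data; it and sync are hypothetical names. The only part that matters is that the client echoes "next" back unchanged and stops when "more" is false.

```python
def fetch_page(params):
    """Stand-in for an HTTP GET; returns a TAXII-style envelope from fake data."""
    data = list(range(45))
    start = int(params.get("next", 0))
    limit = params["limit"]
    envelope = {
        "more": start + limit < len(data),
        "objects": data[start:start + limit],
    }
    if envelope["more"]:
        envelope["next"] = str(start + limit)
    return envelope


def sync(added_after, limit, fetch=fetch_page):
    """Collect every page by following the envelope's "more" and "next" values."""
    params = {"added_after": added_after, "limit": limit}
    objects = []
    while True:
        envelope = fetch(params)
        objects.extend(envelope.get("objects", []))
        if not envelope.get("more"):
            return objects
        # Echo the opaque token back unchanged; keep the other parameters as they were.
        params["next"] = envelope["next"]
```

Because the client never interprets the token, the server is free to change what it means without breaking clients.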
 
 
If we can verify that this does solve the issue and is still easy to implement (I believe so), this is something that we could do for TAXII 2.1, if the TC agrees.  Yes, it would require another CSD and Public Review, but it would allow us to address this last known issue.
 
 
Thoughts ????
 
Bret


