Re: [provision] Synchronization Use Cases

On Nov 8, 2010, at 10:55 AM, Richard Sand wrote:

Some rough notes from us about the synchronization use cases we’d like to add to the use case discussions.

Richard Sand | CEO
239 Kings Highway East | Haddonfield | New Jersey 08033 | USA
Mobile: +1 267 984 3651| Office: +1 856 795 1722| Fax: +1 856 795 1733

From: Daniel A. Perry
Sent: Friday, October 08, 2010 5:39 PM
To: Richard Sand
Subject: Synchronization Use Cases

Please let me know your comments:

-- Synchronization Use Cases --

General Scenario: A hosting provider and customer need to exchange updates to identity information bi-directionally. The user base being dealt with is assumed to be quite large.

Synchronization Case 1: Customer needs to send a number of updates to the provider. This is trivial to do with the current SPML standard.

Good so far. I assume the Customer would send requests (e.g., AddRequest, ModifyRequest, DeleteRequest, DisableRequest, EnableRequest) to the hosting provider.

Synchronization Case 2: Customer needs to pull changes made to the identities in the hosting provider back into their own systems. When dealing with a large user base, it’s very inefficient to have to search for changes by comparison, and searching for users by modification time can be unreliable. The service provider should maintain a revision number for the user repository, and should support this revision number in a manner that allows only the changed users to be queried.

The Updates Capability that is defined as an optional part of SPMLv2 allows a requester to query for only changed objects. However, the Updates Capability as currently specified relies on the timestamp of the most recent revision to each object (rather than relying on a revision-number as described here).

<history>

We specified timestamp rather than revision-number in order to minimize the burden on the provider and to minimize the coupling between the requester and provider. (Discussion of the possible formats for a version number also caused us to consider such options as having the provider return an opaque token that the requester would persist and then pass in to subsequent requests. An opaque token muddied the semantics of the request. The requirement to persist any token other than a primitive or a character-string increased the coupling between requester and provider.)

A monotonically increasing integer was the only other format for a revision number that was considered to be clearer than a timestamp, but maintaining a monotonically increasing integer-sequence was considered to impose a burden on any provider with a back-end storage mechanism that was distributed or clustered. Timestamp had a further advantage in that, if the requester lost (or could not persist) the most recent return value, a human could reasonably supply an appropriate value.

</history>

The primary advantage of a (monotonically increasing integral) revision-number is granularity. With any timestamp (or with any revision-number) that is not required to be unique, there is always the possibility of two objects having the same revision-timestamp. In that case, the requester must be prepared to find, in an UpdatesResponse, object-changes that it has already seen.
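A minimal sketch of how a requester might cope with non-unique revision-timestamps. Everything here is hypothetical and is not SPMLv2 API: the `Change` record, the `Requester` class, and the `updates_since` stand-in for a provider are invented for illustration. The sketch assumes the provider answers inclusively (changes at or after the requested timestamp), so the requester has to filter out what it has already seen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    object_id: str   # hypothetical identifier of the changed object
    timestamp: int   # revision-timestamp; NOT guaranteed to be unique


def updates_since(log, since):
    """Hypothetical provider query: inclusive, so that ties are never lost."""
    return [c for c in log if since is None or c.timestamp >= since]


class Requester:
    def __init__(self):
        self.last_seen = None      # highest timestamp processed so far
        self.seen_at_last = set()  # object ids already processed at that timestamp

    def apply(self, changes):
        """Return only the changes not seen before and advance the watermark.

        Simplification: two changes to the same object with the same
        timestamp are treated as one.
        """
        fresh = []
        for c in sorted(changes, key=lambda c: c.timestamp):
            if self.last_seen is not None and c.timestamp < self.last_seen:
                continue  # older than the watermark: already handled
            if c.timestamp == self.last_seen and c.object_id in self.seen_at_last:
                continue  # same timestamp, same object: a duplicate
            if self.last_seen is None or c.timestamp > self.last_seen:
                self.last_seen = c.timestamp
                self.seen_at_last = set()
            self.seen_at_last.add(c.object_id)
            fresh.append(c)
        return fresh
```

The cost of the timestamp approach shows up as the extra `seen_at_last` bookkeeping; with a unique revision-number the requester could keep a single integer and query strictly after it.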

Synchronization Case 3: Case 3 is a combination of cases 1 and 2. When doing bi-directional synchronization both cases 1 and 2 will be performed. This use case is an optimization where the synchronization is started with the customer sending a batch of updates to the provider, and also including the revision that it last successfully synchronized with the provider. The provider responds with a list of the changes made since that revision number, and also with the current revision number as per use case 2.

This appears to be a combo-method: in SPMLv2 terms a BatchRequest containing change-requests and an UpdatesRequest asking the Provider to report recent object-changes. Do I understand this correctly?

This optimization appears to save one network-round-trip and the XML processing of one request-response pair. Are there other benefits as well?

These optimizations appear to come at the expense of increased complexity, especially in error-handling. If all goes well, would the set of changes that the provider returns to the requester include those changes that the requester just requested? If there is an error in the BatchRequest that contains change-requests, does the provider still reply with the equivalent of the UpdatesResponse? If so, the requester must carefully examine that Response to determine which requests resulted in error(s), and which requests it must retry subsequently. This would complicate the implementations of both the provider and the requester. (In addition, there are the usual limitations of the BatchRequest with respect to multiple-change-requests and with respect to concurrency. If a subsequent modification requires a PSOID that a previous request returned (i.e., generated or modified), or that a concurrent request modified, that subsequent modification will fail.)

SPMLv2 allows any provider to define custom capabilities (such as, for example, one that includes this combo-method). However, the classic SPMLv2 approach would have been to push changes and to poll for changes *separately*. As I imagine this, a requester might first poll for changes until the requester "caught up" to the provider, in order to minimize his chances of error due to concurrent modification. How important is the optimization of pushing and pulling in a single request-response pair?
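To make the "poll until caught up, then push" sequence concrete, here is a rough sketch against an in-memory stand-in. The `Provider` class, `catch_up`, and `push` are hypothetical names, not SPMLv2 operations, and the sketch assumes (for brevity) that the provider stamps each change with a unique, increasing integer and returns changes in pages.

```python
class Provider:
    """Hypothetical in-memory stand-in for a provider that records changes."""

    def __init__(self):
        self.clock = 0
        self.log = []  # (stamp, object_id, value) in commit order

    def modify(self, object_id, value):
        self.clock += 1
        self.log.append((self.clock, object_id, value))
        return self.clock

    def updates_since(self, stamp, page_size=2):
        # Return at most page_size changes strictly newer than stamp.
        newer = [entry for entry in self.log if entry[0] > stamp]
        return newer[:page_size]


def catch_up(provider, local, stamp):
    """Poll repeatedly until the provider reports nothing newer than stamp."""
    while True:
        page = provider.updates_since(stamp)
        if not page:
            return stamp
        for entry_stamp, object_id, value in page:
            local[object_id] = value
            stamp = entry_stamp


def push(provider, pending):
    """Send each pending change as its own request; return the stamps assigned."""
    return [provider.modify(object_id, value) for object_id, value in pending]


provider = Provider()
provider.modify("alice", "v1")
provider.modify("bob", "v1")
provider.modify("alice", "v2")

local = {}
stamp = catch_up(provider, local, 0)         # pull first
push(provider, [("carol", "v1")])            # then push
stamp = catch_up(provider, local, stamp)     # the pushed change is echoed back
```

In this sketch the second `catch_up` returns the change the requester itself just pushed, which is one answer to the question of whether a requester would see its own changes reported back.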

Synchronization Case 4: Case 4 is a further refinement of use case 3. In case 3 only the provider was keeping track of a revision number. Use case 4 adds a revision number to both sides. This allows for a simple three-message exchange which either party (customer or provider) can initiate to achieve a full synchronization. The message exchange is as follows:
1) Initiating party requests synchronization, and specifies the current revision of the other’s database that they have.
2) Other party responds with all changes since that revision. That same message asks the initiator for all changes since the last applied revision.
3) Initiator responds with all changes since the last revision.
Then both parties can asynchronously process those changes, and only update the current-applied revision number which they store for the other party when those changes are successful. (This model is similar to the retransmit mechanism in TCP.) Finally, the last-revision sent to retrieve all changes since then will need to be an expression. This allows for cases where a few changes fail to apply during synchronization and need to be retried. Next are a few sub-cases dealing with retries:
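The three-message exchange above can be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed wire format: the `Party` class, the dictionary-shaped messages, and the field names `have` and `changes` are all hypothetical, revisions are assumed to be per-party increasing integers, every change is assumed to apply successfully, and conflicting changes to the same object are not resolved.

```python
class Party:
    """Hypothetical participant (customer or provider) with its own revision counter."""

    def __init__(self, name):
        self.name = name
        self.revision = 0      # this party's own current revision
        self.log = []          # (revision, object_id, value) for local changes only
        self.data = {}
        self.peer_applied = 0  # last revision of the PEER's log applied here

    def local_change(self, object_id, value):
        self.revision += 1
        self.log.append((self.revision, object_id, value))
        self.data[object_id] = value

    def changes_since(self, revision):
        return [entry for entry in self.log if entry[0] > revision]

    def apply_remote(self, changes):
        # Remote changes are not re-logged, so they are never echoed back.
        for revision, object_id, value in changes:
            self.data[object_id] = value
            self.peer_applied = revision  # advance only after a successful apply


def synchronize(initiator, responder):
    # 1) Initiator states which revision of the responder's data it has.
    msg1 = {"have": initiator.peer_applied}
    # 2) Responder returns its changes since then and states what it has.
    msg2 = {"changes": responder.changes_since(msg1["have"]),
            "have": responder.peer_applied}
    # 3) Initiator returns its own changes since the responder's revision.
    msg3 = {"changes": initiator.changes_since(msg2["have"])}
    # Each side then applies what it received.
    initiator.apply_remote(msg2["changes"])
    responder.apply_remote(msg3["changes"])
    return msg1, msg2, msg3


customer, provider = Party("customer"), Party("provider")
customer.local_change("alice", "c1")
provider.local_change("bob", "p1")
provider.local_change("carol", "p1")
synchronize(customer, provider)
```

Because `peer_applied` only moves forward after a change is applied, a failed apply would leave it behind and the same changes would be requested again on the next run, which is the behavior described in Case 4a.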

Case 4a) Repeated retries. In this case, the last-applied revision is a simple expression such as “>X”, indicating that the initiating party has applied all changes up to revision X, and wishes to retrieve all changes after X. The issue with this approach is that if a mixture of successful synchronizations and failures occurs after X, all of those events will be sent again, including the ones that already succeeded. For some targets this may cause problems; for others it may not, and it may save the extra effort of keeping track of exactly what failed.

Case 4b) Specified Retries. In this case, the initiating party needs to keep track of exactly which changes applied successfully and which need to be retried. The initiator will request a set of changes like “A-C,E,>G” (requesting that A through C and E be retried, and any new events after G be sent as well), skipping the changes that have been applied successfully. This allows the client to avoid receiving duplicate events from a previous run, but puts the burden of keeping details about exactly what failed on the client.

The responder needs to support either of these syntaxes for querying changes, leaving the decision for “simple with duplicates” vs. “complex with no duplicates” up to the initiating client.
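As a rough sketch of how a responder might evaluate both syntaxes, the following parses an expression such as "1-3,5,>7" and selects matching changes from a log. The grammar is inferred from the examples “>X” and “A-C,E,>G” only; the function names are hypothetical, and revisions are assumed to be integers (the letters in the use case are placeholders).

```python
def parse_revision_expr(expr):
    """Parse e.g. "1-3,5,>7" into (explicit revisions to resend, open lower bound).

    The bound is None when the expression has no ">N" term.
    """
    explicit = set()
    after = None
    for term in expr.split(","):
        term = term.strip()
        if not term:
            continue
        if term.startswith(">"):
            bound = int(term[1:])
            # If several ">N" terms appear, the smallest one covers the rest.
            after = bound if after is None else min(after, bound)
        elif "-" in term:
            low, high = (int(part) for part in term.split("-", 1))
            if low > high:
                raise ValueError(f"empty range: {term!r}")
            explicit.update(range(low, high + 1))
        else:
            explicit.add(int(term))
    return explicit, after


def select_changes(log, expr):
    """Return the log entries (revision first) that the expression asks for."""
    explicit, after = parse_revision_expr(expr)
    return [
        entry for entry in log
        if entry[0] in explicit or (after is not None and entry[0] > after)
    ]


log = [(revision, f"object-{revision}") for revision in range(1, 10)]
retry_and_new = select_changes(log, "1-3,5,>7")  # Case 4b style
everything_after = select_changes(log, ">7")     # Case 4a style
```

The Case 4a form is simply the degenerate expression with a single ">N" term, so one parser can serve both styles.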

This is very interesting. I'd probably have to re-read this a few more times (especially the retry-related sub-cases) to make sure that I understand it fully, but speaking broadly it sounds like another combo-method that requires the requester to support the Updates Capability as if that requester were a provider.

Could the updates from both sides possibly overlap? Wouldn't an orderly and systematic application of object-changes be simpler if one system or the other maintained initiative?

SPMLv2 allows any provider to define custom capabilities (such as, for example, one that includes this mega-combo-method). However, the classic SPMLv2 approach would have been for the Customer to poll for changes from the hosting provider and the hosting provider to poll for changes from the Customer *separately*. Each takes a turn acting as requester and requesting Updates from the other (who acts as provider). This seems like four messages rather than the three you describe (assuming that the initiator's "response" in step 3 requires no confirmation).

Are we doing something in Synchronization Case 4 that SPMLv2 could not, or is optimization the benefit? If optimization is the benefit, how much simpler or faster is this kind of approach (and at what cost of complexity in operational semantics and in implementation)?

Gary
