RE: [pkcs11] Async Operations

Hi Darren,

Iâve posted an updated version of the proposal (v5) that, hopefully, addresses the memory management concerns. Thank you very much for the feedback. I really appreciate it!

Sincerely,

Jonathan

From: JOHNSON Darren <darren.johnson@thalesgroup.com>
Sent: Thursday, February 23, 2023 1:01 PM
To: Jonathan Schulze-Hewett <schulze-hewett@infoseccorp.com>; pkcs11@lists.oasis-open.org
Subject: RE: [pkcs11] Async Operations - comments

Hi Jonathan

sorry for the delayed response.

Most of you responses pointed out my mistakes and clarified a lot of details.

However I have more follow up questions related to buffer management/ownership.

Thanks

Darren

From: Jonathan Schulze-Hewett <schulze-hewett@infoseccorp.com>
Sent: Friday, February 17, 2023 2:35 PM
To: JOHNSON Darren <darren.johnson@thalesgroup.com>; pkcs11@lists.oasis-open.org
Subject: RE: [pkcs11] Async Operations - comments

Hi Darren,

Thanks for the detailed feedback. It is much appreciated.

Tokens must support synchronous sessions. We wanted to be clear that tokens must continue to support the current way of doing things and not switch to only supporting async sessions.

[DJ] sorry. I completely misread that sentence. I read âTokens must support asynchronous sessionsâ, which didnât make any sense, hence why I asked the question J

I agree 100%. Tokens MUST continue to support the existing functionality.

Thanks

The input to C_AsyncComplete is CK_ASYNC_DATA_PTR so C_AsyncComplete can update any of the fields in the structure without them needing to be pointers. That being said, by the same logic, CK_OBJECT_HANDLE_PTR should just be CK_OBJECT_HANDLE. Iâll update.

[DJ] right. Again, I completely misread that. I was reading way too many specs that dayâ

Its CK_ASYNC_DATA_PTR, so it is updatable.

Thanks

The original buffer refers to the buffer supplied to whatever function was called. I think itâs always a buffer input to a function to contain the âoutput valueâ (i.e., from C_SignFinal, C_Encrypt, etc.). I view this as a buffer input to the function, but youâre seeing it as a buffer to contain output. Iâm happy to call it whatever makes it clearer to everyone. The values in CK_ASYNC_DATA are not input values. They are exclusively output values to be filled in by the C_AsyncComplete function. ulValue and pValue are for the callerâs convenience so that the caller does not have to keep track of the buffer. If C_AsyncComplete returns CKR_PENDING what happens to the CK_ASYNC_DATA structure is undefined. If anyone has language that would help improve the clarify of this Iâll add it in.

[DJ] Right. Now that I understand things that makes sense.

First, I agree. this is an input parameter to the function, and it receive the output/result. The confusion here was mostly based on my misreading the spec.

What isnât clear in the spec is that this buffer is held by the provider and that when C_AsyncComplete is called, the buffer is populated with the response and the pResult->pValue is assigned the pointer to that buffer.

I didnât fully understand that until I read your response to (7) below. Maybe this should be added to an extended explanation on how to interpret CKR_PENDING. Or a dedicated section on how asynchronous operations work as a whole.

The second part of your response, you are saying the application doesnât need to keep track of the buffer. pValue and ulValue are for the callers convenience. I think that makes sense now.

How does this convenience part work if I decide I need to call C_AsyncGetId for whatever reason? If the application hasnât tracked that pointer, why would it if we state it would be returned later, it means the application canât get it back. So an application MUST decide up front if it will ever call C_AsyncGetId and use that to decide if it needs to track the buffer or not?

For the last question you asked above, I think a simple statement that when CKR_PENDING is returned, no data will be written to the CK_ASYNC_DATA structure. Or inversely, until an asynchronous operation is complete, no data will be written to the memory pointed to by the CK_ASYNC_DATA _PTR. For example,

C_AsyncComplete checks if the function identified by pFunctionName has completed an asynchronous operation and, if so,

returns the associated result(s). hSession is the sessionâs handle; pFunctionName is the name of the function whose state is

being queried; pResult is a pointer to a structure to receive the result(s) if the function has completed. If the operation has

not completed, CKR_PENDING is returned and the location pointed to by pResult is not modified.

Iâm not the best spec writer, so others may be able to further fix it up. Or we can perfect it when it is incorporated in to the final spec.

And based on comments above and below, maybe the pBuffer is always returned?

The intent of the red text is to cover a reboot or restart of the calling application. To enable an application to start an operation, quit, and then check on it later. We did not have a discussion around authentication requirements and itâs up to the module as to what restrictions to place on the session trying to join. We have left the meaning of the ID up to the particular module. My goal was to specify it to the point where it was possible to implement the functionality and leave the details to the implementers.

[DJ] makes sense. Iâm not sure how, but it might be worth highlighting that this survives a call to C_Finalize and can be resumed after C_Initialize.

I think it would be relatively easy to define some rules related to authentication requirements, but it would probably clash with how various vendors implement authentication/authorization. I agree to leave it as it is.

Indeed. Iâll fix.
pData and ulData are input to C_AsyncJoin so that they can be returned in the future C_AsyncComplete call. They can be NULL if the original function, say C_GenerateKey pair (which is what got me into this mess in the first place), doesnât take a buffer. Something stating that implementerâs should carefully consider whether or not to use the supplied buffers as the caller may quit on them seems reasonable to me. Anyone care to suggest some language?

[DJ] Now that I understand how this works, that makes sense.

I donât think we need to make a statement about the implementers not using the buffer. The whole model of these APIs is that the buffer is passed to the library and remains in the libraries control. However I think we should define when ownership of the buffer is returned back to the calling application. That part is not necessarily clear. I think if we just make it clear that when an operation is being processed asynchronously, the output buffer is under the control of the provider until a few different possible events happen:

C_AsyncComplete returns any other error code than CKR_PENDING. Iâm saying this in part because of how you described handling BUFFER_TOO_SMALL later in the email. If the application must resize the buffer and recall the previous API, then it must also receive the original pointer back so that it can be freeâd.
until C_AsycnGetID is called and returns CKR_OK.
C_SessionCancel is called with the appropriate flags.
C_CloseSession is called.
C_Finalize is called. If that is made clear, then I donât think we should allow the application to arbitrarily delete memory that it has passed to the library.

In most of these scenarios, there is no obvious way for the application to get the pointer back from the library.

Alternatively, if we are worried about this, then the APIs should probably be restructured so that the memory is not left under the providers control. To be honest, I donât see what value there is in passing the buffer to the library and getting the pointer back later. If the library isnât supposed to use it until the operation is complete, that removes the benefit for the library. And application that is using asynchronous APIs will need to track the operation. So tracking buffers along with the operation is not going to be any additional complexity and would probably be a very natural thing for the application to do. Another thing, an application that allocates memory, passes the address to a library and then just âforgetsâ about it causes many problems when it comes to code quality and static analysis tools. And finally, it is inconsistent that an application may provide multiple response pointers (i.e. pointer for the data and a pointer for the length of the data), yet only one pointer is held by the library. I understand why you make that distinction, but it still results in a bit of inconsistency.

As in 6, pData and ulData are input to C_AsyncJoin so that they can be returned in the future C_AsyncComplete call.

[DJ] Right. thatâs clear now. thanks.

The big thing that was confusing to me at first is the whole C_AsyncComplete NOT taking input values. My first example had C_AsyncComplete returning CKR_BUFFER_TOO_SMALL and then being called again with a properly sized buffer, but thatâs not how it would work (thank you Dieter for explaining my own proposal to me). If C_AsyncComplete returns CKR_BUFFER_TOO_SMALL, the application needs to call the original function again with a properly sized buffer (C_SignFinal then C_AsyncComplete then C_SignFinal again with proper buffer then C_AsyncComplete).

[DJ], this dynamic certainly needs to be better explained in the spec.

To me, if an API returns CKR_BUFFER_TOO_SMALL. the obvious thing to do would be to call the same API with the corrected buffer size. That is how most (all?) the other APIs work. But in this case, we are saying no, you need to go back and call the previous API you called. I can see many reasons why this could make sense as they ARE going back to the actual API for the operation. But at the same time it opens more questions. Does the exact input data need to be provided as well? Maybe it does if we compare it to the non-asynchronous flow use for EncryptUpdate when the buffer is too small.

Iâm not saying we should change it. Iâm just saying that in my opinion (right or wrong) it seems like a deviation on how all the other APIs work so it should be well documented.

So when CKR_BUFFER_TOO_SMALL is returned, as I suggested in (6) aboveâ does that mean the originally provided buffer is still returned? If we are expecting application to NOT keep that pointer around, then the provider would have to return it. Otherwise the application would never be able to free it. The whole idea that the provider will keep the buffer and return it, and when it returns it should be well explained as well.

Iâll update the doc with changes for 2 and 5 above and with any changes arising from the next email on the subject 😊 If anyone has suggestions for language on 3 and 6 please send it!

Sincerely,

Jonathan

From: pkcs11@lists.oasis-open.org <pkcs11@lists.oasis-open.org> On Behalf Of JOHNSON Darren
Sent: Wednesday, February 15, 2023 9:35 AM
To: pkcs11@lists.oasis-open.org
Subject: [EXT][pkcs11] Async Operations - comments

THIS MESSAGE COMES FROM AN EXTERNAL SOURCE. PLEASE VERIFY THE CONTENTS OF THIS MESSAGE BEFORE PROCEEDING.

I missed all the discussions around this proposal. So I apologize if Iâm asking questions that were already discussed and resolved.

in section 4.b, it states âIf the token does not support asynchronous operations, it should return CKR_SESSION_ASYNC_NOT_SUPPORTED. Tokens must support synchronous sessions. Tokens may support asynchronous sessions and may return CKR_PENDING if the token determines that the operation will take a long time to conclude.â .

I assume the first part, âTokens must support synchronous sessionsâ is going to be removed? Tokens may support async sessions.

In CK_ASYNC_DATA, the field ulValue is defined as CK_ULONG. But shouldnât it be CK_ULONG_PTR and pulValueLen as this is going to receive the data and is expected to follow the conventions for all functions returning data?

In CK_ASYNC_DATA, pValue is defined as âon completion contains a pointer to the original input buffer, caller is responsible for freeing this memoryâ.

Is this supposed to say âoriginal input bufferâ? Or should it be âoriginal output bufferâ?

Whether the operation is âon completionâ or âstill processingâ, this value should always âpoints to the location that receives the resultâ. The state of the operation is unknown to the caller, and should not have an impact on the value of âpValueâ, Right?

On completion, the result is copied there.

Do we really require that it is the âsameâ buffer? Does that imply that we are assuming/expecting or just allowing providers to make use of that buffer during the operation? Which may make sense for memory constrained environments. and may also make sense if a token is streaming result data out and filling this buffer as it progresses.

Or should we be more strict and not have this requirement for the âsameâ buffer, which means the provider should not use any of the applications memory. This would be problematic in many cases I think

Regardless, we should clearly state who is the owner of that buffer while the operation is being processed asynchronously.

âIfâ it must be the same buffer, should we really require it to be re-provided? Should we require any of the output pointers to be re-provided? What is the reason for them to be re-provided?

In the summary/introduction, it states âreestablish the application/module connection with respect to the asynchronous operationâ. To be clear, this feature is only covering the case where a session is closed, and the operation is joined/connected to a different session; be it a newly opened session or an existing session. All of this is within the context of single successful calls to C_Initailize->C_Finalize . There is no expectation for this to span successful calls to C_Finalize->C_Initalize, right?

Looking back at the meeting minutes, I see that this update to the proposal is supposed to cover a reboot. Is that true? So an application can be restarted, and attempt to join the operation?

Was there any discussion around authentication requirements? About which role/user initiated the operation vs what role/user is joining the operation? I would assume that only the same role (and user if a user was authenticated) could resume the operation. Or is the âIDâ value essentially treated like an authorization token and grants rights for any role/user to join the operation? Whatever the answer is, it should be documented.

What about the current roleâs/userâs ability to access and use the key that is currently being used for the operation. Without requiring the same role/user that was used to initiate the operation, we get in to discussions around escalation.

C_AsyncGetID description use âpResultâ instead of âpulIDâ.

C_AsyncGetID. Maybe this is just a providerâs problem to deal with and may feed in to their decision to allow this call to succeed or not. Related to the questions about the result buffer, the wording above suggests that a provider would be able to use the result buffer when handling asynchronous operationsâ because the call to C_AsyncComplete must pass in the same pointer. So then after a successful call to C_AsyncGetID , the original memory pointed to by the result point (be it form the original call, or any following calls to C_AsyncComplete) must no longer be referenced by the provider.

This may not be obvious to all readers and should probably be mentioned.

C_AsyncJoin; the field ulValue is defined as CK_ULONG. But shouldnât it be CK_ULONG_PTR and pulValueLen as this is going to receive the data and is expected to follow the conventions for all functions returning data? Also, if no output is returned, isnât it better to set the pulDataLen value to 0?

This question may be ignored as it may be addressed by the next question below.

Why should C_AsyncJoin need to provide pData and ulData? From what I understand, the application will need to call C_AsyncComplete, right? so the output buffer will be provided via CK_ASYNC_DATA? Requiring pData and ulData during the C_AsyncJoin would only be necessary if we think the provider will need to make use of that memory while processing the operation. But the operation has gone through a C_AsyncGetID, so it must already have decoupled itself from any application memory space and is presumably still processing

Shouldnât C_AsyncJoin pass CKA_ASYNC_DATA as a parameter so that all of the output locations are provided and not just the buffer?

It seems to me that C_AsyncJoin should just join the operation followed by C_AsyncComplete, which will provide all the output pointers/buffers and attempt to complete the operation.

Thanks

Darren

pkcs11 message