virtio-dev message

Subject: Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support

From: Babis Chalios <bchalios@amazon.es>
To: "Michael S. Tsirkin" <mst@redhat.com>
Date: Thu, 2 Nov 2023 14:42:23 +0100

On 2/11/23 12:51, Michael S. Tsirkin wrote:


On Thu, Nov 02, 2023 at 12:38:28PM +0100, Babis Chalios wrote:

On 2/11/23 12:20, Michael S. Tsirkin wrote:

On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:

On 27/9/23 23:47, Michael S. Tsirkin wrote:

On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:

On 22/9/23 18:01, Michael S. Tsirkin wrote:

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:

On 22/9/23 17:06, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:

On 19/9/23 12:01, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:

Resending to fix e-mail formatting issues (sorry for the spam)

On 18/9/23 18:30, Babis Chalios wrote:

Yes, that's what the driver does now in the RFC patch.
However, this just
decreases
the race window, it doesn't eliminate it. If a third
leak event happens it
might not
find any buffers to use:

1. available buffers to queue 1-X
2. available buffers to queue X


3. poll queue X
4. used buffers in queue X       <- leak event 1 will
use buffers in X
5. avail buffers in queue X
6. poll queue 1-X                <- leak event 2 will
use buffers in 1-X
7. used buffers in queue 1-X
8. avail buffers in queue 1-X
                                         <- leak event 3 (it
needs buffers in X, race with step 5)
9. goto 3

I don't get it. we added buffers in step 5.

What if the leak event 3 arrives before step 5 had time to
actually add the
buffers in X and make
them visible to the device?

Then it will see a single event in 1-X instead of two events.  A leak is
a leak though, I don't see does it matter how many triggered.

So the scenario I have in mind is the following:

(Epoch here is terminology that I used in the Linux RFC. It is a value
maintained by random.c
that changes every time a leak event happens).

1. add buffers to 1-X
2. add buffers to X
3. poll queue X
4. vcpu 0: get getrandom() entropy and cache epoch value
5. Device: First snapshot, uses buffers in X
6. vcpu 1: sees used buffers
7. Device: Second snapshot, uses buffers in 1-X
8. vcpu 0: getrandom() observes new  epoch value & caches it
9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
has not yet finished adding new buffers).
10. vcpu 1 adds new buffer in X
11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.


In this succession of events, when the third snapshot will happen, the
device won't find
any buffers in either queue, so it won't increase the RNG epoch value. So,
any entropy
gathered after step 8 will be the same across all snapshots. Am I missing
something?

Cheers,
Babis

Yes but notice how this is followed by:

12. vcpu 1: sees used buffers in 1-X

Driver can notify getrandom I guess?

It could, but then we have the exact race condition that VMGENID had,
userspace has already consumed stale entropy and there's nothing we
can do about that.

Although this is indeed a corner case, it feels like it beats the purpose
of having the hardware update directly userspace (via copy on leak).

How do you feel about the proposal a couple of emails back? It looks to
me that it avoids completely the race condition.

Cheers,
Babis

It does. The problem of course is that this means that e.g.
taking a snapshot of a guest that is stuck won't work well.

That is true, but does it matter? The intention of the proposal
is that if it is not safe to take snapshots (i.e. no buffers in the
queue) don't take snapshots.

I have been thinking of adding MAP/UNMAP descriptors for
a while now. Thus it will be possible to modify
userspace memory without consuming buffers.
Would something like this solve the problem?

I am not familiar with MAP/UNMAP descriptors. Is there
a link where I can read about them?

Cheers,
Babis

Heh no I just came up with the name. Will write up in a couple
of days, but the idea is that driver does get_user_pages,
adds buffer to queue, and device will remember the address
and change that memory on a snapshot. If there are buffers
in the queue it will also use these to tell driver,
but if there are no buffers then it won't.

That sounds like a nice mechanism. However in our case the page
holding the counter that gets increased by the hardware is a kernel
page.

The reason for that is that things other than us (virtio-rng) might
want to notify for leak events. For example, I think that Jason
intended to use this mechanism to periodically notify user-space
PRNGs that they need to reseed.

Cheers,
Babis

Now I'm lost.
when you write, e.g.:
4. vcpu 0: get getrandom() entropy and cache epoch value
how does vcpu access the epoch?

The kernel provides a user space API to map a pointer to the epoch
value. User space then caches its value and checks it every time it
needs to make sure that no entropy leak has happened before using
cached kernel entropy.

virtio-rng driver adds a copy on leak command to the queue for
increasing this value (that's what we are speaking about in this thread).
But other systems might want to report "leaks", such as random.c
itself.

Cheers,
Babis

This idea would be fine but I don't see how it's so different
from VMGENID. Care to explain?

It is different in that the memory is owned by the guest kernel, not
the hardware. In this case, random.c maintains it. This allows, systems
other than the hardware, e.g. virtio-rng to notify about entropy leak
events. For example, random.c itself can periodically do that (I think
Jason had that use-case in mind).

No sure I understand. Example?


Sure!

Take for example a user-space PRNG that uses getrandom() for seeding.
We would like that this PRNG knows when it is not safe any more to return
new random bits, because the initial seed is not unique any more.
In the Linux patch, we use the copy-on-leak command to do that. What the
copy command does in this case, is increase the value of a memory location.
That memory location is maintained by random.c. random.c allows user
space to mmap the page that stores this value. A user-space PRNG would
then mmap this value and observe it for changes.

Now, there is no reason why only virtio-rng should inform user space PRNGs
about entropy leak events. For example, the kernel PRNG (which is handling
out seeds for the user-space PRNG) might want to mark randomness that it

returned earlier as no longer valid (for reasons unrelated to VMs). WithVMGENID

this would not be possible. However, now that random.c owns that piece of
memory, more than one entities can use it to notify user space about stale

entropy. And they can work together (details on how we do that are inthe cover

letter of the LKML patch), we can have both the virtio-rng and the random.c
updating that value.

I hope that makes sense and I am really happy to discuss all this, butdo you

think it is related with the VirtIO protocol?

Cheers,
Babis

The fact that we as well mmap that memory to the user-space is just
for giving user space a mechanism that allows it to know when it needs
to reseed its PRNGs. We _could_ have done the same with VMGENID,
but the fact that, in this case, the underlying physical memory is owned
by the ACPI device blocks us from letting other systems sending
notifications
as well.

Maybe it makes sense to have this discussion in the patch we sent for
Linux on LKML [1]?

Cheers,
Babis

[1] https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#mb1242999d8296169d9a4ee1a0805005633ec146a


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org

References:
- Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  - From: "Michael S. Tsirkin" <mst@redhat.com>
- Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  - From: Babis Chalios <bchalios@amazon.es>
- Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  - From: "Michael S. Tsirkin" <mst@redhat.com>