virtio-comment message

Subject: Re: [RFC 2/2] spec/vhost-user spec: Add IOMMU support

From: Jason Wang <jasowang@redhat.com>
To: Peter Xu <peterx@redhat.com>
Date: Thu, 13 Apr 2017 15:12:15 +0800



On 2017-04-12 17:26, Peter Xu wrote:

On Wed, Apr 12, 2017 at 04:54:25PM +0800, Jason Wang wrote:

On 2017-04-12 15:17, Peter Xu wrote:

On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:

On 04/11/2017 03:20 PM, Peter Xu wrote:

On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:

[...]

+slave is expected to reply with a zero payload, non-zero otherwise.

Is this ack mechanism really necessary? If not, it would be nice to
keep vhost-user and vhost-kernel aligned on this behavior. At least
that would simplify the vhost-user implementation on the QEMU side
(IIUC, even without introducing new functions for the
update/invalidate operations).

I think this is necessary, and it won't complicate the vhost-user
implementation on the QEMU side, since it is already widely used (see
the reply-ack feature).

Could you provide a file/function/link pointer to the "reply-ack"
feature? I failed to find it myself.

This reply-ack mechanism is used to obtain behaviour closer to that of
the kernel backend. Indeed, when QEMU sends a vhost_msg to the kernel
backend, it is blocked in write() while the message is being processed
in the kernel. With the user backend, QEMU is unblocked from write()
once the backend has read the message, before it has been processed.
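To make the difference concrete, here is a minimal sketch of the idea in Python. It is not the QEMU or vhost-user implementation: the three-field header, the flag bit values, the request number and all function names are assumptions chosen for illustration. The master only returns from a send that asks for a reply once the slave has processed the message and answered with a zero payload.

```python
import socket
import struct
import threading

HDR = struct.Struct("<III")    # request, flags, payload size (simplified, assumed layout)
REPLY = 1 << 2                 # assumed flag bit: this message is a reply
NEED_REPLY = 1 << 3            # assumed flag bit: sender wants an ack
IOTLB_INVALIDATE = 100         # hypothetical request number


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return buf


def slave_serve_one(sock, log):
    """Read one message, process it, then ack if the master asked for it."""
    request, flags, size = HDR.unpack(recv_exact(sock, HDR.size))
    payload = recv_exact(sock, size)
    log.append((request, payload))          # "processing" happens before the ack
    if flags & NEED_REPLY:
        status = 0                          # zero payload means success
        sock.sendall(HDR.pack(request, REPLY, 8) + struct.pack("<Q", status))


def master_send(sock, request, payload, need_reply):
    """Send a message; with need_reply, block until the slave's ack arrives."""
    flags = NEED_REPLY if need_reply else 0
    sock.sendall(HDR.pack(request, flags, len(payload)) + payload)
    if not need_reply:
        return None                         # returns without waiting for processing
    _, _, size = HDR.unpack(recv_exact(sock, HDR.size))
    (status,) = struct.unpack("<Q", recv_exact(sock, size))
    return status


def demo():
    master, slave = socket.socketpair()
    master.settimeout(5)                    # avoid hanging forever if the slave fails
    log = []
    worker = threading.Thread(target=slave_serve_one, args=(slave, log))
    worker.start()
    status = master_send(master, IOTLB_INVALIDATE, b"\x00" * 16, need_reply=True)
    processed_before_return = len(log) == 1
    worker.join()
    master.close()
    slave.close()
    return status, processed_before_return
```

Calling `demo()` returns the ack status together with a flag telling whether the slave had already processed the message when the master resumed, which is the kernel-like behaviour described above.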

I see. Then I agree with you that we may need a synchronized way to do
it. One case I can think of is IOMMU page invalidation: it should be a
synchronous operation, to make sure that all the related caches have
been destroyed when the invalidation command returns in the QEMU
vIOMMU emulation path.

It looks like not, if I understand correctly. E.g. for the Intel IOMMU,
when QI is enabled, this can be done asynchronously by not waiting for
completion through a wait descriptor. Vhost-kernel always implements the
invalidation as a synchronous one for simplicity, but it looks like this
is not needed.

IMHO, the point is that the guest can reuse that IOVA only if it sends
an invalidation wait descriptor. Without a wait descriptor, the guest
should never release any IOVA range; doing so would be dangerous,
because the cache may still be dirty for that range on a specific
device.

And since the guest will surely use a wait descriptor (as long as it
wants to reuse IOVA addresses), we will eventually need a way to
synchronously invalidate the IOTLB, including for vhost-user backends.

Yes, what I mean is that technically we can implement this only for the
wait descriptor.
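As a rough illustration of that idea, the toy model below posts invalidations asynchronously and synchronizes only on a wait descriptor. It is a sketch under stated assumptions, not Intel IOMMU or vhost code: the class, its methods and the set used as a backend cache are all hypothetical.

```python
class QueuedInvalidation:
    """Toy model: invalidations are queued; only a wait descriptor synchronizes."""

    def __init__(self, backend_cache):
        # Hypothetical stand-in for the IOTLB entries cached by a backend.
        self.backend_cache = backend_cache
        self.pending = []

    def post_invalidate(self, iova_page):
        """Queue an invalidation and return immediately (asynchronous)."""
        self.pending.append(iova_page)

    def post_wait(self):
        """Wait descriptor: complete every queued invalidation before returning."""
        for page in self.pending:
            self.backend_cache.discard(page)
        flushed = len(self.pending)
        self.pending.clear()
        return flushed
```

In this model, after `post_invalidate(0x1000)` the page is still present in the backend cache, so the guest must not reuse that IOVA yet; only once `post_wait()` has returned is the entry guaranteed to be gone.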


Thanks
