Subject: Re: [RFC 2/2] spec/vhost-user spec: Add IOMMU support
On 2017-04-12 17:26, Peter Xu wrote:
> On Wed, Apr 12, 2017 at 04:54:25PM +0800, Jason Wang wrote:
>> On 2017-04-12 15:17, Peter Xu wrote:
>>> On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
>>>> On 04/11/2017 03:20 PM, Peter Xu wrote:
>>>>> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
>>>>> [...]
>>>>>>>> +slave is expected to reply with a zero payload, non-zero otherwise.
>>>>>>> Is this ack mechanism really necessary? If not, not sure it'll be nice to keep vhost-user/vhost-kernel aligned on this behavior. At least that'll simplify vhost-user implementation on QEMU side (iiuc even without introducing new functions for update/invalidate operations).
>>>>>> I think this is necessary, and it won't complicate the vhost-user implementation on QEMU side, since it is already widely used (see reply-ack feature).
>>>>> Could you provide file/function/link pointer to the "reply-ack" feature? I failed to find it myself.
>>>> This reply-ack mechanism is used to obtain a behaviour closer to the kernel backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it is blocked in the write() while the message is being processed in the kernel. With the user backend, QEMU is unblocked from the write() as soon as the backend has read the message, before it is processed.
>>> I see. Then I agree with you that we may need a synchronized way to do it. One thing I think of is IOMMU page invalidation - it should be a sync operation to make sure that all the related caches were destroyed when the invalidation command returns in QEMU vIOMMU emulation path.
>> Looks not, if I understand correctly. E.g. for Intel IOMMU, when QI is enabled, this could be done asynchronously by not waiting for the completion through a wait descriptor. Vhost-kernel always implements the invalidation as a synchronous one for simplicity, but it looks like this is not needed.
> IMHO, the point is that the guest cannot reuse that IOVA unless it sends an invalidation wait descriptor.
> Without a wait descriptor, the guest should never release any IOVA range; if it did, that would be dangerous, because the cache may still be dirty on that range on a specific device. And since the guest will for sure use a wait descriptor (as long as it wants to reuse IOVA addresses), we will probably end up needing a way to synchronously invalidate the IOTLB, including for vhost-user backends.
Yes, what I mean is that technically we can implement this only for the wait descriptor.
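
To make the idea concrete, here is a rough Python sketch, not the actual vhost-user wire format or QEMU code. Invalidations are sent without asking for a reply, and only the wait-descriptor request sets a need_reply flag and blocks until the backend acks with a zero payload. Because the socket delivers requests in order, the ack also covers every earlier invalidation. The message layout, the request codes and all function names are assumptions made up for this illustration.

```python
import socket
import struct
import threading

# Hypothetical message layout (NOT the real vhost-user format):
# request type, need_reply flag, IOVA.
MSG = struct.Struct("<IIQ")
ACK = struct.Struct("<I")
INVALIDATE, WAIT, STOP = 1, 2, 3  # made-up request codes


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return buf


def backend(sock, iotlb):
    """Toy backend: handle requests in order, ack only when asked to."""
    while True:
        req, need_reply, iova = MSG.unpack(recv_exact(sock, MSG.size))
        if req == INVALIDATE:
            iotlb.discard(iova)
        if need_reply:
            # Zero payload means success; sent only after processing.
            sock.sendall(ACK.pack(0))
        if req == STOP:
            return


def invalidate(sock, iova):
    """Asynchronous invalidation: returns once the message is queued."""
    sock.sendall(MSG.pack(INVALIDATE, 0, iova))


def wait_descriptor(sock):
    """Synchronous barrier: block until the backend acks this request."""
    sock.sendall(MSG.pack(WAIT, 1, 0))
    (status,) = ACK.unpack(recv_exact(sock, ACK.size))
    return status


def demo():
    qemu_side, backend_side = socket.socketpair()
    iotlb = {0x1000, 0x2000, 0x3000}  # IOVAs cached by the toy backend
    worker = threading.Thread(target=backend, args=(backend_side, iotlb))
    worker.start()
    invalidate(qemu_side, 0x1000)
    invalidate(qemu_side, 0x2000)
    status = wait_descriptor(qemu_side)
    # The backend is now idle in recv(), so reading its cache is safe.
    remaining = set(iotlb)
    qemu_side.sendall(MSG.pack(STOP, 0, 0))
    worker.join()
    qemu_side.close()
    backend_side.close()
    return status, remaining
```

Calling demo() returns the ack status and the backend's remaining cache entries. Once wait_descriptor() has returned, both invalidated IOVAs are gone from the backend cache, which is the point at which the guest could safely reuse them.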