OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [PATCH v3 0/2] virtio: introduce STOP status bit



å 2021/11/24 äå12:19, Eugenio Perez Martin åé:
On Tue, Nov 23, 2021 at 12:33 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
On Thu, Nov 18, 2021 at 05:49:11PM +0100, Eugenio Perez Martin wrote:
On Thu, Nov 18, 2021 at 3:45 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
On Thu, Nov 11, 2021 at 07:58:10PM +0100, Eugenio PÃrez wrote:
This patch introduces a new status bit STOP. This can be used by the
driver to stop the device in order to safely fetch used descriptors
status, making sure the device will not fetch new available ones.

Its main use case is live migration, although it has other orthogonal
use cases. It can be used to safely discard requests that have not been
used: in other words, to rewind available descriptors.
This sounds non-trivial and would require more explanation.

You are right and it's not well explained here, I will try to develop
it better for the next version. It's in the cover letter explaining
one use case proposed by MST [1] and the answer by Jason [2].

When the VQ is stopped, it is forced to flush all the descriptors (in
this version) and the used index in the case of split. With that
information, the driver can modify any available descriptor that has
not been used by the device, and to rewind the virtqueue available
index to an extent.

If we add Jason's virtqueue state (first patch of previous version),
which I need to recover in later versions, the driver can move freely
available and used index prior to queue resetting.

The (in comments) proposed solution for the device that cannot flush
its descriptor timely is to provide it with a way to report in-flight
descriptors. As a reference, vhost-user has done it before, but with a
memory region shared by a file descriptor [3]. If we add something
similar, the driver still knows what file descriptors is able to
rewind.

Does that explanation make the driver rewind use case more clear?
I understand the use case but it's not clear how the mechanism is
supposed to work. Let's continue discussing in the sub-thread where I
posted details.

Ok!

Stopping the device in the live migration context is done by per-device
operations in vhost backends, but the introduction of STOP as a basic
virtio facility comes with advantages:
* All the device virtio-specific state is summarized in a single entity,
   making easier to reason about it.
This point isn't clear to me. I think it's saying that using STOP
somehow unifies things compared to the way that vhost devices are
stopped today. Given that vhost already syncs state back to the VMM's
emulated VIRTIO device, I'm not sure how STOP is different.

It also achieves that, but it's more related to the fact that the
current way of getting the index through vhost net, user, ... is not
reusable by others methods to expose the device to VMM.
->vhost_get_vring_base() is a common interface across vhost-kernel,
vhost-user, etc and all VIRTIO Device Types. Is there a problem with it
that I'm missing?

It is usable by all VIRTIO Device types *exposed through vhost*, so it
fails to address the case when we cannot use vhost to expose the
device. It can happen with the cases you explained better than me in
[1], or when exposing a vdpa device as a pure virtio device using
virtio_vdpa.


I don't get this and may miss something. The stop is a basic facility that is required for the building block of live migration. We need that regardless what kind of software technology or framework is used.

For the case of virtio_vdpa, it's not because it can't use STOP but because we don't have a valid use case for that.

Thanks



"vhost net, user" is confusing. Do you mean vhost-kernel and vhost-user
or do you mean vhost-net and vhost-user-net?

I meant vhost-kernel and vhost-user, sorry.

Avoiding
developing a different way to stop and get the status of each kind of
device helps others devices implementations out of VMM.
What does "kind of device" mean? I think you mean vhost device types
like net, scsi, blk, vsock, etc (a subset of VIRTIO Device Types that
have been defined for vhost). As you say, they have different stop
operations, but it's not true that getting the status of a vring is
different for each one (they all use ->vhost_get_vring_base()).

Reading that way I meant all VIRTIO Device Types, since this proposal
also addresses them even if they don't use vhost-*.

I said "stop and get the status" as a the operation, but now I see
it's confusing, and I meant mostly stop as you say.

What you mention has more to do with the next bullet point.

* VMM does not need to implement device specific operations in the
   driver part.
What is the "driver part"?

The part of qemu that talks to devices using virtio through (for
example) vhost messages. This set features, get features, etc.

Each vhost device has its own method to stop the device. In networking
is setting a backend file descriptor -1, and others have their own
way.

Using the status field allows out of VMM to unify that part too.

Maybe this one and the above would be clearer if I use vhost as
examples. I will rewrite them anyhow, thanks!
Thanks. Something like "The VMM does not need to implement a different
stop operation like VHOST_NET_SET_BACKEND -1 for each device type" would
be clearer.

I will change, thanks!

* Work out of the box for devices that use pure virtio backends in some
   part of the device emulation chain (virtio_pci_vdpa or virtio_vdpa),
   in any transport the device can use.
?
vp_vdpa makes a standard virtio device exposed as a vdpa one. This
implies that each of the vhost commands sent to vhost-vdpa needs to be
converted to standard virtio request if it needs to reach the actual
device. But there is currently no way to stop the device or retrieve
its state using just virtio.

Because of that, it's also usable by a pure virtio device, like in the
case of vdpa devices using virtio_vdpa or other devices that simply
exposes itself as a virtio one with no further facilities.

It is also not restricted by the transport you use to expose the
virtio: PCI, MMIO, etc, since you need to perform operations already
defined by any usable device (set and get the status).
[1]
I see. This says STOP needs to be an in-band VIRTIO operation so that
vDPA/vhost can be stacked on top of VIRTIO devices. If STOP was only a
vhost operation then it wouldn't be possible to forward it to underlying
VIRTIO devices.

I will use that to clarify the point, thanks!

I think that all the points overlap too much, so I will try to rewrite
differently for the next version. Thank you very much for the
feedback!

Stefan



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]