OASIS Mailing List Archives

virtio-comment message

Subject: Re: [virtio-comment] [PATCH V2 0/2] Introduce VIRTIO_F_QUEUE_STATE


On Wed, Mar 24, 2021 at 03:05:30PM +0800, Jason Wang wrote:
> 
> On 2021/3/23 6:40 PM, Stefan Hajnoczi wrote:
> > On Mon, Mar 22, 2021 at 11:47:15AM +0800, Jason Wang wrote:
> > > This is a new version to support VIRTIO_F_QUEUE_STATE. The feature
> > > extends the basic facility to allow the driver to set and get device
> > > internal virtqueue state. The main motivation is to support live
> > > migration of virtio devices.
> > Can you describe the use cases that this interface covers as well as the
> > steps involved in migrating a device?
> 
> 
> Yes. I can describe the steps for live migrating a virtio-net device. For
> other devices, we probably need other state.

Thanks, describing the steps for virtio-net would be great.

> 
> 
> >   Traditionally live migration was
> > transparent to the VIRTIO driver because it was performed by the
> > hypervisor.
> 
> 
> Right, but we may also want to live migrate between hardware virtio-pci
> devices. In that case it is up to the hypervisor to save and restore the
> state silently, without the guest driver noticing, as we did for vhost.

This is where I'd like to understand the steps in detail. The set/get
state functionality introduced in this spec change requires that the
hypervisor has access to the device's hardware registers - the same
registers that the guest is also using. I'd like to understand the
lifecycle and how conflicts between the hypervisor and the guest are
avoided (unless this is integrated into vDPA/VFIO/SR-IOV in a way that I
haven't thought of?).

> 
> 
> > 
> > I know you're aware but I think it's worth mentioning that this only
> > supports stateless devices.
> 
> 
> Yes, that's why it's a queue state not a device state.
> 
> 
> >   Even the simple virtio-blk device has state
> > in QEMU's implementation. If an I/O request fails it can be held by the
> > device and resumed after live migration instead of failing the request
> > immediately. The list of held requests needs to be migrated with the
> > device and is not part of the virtqueue state.
> 
> 
> Yes, I think we need to extend the virtio spec to support saving and
> restoring device state. But in any case the virtqueue state is the
> infrastructure that should be introduced first.

Introducing virtqueue state save/load first seems fine, but before
committing to a spec change we need an approximate plan for per-device
state so that it's clear the design can be extended to cover that case
in the future.
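
To make the extensibility concern concrete, here is a minimal Python
sketch. It is an illustration only and not part of the patch or the spec.
It assumes a toy virtio-blk-like device whose virtqueue state is a single
index and whose device-specific state is the list of held requests. The
names ToyBlkDevice, Snapshot, next_avail_idx, save and load are all
hypothetical. The snapshot leaves room for an optional device-specific
section, so a queue-only format could be extended later.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyBlkDevice:
    """Toy model: one virtqueue plus a list of held (failed) requests."""
    next_avail_idx: int = 0                                  # virtqueue state (assumed shape)
    held_requests: List[int] = field(default_factory=list)   # device-specific state


@dataclass
class Snapshot:
    """Hypothetical migration payload."""
    queue_state: Dict[int, int]                # queue number -> next_avail_idx
    device_state: Optional[List[int]] = None   # None means no device-specific section


def save(dev: ToyBlkDevice, with_device_state: bool) -> Snapshot:
    """Capture queue state, and optionally the device-specific state."""
    snap = Snapshot(queue_state={0: dev.next_avail_idx})
    if with_device_state:
        snap.device_state = list(dev.held_requests)
    return snap


def load(snap: Snapshot) -> ToyBlkDevice:
    """Rebuild a device on the destination from a snapshot."""
    return ToyBlkDevice(
        next_avail_idx=snap.queue_state[0],
        held_requests=list(snap.device_state or []),
    )
```

Saving with with_device_state=False round-trips the queue index but drops
the held requests, which is the gap described above for virtio-blk.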

> > 
> > I'm concerned that using device reset will not work once this interface
> > is extended to support device-specific state (e.g. the virtio-blk failed
> > request list). There could be situations where reset really needs to
> > reset (e.g. freeing I/O resources) and the device therefore cannot hold
> > on to state across reset.
> 
> 
> Good point. Here are some options:
> 
> 
> 1) reuse device reset, as is done in this patch
> 2) introduce a new device status, as was done in [1]
> 3) use queue_enable (as is done in virtio-mmio; PCI currently forbids
> stopping a queue, so we may need to extend that)
> 4) use a device-specific way to stop the datapath
> 
> Reusing device reset looks like a shortcut that might not be easy for
> stateful devices, as you said. 2) looks more general. 3) has the issue that
> it does not prevent config changes. And 4) was also proposed by you and
> Michael.
> 
> My understanding is that there should be no fundamental difference between
> 2) and 4). So I am inclined to respin [1]. Do you have any other ideas?

2 or 4 sound good. I prefer 2 since one standard interface will be less
work and complexity than multiple device-specific ways of stopping the
data path.

3 is more flexible but needs to be augmented with a way to pause the
entire device. It could be added on top of 2 or 4, if necessary, in the
future.
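
The difference between options 2 and 3 can be sketched with a toy model in
Python. This is an illustration under stated assumptions, not the interface
from the patch or from [1]: the STATUS_STOP bit value, the ToyDevice class
and its methods are all hypothetical. The model shows that disabling every
queue stops buffer processing but still lets the device change its config,
whereas a device-wide stop freezes both.

```python
STATUS_STOP = 0x20  # hypothetical status bit; value chosen for illustration only


class ToyDevice:
    """Toy device with per-queue enable flags and a device-wide stop bit."""

    def __init__(self, num_queues: int) -> None:
        self.status = 0
        self.queue_enable = [True] * num_queues
        self.next_avail_idx = [0] * num_queues   # assumed per-queue state
        self.config_generation = 0

    def process(self, q: int) -> bool:
        """Consume one buffer from queue q if the datapath is running."""
        if self.status & STATUS_STOP or not self.queue_enable[q]:
            return False
        self.next_avail_idx[q] = (self.next_avail_idx[q] + 1) & 0xFFFF
        return True

    def change_config(self) -> bool:
        """Device-initiated config change; only the device-wide stop blocks it."""
        if self.status & STATUS_STOP:
            return False
        self.config_generation += 1
        return True
```

With all queue_enable flags cleared, process() returns False but
change_config() still succeeds; after setting STATUS_STOP, both return
False, so the state read by the hypervisor can no longer move.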

Stefan
