
virtio-comment message


Subject: RE: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, October 10, 2023 7:30 PM

> > The hypervisor driver composes the vPCI device. So there isn't a need to
> > migrate the pci state.
> > Only exception is VIRTIO_PCI_CAP_PCI_CFG, which is covered in this v1.
> >
> 
> yes but what seems implicit is that device is in some reasonable state when
> this thing happens. e.g. are there no limitations at all e.g. in which order
> things happen? can you really first configure virtio then pci config? for sure?
> 
First the pci config is set up, e.g. bus master enable etc.
After that point, the device is handed over to virtio.

From the device context write perspective, I doubt the order matters.
For example, whether pci bus master and msix are enabled before or after the device context restore would not matter much,
as long as they are done before changing the device mode to active.
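
To make the ordering point concrete, here is a minimal sketch, assuming the order indeed does not matter. It is only an illustration, not spec text: the class, the method names and the ACTIVE/STOP mode values are hypothetical (only Freeze = 0x2 appears in the quoted patch). The only rule it models is that bus master enable, msix enable and the device context write are all done before the mode changes to active.

```python
from enum import Enum


class Mode(Enum):
    # Only the Freeze value (0x2) appears in this thread; the other two
    # values are assumptions made for this sketch.
    ACTIVE = 0x0
    STOP = 0x1
    FREEZE = 0x2


class MemberDevice:
    """Hypothetical model of a member device on the destination side."""

    # Steps that must all be complete before the device may become active.
    REQUIRED = frozenset({"bus_master", "msix", "device_context"})

    def __init__(self):
        self.mode = Mode.FREEZE
        self.done = []  # order in which the setup steps were performed

    def enable_bus_master(self):
        self.done.append("bus_master")

    def enable_msix(self):
        self.done.append("msix")

    def write_device_context(self):
        self.done.append("device_context")

    def set_mode(self, mode):
        # The relative order of the steps is not checked, only that all
        # of them happened before the transition to active.
        if mode is Mode.ACTIVE:
            missing = self.REQUIRED - set(self.done)
            if missing:
                raise RuntimeError(f"not ready for active mode: {sorted(missing)}")
        self.mode = mode


# Device context first, pci config second: accepted in this model.
dev = MemberDevice()
dev.write_device_context()
dev.enable_bus_master()
dev.enable_msix()
dev.set_mode(Mode.ACTIVE)
```

Swapping the three setup calls in any order gives the same result in this model; skipping one of them makes set_mode(Mode.ACTIVE) raise.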

> > > > > >
> > > > > > > >and device configuration space may change. \\
> > > > > > > > +\hline
> > > > > > >
> > > > > > > I still don't get why we need a "stop" state in the middle.
> > > > > > >
> > > > > > All pci devices which belong to a single guest VM are not
> > > > > > stopped atomically.
> > > > > > Hence, one device which is in freeze mode, may still receive
> > > > > > driver notifications from other pci device,
> > > > >
> > > > > Device may choose to ignore those notifications, no?
> > > > >
> > > > > > or it may experience a read from the shared memory and get
> > > > > > garbage data.
> > > > >
> > > > > Could you give me an example for this?
> > > > >
> > > > Section 2.10 Shared Memory Regions.
> > > >
> > > > > > And things can break.
> > > > > > Hence the stop mode ensures that all the devices get enough
> > > > > > chance to stop themselves, and later when frozen, to not change
> > > > > > anything internally.
> > > > > >
> > > > > > > > +0x2   & Freeze &
> > > > > > > > + In this mode, the member device does not accept any
> > > > > > > > +driver notifications,
> > > > > > >
> > > > > > > This is too vague. Is the device allowed to be frozen in
> > > > > > > the middle of any virtio or PCI operations?
> > > > > > >
> > > > > > > For example, in the middle of feature negotiation etc. It
> > > > > > > may cause implementation specific sub-states which can't be
> > > > > > > migrated easily.
> > > > > > >
> > > > > > Yes, it is allowed in the middle of feature negotiation, for sure.
> > > > > > It is a passthrough device, hence the hypervisor layer does not get
> > > > > > to see the sub-state.
> > > > > >
> > > > > > Not sure why you comment that it cannot be migrated easily.
> > > > > > The device context already covers this sub-state.
> > > > >
> > > > > 1) driver writes driver_features
> > > > > 2) driver sets FEATURES_OK
> > > > >
> > > > > 3) device receives driver_features
> > > > > 4) device validates driver_features
> > > > > 5) device clears FEATURES_OK
> > > > >
> > > > > 6) driver reads status and realizes FEATURES_OK is cleared
> > > > >
> > > > > Is it valid to be frozen at any point of the above?
> > > > No. The device mode is set to freeze only when the hypervisor is sure
> > > > that no more accesses by the guest will be done.
> > > > What can happen between #2 and #3 is that the device mode may change to stop.
> > > > And in stop mode, the device context would capture #5 or #4, depending
> > > > on where the device is at that point.
> > > >
> > > > > >
> > > > > > > And what's more, the above state machine seems to be virtio
> > > > > > > specific, but you don't explain the interaction with the
> > > > > > > device status state machine.
> > > > > > First, above is not a state machine.
> > > > >
> > > > > So how do readers know if a state can go to another state and when?
> > > > >
> > > > Not sure what you mean by reader. Can you please explain.
> > > >
> > > > > > Second, it is not virtio specific.
> > > > >
> > > > > It's somehow for sure, for example you said device context need
> > > > > to be preserved. And as far as I see the device context is all
> > > > > virtio specific in patch 3.
> > > > >
> > > > Sure, device context is virtio specific. :) Device context will
> > > > reflect if things changed in the stop mode.
> > > >
> > > > > > It is present in leading OS that has fundamental requirement
> > > > > > to support P2P devices.
> > > > >
> > > > > If it's PCI specific, instead of trying to do a workaround in
> > > > > virtio, why not invent a mechanism there?
> > > > >
> > > > It is not a workaround in virtio.
> > > > It is the way pci p2p devices work for which one needs to be
> > > > receptive to handle the interaction.
> > > >
> > > >
> > > > > > Third, it is not interacting with the _actual_ device status.
> > > > > >
> > > > > > In "SUSPEND" patch-5, you already asked this question. I
> > > > > > assume you asked again so that this series is complete.
> > > > > >
> > > > > > > For example,
> > > > > > > what happens if the driver wants to reset but the device is
> > > > > > > in stop mode? You told me it is addressed in your series but
> > > > > > > looks not. Once you try to describe that, you're actually
> > > > > > > trying to connect states between the two state machines.
> > > > > > >
> > > > > > As listed in the definition of the stop mode, the device does
> > > > > > not act on the incoming writes; it only keeps track of its
> > > > > > internal device context change as part of this.
> > > > >
> > > > > So only the driver notification is allowed but not config write?
> > > > > What's the consideration for allowing driver notification?
> > > > >
> > > > Because for most practical purposes, a peer device wants to queue
> > > > blk, net or other requests and not do device configuration.
> > > >
> > > > Do you know any device configuration space which is RW?
> > > > For net and blk I recall it as RO?
> > >
> > > No it isn't. Pls look at the spec if you need to check that ;)
> > >
> > Ok. will check. But regardless, it is fine, because when STOP is done, config
> > writes should not occur anyway.
> 
> 
> i don't see a statement like this but maybe i missed it.
>
I am missing it, will add.
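
To summarize the access rules as discussed in this thread, a rough sketch follows. It is an illustration under assumptions, not the wording that will go into the patch: the function name, the returned strings and the ACTIVE/STOP mode values are hypothetical (only Freeze = 0x2 is from the quoted table), and treating every access in freeze mode as "not accepted" is a simplification.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = 0x0  # assumed value
    STOP = 0x1    # assumed value
    FREEZE = 0x2  # value from the patch table quoted in this thread


class Access(Enum):
    DRIVER_NOTIFICATION = "driver notification"
    CONFIG_WRITE = "device configuration write"


def device_reaction(mode, access):
    """Return how the member device treats an access in a given mode."""
    if mode is Mode.ACTIVE:
        return "process"
    if mode is Mode.STOP:
        if access is Access.DRIVER_NOTIFICATION:
            # Not acted upon; the device only tracks the resulting change
            # in its device context.
            return "record in device context"
        # The statement to be added: no config writes once STOP is done.
        return "must not occur"
    # FREEZE: notifications are not accepted and nothing changes internally.
    return "not accepted"


for mode in Mode:
    for access in Access:
        print(f"{mode.name:6} {access.value:28} -> {device_reaction(mode, access)}")
```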

