Subject: Re: [virtio-comment] RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND
On Fri, Nov 17, 2023 at 06:13:50PM +0800, Zhu, Lingshan wrote: > > > On 11/16/2023 8:09 PM, Michael S. Tsirkin wrote: > > On Thu, Nov 16, 2023 at 06:09:38PM +0800, Zhu, Lingshan wrote: > > > On 11/16/2023 1:35 AM, Parav Pandit wrote: > > From: Zhu, Lingshan <lingshan.zhu@intel.com> > Sent: Monday, November 13, 2023 2:53 PM > > On 11/10/2023 2:31 PM, Parav Pandit wrote: > > From: Zhu, Lingshan <lingshan.zhu@intel.com> > Sent: Friday, November 10, 2023 11:52 AM > > On 11/9/2023 6:15 PM, Parav Pandit wrote: > > From: Zhu, Lingshan <lingshan.zhu@intel.com> > Sent: Thursday, November 9, 2023 3:28 PM > > On 11/9/2023 1:46 AM, Michael S. Tsirkin wrote: > > On Tue, Nov 07, 2023 at 05:27:23PM +0800, Zhu, Lingshan wrote: > > On 11/6/2023 5:49 PM, Michael S. Tsirkin wrote: > > On Fri, Nov 03, 2023 at 06:34:34PM +0800, Zhu Lingshan wrote: > > When SUSPEND is set, device states and virtqueue states should > be stablized, therefore the driver should not reset vqs when > SUSPEND is set in device status. > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > --- > content.tex | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/content.tex b/content.tex index bcc9d4b..060b5c2 > 100644 > --- a/content.tex > +++ b/content.tex > @@ -444,6 +444,9 @@ \subsubsection{Virtqueue > Reset}\label{sec:Basic > > Facilities of a Virtio Device / > > The device MUST reset any state of a virtqueue to the default > > state, > > including the available state and the used state. > +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set in > +\field{device status}, the driver SHOULD NOT reset any virtqueues. > + > \drivernormative{\paragraph}{Virtqueue Reset}{Basic > Facilities of a > > Virtio Device / Virtqueues / Virtqueue Reset / Virtqueue Reset} > > After the driver tells the device to reset a queue, the > driver MUST verify that > > Seems somewhat arbitrary and breaks the claim that the feature > is orthogonal and can have uses besides migration. > > when suspended, the device is frozen. 
> The driver is aware of this process and so should not reset the vqs I > > think. > > Again that is only true because you want to use it for migration. > But then you can't claim it's a generic facility. > > I don't get it. The device status is a basic facility. > > We need to SUSPEND the device by setting SUSPEND bit, to stabilize > the device states for migration. > > Is the PCI's PM time not enough to suspend the device? > For large device I could imagine it could be short. > > As you see, PCI PM, so this is a layer violation, virtio should be > self contained, > > If you think it is layer violation, than suspend bit for sure is not needed. PCI > > PM interface should suspend/resume the device on D0<->D3 state transitions. > Doesn't make sense logically, because it is layer violation, so you want it to be > worse? For example, virito writes 0 to device status to reset a device, not by PCI. > > All these layer violation thing is just abstract to me. > Your argument contradicts with your fellow author and yourself. > > I don't see how, we keep telling you virtio should be self contained, and > suspend by PCI PM is a > layer volition, this is a fact, right? > > Not really. Look at the charter - when available we should use platform > capabilities because it makes it easier to write drivers. > > I think that is transport specific implementation, for example pci common cfg. > > > > > I don't want to make it worse. > If you think its layer violation, just depend on the PCI PM, no need to include new suspend bit. > > Again, virtio should be self-contained, not layer volited, for example, we > reset virito devices > by writing 0 to device status, not by PCI FLR. > > There are some advantage to doing it like this, e.g. one does not need > to save and restore config space. What are advatages of suspend via this > bit? > > suspend a device by the device status is the same as how we enable a virito > device. 
> > Doing this by PCI is clearly a layer volition, and does not work for other > transports. > > > > and what about MMIO and CCW? > > They have largely lacked the richness of PCI transport. So those transport > > needs to evolve. > I am not sure CCW and MMIO maintainers want to hear this. > > Otherwise, PCI offers rich transport facilities compared to MMIO, hence, it will > > continue wider use. > you know this SUSPEND bit work fine on all transport, right? Because > device_status is transport independent. > > I want to emphasize that I am not against the suspend bit as long as it is guest driver controlled without interfering the device migration flow (like rest of the state). > > When migrate a device, it is the host who suspends the device. The reason is > the live migration process should be transparent to > the guest, so we should suspend the guest first, then suspend the device(by > host). > > The practical reason for suspending functionality under guest control is, that resuming/suspending the large device can take time. > So let it be in guest driver control. No need to muddy with device migration flow. > > The time cost is reasonable in O(N) no matter how you suspend/resume the > device. > > Very much depends. Big O notation can be misleading. If you have to > repeat an operation 1000 times that's 1000 * N and suddenly you are > going from milliseconds to seconds. > > I mean enable 100 queues cost more time then enable 1 vq no matter > how we enable it. that is O(N) Depends on what "that" is. Number of VM exits does not have to be O(N), you can pass these 100 queues in memory. > > > > This should be a basic facility. > > Other transport can also offer like PCI. > > Do you want to work for these transport? Implementing the new features as > PCI? > > Not presently as PCI as more features than rest of the two. > What I read about ccw is: " S/390 based virtual machines support neither PCI nor MMIO". 
> > And I also read, "The IBM System/390 is a discontinued mainframe product family implementing". > > So I don't know who needs to extend ccw. > And if one needs, those maintainers will extend it to match to PCI standard. > > So these features are even not planned, so don't depend on them. > > But again can one suspend ccw device? If you are adding this feature and > claiming it's supported for all transports you better find out > what does it do. > > I am not an expert on CCW, anything block we suspend a CCW device by this bit? I don't think CCW supports suspend at all. > This seems only controlled by the device itself. > And? What it the point of suspending only the device if rest of system is still going? > > > In that case if there is suspend the device available, it will be > used by the > > guest driver itself, hypervisor wouldn't know about it when those > registers are not trapped. > > So we need two ways to suspend. > One is guest visible, and guest controlled. > Second is hypervisor control to fulfill the device migration needs. > > The guest can eve reset the device. > > So if you can please take a look if the proposed admin command to > > freeze/stop mode can be used in the emulated register case or not. > > It helps to have the suspend bit in guest control as well > with/without > > emulation mode. > Parav, please believe I have read your series, I didn't comment there > because I want to avoid further conflicts/debating, we have done these > > enough. > > I believe the series posted in v3 can support vdpa use case as well. > So I will progress to post v4. > > > As explained before, freeze/stop the device by PCI is a layer violation. > > I am afraid, we have different vision. > I don't see any layer violation. > Suspend is enough in the PCI PM. > Our vision is more aligned with rest of the hypervisor knobs that owns the > > migration framework. 
> I think I have explained, virito builds on other transport and it should be self- > contained, so far so good. > > Virtio without any transport binding is just blank paper discussion. > > virtio is built on some transports, but not bind to any. > > Binding is an OS specific thing, but e.g. under Linux transport drivers bind to > devices then virtio drivers bind to virtio bus. No binding -> nothing > works. > > I think general facilities are better not only work on a specific transport > But platform facilities are even better we don't need to work on them at all. > > And device status can be pass-through(without emulation, just map it > to > guest) to the guest or trapped(trap and emulate by the hypervisor, > for example set_status in vDPA). > > When it is pass-through, it is controlled by the guest, so for example, if the > > guest resets the device, hypervisor has lost the control of migration context etc. > > Hence, hypervisor needs a channel which is not guest owned. > > Same channel can work when trap+emulation is done. > > It is the guest owns the device, it can reset the device, once reset, the device > context are cleared. > > Hypervisor do not have the ability to read/write the device context. It lost the channel as hypervisor is not involved in trap+emulation. > So it is not helpful in one use case. > > Admin commands can work even with trap+emulation mode. > > What is missing, that should be added? > > as explained above, when live migration, the guest should be suspended > first, at this point, > the host owns the device, it has access to the device. > > Where do you say this in the spec patch? > > VM live migration is not in this spec. Then it should be. > If we suspend the device first, then the guest may detect IO errors. > That's bad. So you need to tell driver what not to do so as not to get errors. > > > This can also be used for debugging I think. 
> > As Michael listed, a dedicated debug interface is usually more > useful instead > > of in-band. > re-using another facility without extra efforts is not a bad thing anyway. > > I just don't see how a suspend bit some debug feature. > Almost everything with that regard is a debug feature to me. > > suspend then check the device states? > > You already suspended the device, so device state is already changed. > All debug information is changed, so not useful now. > > When suspended, the device should keep and stabilize its device states, > at least in my series it should behave like this. > > That's vague. What does it mean exactly and what happens if > some external event causes state change? > > it is suspended, somehow like powered-down, so it should not > respond to the events until resume. "somehow" is too vague for the spec. -- MST