Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration
On Fri, Nov 17, 2023 at 8:03 PM Parav Pandit <parav@nvidia.com> wrote:
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, November 17, 2023 5:13 PM
> >
> > On Fri, Nov 17, 2023 at 11:20:14AM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Friday, November 17, 2023 4:41 PM
> > > >
> > > > On Fri, Nov 17, 2023 at 10:20:45AM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Friday, November 17, 2023 3:38 PM
> > > > > >
> > > > > > On Wed, Nov 15, 2023 at 05:39:43PM +0000, Parav Pandit wrote:
> > > > > > > > > Additionally, if the hypervisor has put a trap on the virtio
> > > > > > > > > config, and because the member device already has the
> > > > > > > > > interface for the virtio config,
> > > > > > > > >
> > > > > > > > > the hypervisor can directly write/read from the virtual config
> > > > > > > > > to the member's config space, without going through the
> > > > > > > > > device context, right?
> > > > > > > >
> > > > > > > > It can do that, or it can choose not to. I don't see how it
> > > > > > > > is related to the discussion here.
> > > > > > >
> > > > > > > It is. I don't see a point in the hypervisor not using the native
> > > > > > > interface provided by the member device.
> > > > > >
> > > > > > So for example, it seems reasonable for a member to support both the
> > > > > > existing PCI register interface for compatibility and the future
> > > > > > DMA-based one for scale. In such a case, it seems possible that
> > > > > > DMA will expose more features than PCI. And then a hypervisor
> > > > > > might decide to use that in preference to PCI registers.
> > > > >
> > > > > We don't find it right to involve the owner device for mediation at
> > > > > the current scale,
> > > >
> > > > In this model, the device will be its own owner. Should not be a problem.
> > > >
> > > I didn't understand the above comment.
> > > > We'd add a new group type "self". You can then send admin commands
> > > > through the VF itself, not through the PF.
> > >
> > > How? The device is owned by the guest. Across FLR and device reset, the admin command cannot be sent reliably.
> > >
> > > > > and to not break TDISP efforts in the upcoming time with such a design.
> > > >
> > > > Look, you either stop mentioning TDISP as motivation or actually try
> > > > to address it. Safe migration with TDISP is really hard.
> > >
> > > But that is not an excuse to say that TDISP migration is not present, and
> > > hence involve the owner device for config space access.
> > > This is another hurdle added that moves us further away from TDISP.
> > > Hence, we don't want to take the route of involving the owner device for
> > > any config access.
> >
> > This "blocks" is all just wild hunches. The hypervisor controls some aspects
> > of TDISP devices for sure - maybe we actually should use PCI config space as
> > that is generally hypervisor controlled.
>
> It is even worse to do hypercalls.
> I showed you last time the snippet from the spec on the role of the PCI config space.
> Do you see that we are repeating the discussion again?
>
> > > > For example, your current patches are clearly broken for TDISP:
> > > > the owner can control queue state at any time, making the device modify
> > > > memory in any way it wants.
> > > >
> > > When TDISP migration is needed, the admin device can be another TVM
> > > outside the HV scope.
> > > Or an alternative would be to have the device context encrypted, not
> > > visible to the HV at all.
> >
> > Maybe. The fact remains your patches do conflict with TDISP and you seem to
> > be fine with it because you have a hunch you can fix it. But we can't do
> > development based on your hunches.
>
> We have a different view.
> My patches do not conflict with TDISP, because TDISP has a clear definition of not involving the hypervisor in the transport.
> And that part is still preserved.
> Delegating the migration to another TDISP device, or encrypting, is yet to be defined.
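[Editor's note: the "self" group type discussed above can be made concrete with a short sketch. This is a hedged illustration only, loosely following the admin command header layout in the published virtio spec; the SELF group type value and the `admin_cmd_self()` helper are hypothetical, invented for this sketch and not part of any spec or patch in this thread.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_ADMIN_GROUP_TYPE_SRIOV 0x1
#define VIRTIO_ADMIN_GROUP_TYPE_SELF  0x2  /* hypothetical, not in the spec */

/* Device-readable header of an admin command, loosely following the
 * layout in the virtio spec's administration command format. */
struct virtio_admin_cmd_hdr {
	uint16_t opcode;          /* le16: which admin command */
	uint16_t group_type;      /* le16: SR-IOV group, or "self" */
	uint8_t  reserved1[12];
	uint64_t group_member_id; /* le64: VF number; unused for "self" */
};

/* Fill a header addressed to the device itself: no member id is needed,
 * since the command is submitted on the member's own admin queue rather
 * than on the owner PF's queue. */
static inline void admin_cmd_self(struct virtio_admin_cmd_hdr *h,
				  uint16_t opcode)
{
	h->opcode = opcode;
	h->group_type = VIRTIO_ADMIN_GROUP_TYPE_SELF;
	h->group_member_id = 0;
}
```

The point of the sketch is only the addressing: with a "self" group, `group_member_id` becomes irrelevant and the owner PF drops out of the path entirely.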
> And the current patches will align with both approaches in the future.
>
> So you need to re-evaluate your judgment.
>
> > > Such encryption is not possible with the trap+emulation method, where the
> > > HV will have to decrypt the data coming over MMIO writes.
> >
> > I don't know what trap+emulation has to do with it. Do you refer to the
> > shadow vq thing?
>
> The method proposed here does not hinder any TDISP direction.
>
> Without my proposal, do you have a method that does not involve hypervisor intervention for the virtio common and device config space, cvq and shadow vq?
> If so, I would like to hear it as well, because that would align with TDISP.

So this is what you said:

1) TDISP would not do mediation
2) registers don't scale

This is exactly what the transport virtqueue did. Isn't it?

> > I am guessing modern platforms with TDISP support are likely to also
> > support the dirty bit in the IOMMU.
> >
> It will be, some day.

The dirty bit is far more realistic than TDISP in the short term.

> > > > > And for future scale, having a new SIOV interface makes more sense,
> > > > > which has its own direct interface to the device.
> > > > >
> > > > > I finally captured all past discussions in the form of a FAQ at [1].
> > > > >
> > > > > [1]
> > > > > https://docs.google.com/document/d/1Iyn-l3Nm0yls3pZaul4lZiVj8x1s73Ed6rOsmn6LfXc/edit?usp=sharing
> > > >
> > > > Yeah, skimmed that, "Cons: None". Are you 100% sure? Anyway,
> > > > discussion will take place on the mailing list, please.
> > >
> > > We cannot keep discussing the register interface every week.
> > > I remember we have discussed this many times already in the following
> > > series.
> > >
> > > 1. legacy series

How can this be supported in TDISP then?

> > > 2. tvq v4 series
> > > 3. dynamic vq creation series
> > > 4. again during the suspend series under the tvq head
> > > 5. right now
> > > 6. Maybe more that I forgot.
> > >
> > > I captured all the directions and options in the doc.
> > > One can refer to it when those questions arise.
> > > If we don't work cohesively, repeating the same reasoning does not help.
> >
> > It's still the same, doc or no doc. You want to build a device without
> > registers, fine, but don't force it down everyone's throat.
>
> I don't see any compelling reason for inventing a new method, really.

New requests and platforms come for sure, and virtio supports various transports. For example, there's a request to support PCI endpoint devices.

> Nor for continuing in register mode.

Most virtio devices are implemented in software. And we have a pure MMIO based transport now which is implemented in registers only.

> Virtio already has a VQ.
> If the CVQ is so problematic, one should put everything in registers and not run on double standards.

I don't think there's anyone who says the CVQ is problematic.

> I captured all the reasoning and thoughts. I don't have much to say in support of infinite register scale.
>
> People who want to push SIOV do not show a single performance reason why SIOV should be done.
> I have upstreamed SIOVs in Linux as SFs without PASID, and in all our scale tests, before the device chokes, the system chokes.
>
> So when someone pushes the SIOV series, I will be the first one interested in reading the performance numbers to proceed with the patches.
>
> > And now with 8 MBytes of on-device memory that's needed for migration, and
> > that's apparently fine, I am even less interested in saving 256 bytes for
> > the config space.
>
> Again, not the right comparison.
> When and how to use the 256 bytes matters.

Do you know how much the config space has grown in the years since 1.0? Virtio should be easy to implement across:

1) software devices to hardware devices
2) embedded to server

You can't say, e.g., that migration is needed in all of these environments.

Thanks

> I haven't come across any device that prefers infinite register scale.
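[Editor's note: the IOMMU dirty bit mentioned earlier in the thread boils down, at the hypervisor level, to a test-and-clear scan over a per-page dirty bitmap on each pre-copy pass. A minimal hedged sketch follows; all names are hypothetical, and real code would obtain the bitmap from the IOMMU through the platform's report-and-clear interface rather than a plain array.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Test-and-clear one page's dirty bit; returns its previous value. */
static int test_and_clear_dirty(uint8_t *bitmap, size_t page)
{
	int was_dirty = (bitmap[page / 8] >> (page % 8)) & 1;

	bitmap[page / 8] &= (uint8_t)~(1u << (page % 8));
	return was_dirty;
}

/* One pre-copy pass: count (and, in real code, transfer) the pages
 * written since the previous pass. Iterating until this returns a
 * small enough number is the usual convergence criterion. */
static size_t migrate_pass(uint8_t *bitmap, size_t npages)
{
	size_t sent = 0;

	for (size_t i = 0; i < npages; i++)
		if (test_and_clear_dirty(bitmap, i))
			sent++; /* here: send page i to the destination */
	return sent;
}
```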