[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration
On Fri, Nov 17, 2023 at 11:09âPM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Nov 17, 2023 at 02:51:04PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Friday, November 17, 2023 7:24 PM > > > > > > On Fri, Nov 17, 2023 at 12:46:21PM +0000, Parav Pandit wrote: > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > Sent: Friday, November 17, 2023 6:00 PM > > > > > To: Parav Pandit <parav@nvidia.com> > > > > > > > > > > On Fri, Nov 17, 2023 at 12:02:49PM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > Sent: Friday, November 17, 2023 5:13 PM > > > > > > > > > > > > > > On Fri, Nov 17, 2023 at 11:20:14AM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > > Sent: Friday, November 17, 2023 4:41 PM > > > > > > > > > > > > > > > > > > On Fri, Nov 17, 2023 at 10:20:45AM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > > > > Sent: Friday, November 17, 2023 3:38 PM > > > > > > > > > > > > > > > > > > > > > > On Wed, Nov 15, 2023 at 05:39:43PM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Additionally, if hypervisor has put the trap on > > > > > > > > > > > > > > virtio config, and because the memory device > > > > > > > > > > > > > > already has the interface for virtio config, > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hypervisor can directly write/read from the > > > > > > > > > > > > > > virtual config to the member's > > > > > > > > > > > > > config space, without going through the device context, right? > > > > > > > > > > > > > > > > > > > > > > > > > > If it can do it or it can choose to not. I don't see > > > > > > > > > > > > > how it is related to the discussion here. > > > > > > > > > > > > > > > > > > > > > > > > > It is. I donât see a point of hypervisor not using the > > > > > > > > > > > > native interface provided > > > > > > > > > > > by the member device. > > > > > > > > > > > > > > > > > > > > > > So for example, it seems reasonable to a member > > > > > > > > > > > supporting both existing pci register interface for > > > > > > > > > > > compatibility and the future DMA based one for scale. In > > > > > > > > > > > such a case, it seems possible that DMA will expose more > > > > > > > > > > > features than pci. And then a hypervisor might decide to > > > > > > > > > > > use > > > > > > > > > that in preference to pci registers. > > > > > > > > > > > > > > > > > > > > We donât find it right to involve owner device for > > > > > > > > > > mediating at current scale > > > > > > > > > > > > > > > > > > In this model, device will be its own owner. Should not be a problem. > > > > > > > > > > > > > > > > > I didnât understand above comment. > > > > > > > > > > > > > > We'd add a new group type "self". You can then send admin > > > > > > > commands through VF itself not through PF. > > > > > > > > > > > > > How? The device is owned by the guest. FLR and device reset cannot > > > > > > send the > > > > > admin command reliably. > > > > > > > > > > It's of the "it hurts when I do this - don't do this then" category. > > > > > > > > > it is donât do medication category, yes due all this weirdness that has been > > > asked. > > > > > > > > > > > > > > > > > > > > > > > > > > and to not break TDISP efforts in upcoming time by such design. > > > > > > > > > > > > > > > > > > Look you either stop mentioning TDISP as motivation or > > > > > > > > > actually try to address it. Safe migration with TDISP is really hard. > > > > > > > > But that is not an excuse to say that TDISP migration is not > > > > > > > > present, hence > > > > > > > involve the owner device for config space access. > > > > > > > > This is another hurdle added that further blocks us away from TDISP. > > > > > > > > Hence, we donât want to take the route of involving owner > > > > > > > > device for any > > > > > > > config access. > > > > > > > > > > > > > > This "blocks" is all just wild hunches. hypervisor controls some > > > > > > > aspects of TDISP devices for sure - maybe we actually should use > > > > > > > pci config space as that is generally hypervisor controlled. > > > > > > Even bad to do hypercalls. > > > > > > I showed you last time the role of the PCI config space snippet from the > > > spec. > > > > > > > > > > Yes I remember. This is just an example though. My point is maybe it > > > > > is solvable maybe it is not. > > > > > > > > > > > Do you see we are repeating the discussion again? > > > > > > > > > > One of the reasons is that people bring up irrelevances. TDISP is > > > > > important but has to be addressed or deferred not vaguely referred to. > > > > > > > > So lets continue to follow the current TDISP direction of not involving > > > hypervisor for virtio common and device config. > > > > If you disagree to it, please speak now, so that we donât debate on this again in next 3 days. > > Because this is the fundamental design considerations it relied on. > > There is no point going forward if you want to disagree to it. > > Other variants are fine, but other variants cannot be the only choice. > > > > > > > > > > > > > > > > > > > > > > > > > > > For example, your current patches are clearly broken for TDISP: > > > > > > > > > owner can control queue state at any time making device > > > > > > > > > modify memory in any way it wants. > > > > > > > > > > > > > > > > > When TDISP migration is needed, the admin device can be > > > > > > > > another TVM > > > > > > > outside the HV scope. > > > > > > > > Or an alternative would have device context encrypted not > > > > > > > > visible to HV at > > > > > all. > > > > > > > > > > > > > > Maybe. Fact remains your patches do conflict with TDISP and you > > > > > > > seem to be fine with it because you have a hunch you can fix it. > > > > > > > But we can't do development based on your hunches. > > > > > > > > > > > > > We have different view. > > > > > > My patches do not conflict with TDISP because TDISP has clear > > > > > > definition of > > > > > not involving hypervisor for transport. > > > > > > And that part is still preserved. > > > > > > Delegating the migration to another TDISP or encrypting is yet to be > > > defined. > > > > > > And current patches will align to both the approaches in future. > > > > > > > > > > > > So you need to re-evaluate your judgment. > > > > > > > > > > If you like they do not "conflict". But if used with TDISP they > > > > > just make it insecure and thus completely worthless. If hypervisor > > > > > can change ring state to make device poke at random guest memory > > > > > then it is game over and all the effort spent was security theater. > > > > Not really, I proposed two options. > > > > 1. delegate the task of LM to the TVM. (proposed by two cpu vendors). > > > > In this case all the infra we build here, just works fine. > > > > > > I think modification will be needed: currently commands are sent through the > > > PF, and that is under hypervisor control. > > > You should not assign PF to TVM. That's the point. And that's why it keeps people confused to believe the current PF/adminq can work in the TDISP. > > Yes, an admin virtio function will be there which will do the admin commands listed. > > So it can't be PF, so at least we need a new group type. > I am inclined to then say, operate it through VF itself. So it exactly matches the idea of transport virtqueue (a per VF/SF one). But it still requires a PCI part to bootstrap. > > > > > > > > > It also does not require any hypervisor mediation for control plane. > > > > > > > > 2. Encrypt the owner device workload to be not seen by hypervisor > > > > > > > > Both methods does not affect the current direction. > > > > > > > > But if we force trap+emulation, it is 100% broken for TDISP. > > > > And I would not promote that. > > > > > > > > > But you know this, don't you? This is why you mentioned encrypting device. > > > > > Maybe that works. It just does not work *as is*. > > > > It works as_is. But current infrastructure does not block the future work. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Such encryption is not possible, with the trap+emulation > > > > > > > > method, where HV > > > > > > > will have to decrypt the data coming over MMIO writes. > > > > > > > > > > > > > > I don't how what trap+emulation has to do with it. Do you refer > > > > > > > to the shadow vq thing? > > > > > > > > > > > > The method proposed here does not hinder any TDISP direction. > > > > > > > > > > direction? No, why would it. we can always add more commands that > > > > > are safe for TDISP. commands you propose here are unsafe for TDISP. > > > > > > > > > > > Without my proposal, do you have a method that does not involve > > > > > > hypervisor > > > > > intervention for virtio common and device config space, cvq and shadow > > > vq? > > > > > > If so, I would like to hear that as well because that will align with TDISP. > > > > > > > > > > I really did not give it much thought. I suspect for TDISP it just > > > > > might be cleaner to have guest agent migrate device. Certainly removes all > > > the messy questions. > > > > > That, to me impliest there needs to be a way to send migration > > > > > commands through VF itself. Does this "involve hypervisor > > > > > intervention"? No one should care I think. > > > > Too far of the future to envision. May be yes. When such platform is > > > > built, for sure whoever migrates need migrate its device side too. > > > > Some knowledge of migration driver is needed. > > > > > > So TDISP migration is so far in the future you do not need to bother about it. > > > Fine. Then don't bring it up pls. > > > > > As long as we are aligned to the requirement that a virtio member device is mapped to the guest VM without mediating the virtio interface, I am good. > > Again, other variants are fine, but above listed mapped variant is the minimum variant needed. > > I think it's worth supporting this. I wouldn't call this minimum there > are other approaches. And I am not so sure it's worth trying > to support this in all kind of systems such as IOMMU without > dirty bit support. If some old systems will need mediation, > this is kind of like legacy interface. Not a big deal. > +1 Thanks
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]