[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: RE: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration
> From: Jason Wang <jasowang@redhat.com> > Sent: Wednesday, November 22, 2023 9:46 AM > > On Wed, Nov 22, 2023 at 12:27âAM Parav Pandit <parav@nvidia.com> wrote: > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > Sent: Tuesday, November 21, 2023 10:15 AM > > > > > > On Fri, Nov 17, 2023 at 11:09âPM Michael S. Tsirkin <mst@redhat.com> > > > wrote: > > > > > > > > On Fri, Nov 17, 2023 at 02:51:04PM +0000, Parav Pandit wrote: > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > Sent: Friday, November 17, 2023 7:24 PM > > > > > > > > > > > > On Fri, Nov 17, 2023 at 12:46:21PM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > Sent: Friday, November 17, 2023 6:00 PM > > > > > > > > To: Parav Pandit <parav@nvidia.com> > > > > > > > > > > > > > > > > On Fri, Nov 17, 2023 at 12:02:49PM +0000, Parav Pandit wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > > > Sent: Friday, November 17, 2023 5:13 PM > > > > > > > > > > > > > > > > > > > > On Fri, Nov 17, 2023 at 11:20:14AM +0000, Parav Pandit > wrote: > > > > > > > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > > > > > Sent: Friday, November 17, 2023 4:41 PM > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 17, 2023 at 10:20:45AM +0000, Parav > > > > > > > > > > > > Pandit > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > > > > > > > > > Sent: Friday, November 17, 2023 3:38 PM > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Nov 15, 2023 at 05:39:43PM +0000, > > > > > > > > > > > > > > Parav Pandit > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Additionally, if hypervisor has put the > > > > > > > > > > > > > > > > > trap on virtio config, and because the > > > > > > > > > > > > > > > > > memory device already has the interface > > > > > > > > > > > > > > > > > for virtio config, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hypervisor can directly write/read from > > > > > > > > > > > > > > > > > the virtual config to the member's > > > > > > > > > > > > > > > > config space, without going through the > > > > > > > > > > > > > > > > device context, > > > right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If it can do it or it can choose to not. I > > > > > > > > > > > > > > > > don't see how it is related to the discussion here. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It is. I donât see a point of hypervisor not > > > > > > > > > > > > > > > using the native interface provided > > > > > > > > > > > > > > by the member device. > > > > > > > > > > > > > > > > > > > > > > > > > > > > So for example, it seems reasonable to a > > > > > > > > > > > > > > member supporting both existing pci register > > > > > > > > > > > > > > interface for compatibility and the future DMA > > > > > > > > > > > > > > based one for scale. In such a case, it seems > > > > > > > > > > > > > > possible that DMA will expose more features > > > > > > > > > > > > > > than pci. And then a hypervisor might decide > > > > > > > > > > > > > > to use > > > > > > > > > > > > that in preference to pci registers. > > > > > > > > > > > > > > > > > > > > > > > > > > We donât find it right to involve owner device > > > > > > > > > > > > > for mediating at current scale > > > > > > > > > > > > > > > > > > > > > > > > In this model, device will be its own owner. > > > > > > > > > > > > Should not be a > > > problem. > > > > > > > > > > > > > > > > > > > > > > > I didnât understand above comment. > > > > > > > > > > > > > > > > > > > > We'd add a new group type "self". You can then send > > > > > > > > > > admin commands through VF itself not through PF. > > > > > > > > > > > > > > > > > > > How? The device is owned by the guest. FLR and device > > > > > > > > > reset cannot send the > > > > > > > > admin command reliably. > > > > > > > > > > > > > > > > It's of the "it hurts when I do this - don't do this then" category. > > > > > > > > > > > > > > > it is donât do medication category, yes due all this > > > > > > > weirdness that has been > > > > > > asked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and to not break TDISP efforts in upcoming time > > > > > > > > > > > > > by such > > > design. > > > > > > > > > > > > > > > > > > > > > > > > Look you either stop mentioning TDISP as > > > > > > > > > > > > motivation or actually try to address it. Safe > > > > > > > > > > > > migration with TDISP is really > > > hard. > > > > > > > > > > > But that is not an excuse to say that TDISP > > > > > > > > > > > migration is not present, hence > > > > > > > > > > involve the owner device for config space access. > > > > > > > > > > > This is another hurdle added that further blocks us > > > > > > > > > > > away from > > > TDISP. > > > > > > > > > > > Hence, we donât want to take the route of involving > > > > > > > > > > > owner device for any > > > > > > > > > > config access. > > > > > > > > > > > > > > > > > > > > This "blocks" is all just wild hunches. hypervisor > > > > > > > > > > controls some aspects of TDISP devices for sure - > > > > > > > > > > maybe we actually should use pci config space as that > > > > > > > > > > is generally hypervisor > > > controlled. > > > > > > > > > Even bad to do hypercalls. > > > > > > > > > I showed you last time the role of the PCI config space > > > > > > > > > snippet from the > > > > > > spec. > > > > > > > > > > > > > > > > Yes I remember. This is just an example though. My point > > > > > > > > is maybe it is solvable maybe it is not. > > > > > > > > > > > > > > > > > Do you see we are repeating the discussion again? > > > > > > > > > > > > > > > > One of the reasons is that people bring up irrelevances. > > > > > > > > TDISP is important but has to be addressed or deferred not > > > > > > > > vaguely referred > > > to. > > > > > > > > > > > > > > So lets continue to follow the current TDISP direction of > > > > > > > not involving > > > > > > hypervisor for virtio common and device config. > > > > > > > > > > If you disagree to it, please speak now, so that we donât debate > > > > > on this > > > again in next 3 days. > > > > > Because this is the fundamental design considerations it relied on. > > > > > There is no point going forward if you want to disagree to it. > > > > > Other variants are fine, but other variants cannot be the only choice. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For example, your current patches are clearly broken for > TDISP: > > > > > > > > > > > > owner can control queue state at any time making > > > > > > > > > > > > device modify memory in any way it wants. > > > > > > > > > > > > > > > > > > > > > > > When TDISP migration is needed, the admin device can > > > > > > > > > > > be another TVM > > > > > > > > > > outside the HV scope. > > > > > > > > > > > Or an alternative would have device context > > > > > > > > > > > encrypted not visible to HV at > > > > > > > > all. > > > > > > > > > > > > > > > > > > > > Maybe. Fact remains your patches do conflict with > > > > > > > > > > TDISP and you seem to be fine with it because you have > > > > > > > > > > a hunch you can > > > fix it. > > > > > > > > > > But we can't do development based on your hunches. > > > > > > > > > > > > > > > > > > > We have different view. > > > > > > > > > My patches do not conflict with TDISP because TDISP has > > > > > > > > > clear definition of > > > > > > > > not involving hypervisor for transport. > > > > > > > > > And that part is still preserved. > > > > > > > > > Delegating the migration to another TDISP or encrypting > > > > > > > > > is yet to be > > > > > > defined. > > > > > > > > > And current patches will align to both the approaches in > future. > > > > > > > > > > > > > > > > > > So you need to re-evaluate your judgment. > > > > > > > > > > > > > > > > If you like they do not "conflict". But if used with > > > > > > > > TDISP they just make it insecure and thus completely > > > > > > > > worthless. If hypervisor can change ring state to make > > > > > > > > device poke at random guest memory then it is game over > > > > > > > > and all the effort spent was > > > security theater. > > > > > > > Not really, I proposed two options. > > > > > > > 1. delegate the task of LM to the TVM. (proposed by two cpu > vendors). > > > > > > > In this case all the infra we build here, just works fine. > > > > > > > > > > > > I think modification will be needed: currently commands are > > > > > > sent through the PF, and that is under hypervisor control. > > > > > > You should not assign PF to TVM. > > > > > > That's the point. And that's why it keeps people confused to believe > > > the current PF/adminq can work in the TDISP. > > > > > There is no confusion. > > No, when LingShan points out the conflict, you just told us it will be > addressed in the future. > Current proposal does not punch the hole in the TDISP TVM interface. > And after Michael pointed it out again, you agree than adminq can not be > part of PF in this context. > As explained when TDISP for device migration evolves, there will be other trusted entity which will do it. > And you miss the fact that admin virtqueue today can't be used to manage > the owner. > There is no such need for device migration. > > The admin queue interface ensures first step that TDISP interface is > dedicated to guest as today. > > There is no bifurcation added on the VF that needs extra mediation. > > > > > > > Yes, an admin virtio function will be there which will do the > > > > > admin > > > commands listed. > > > > > > > > So it can't be PF, so at least we need a new group type. > > > > I am inclined to then say, operate it through VF itself. > > > > > > So it exactly matches the idea of transport virtqueue (a per VF/SF one). > > > > > There is no need for transport virtqueue for VF as VF device has same > uniform principle as PF. > > I don't understand here. I have explained that you have invented a function > duplication of transport virtqueue. > No. we havenât. It is explained in other thread that functionality is different. > > If you want transport vq, please have it on the PF too. > > Nothing prevents this, actually transport virtqueue start from this. > > > And that also is not needed because there is already CVQ. > > I don't see why you keep mentioning CVQ. I don't see anyone that says > transport virtqueue is going to replace CVQ. > I explained everything in the doc. This is the Nth time being repeated... > > > > > But it still requires a PCI part to bootstrap. > > > > > > > > > > > > > > > > > > > > > > > > It also does not require any hypervisor mediation for control > plane. > > > > > > > > > > > > > > 2. Encrypt the owner device workload to be not seen by > > > > > > > hypervisor > > > > > > > > > > > > > > Both methods does not affect the current direction. > > > > > > > > > > > > > > But if we force trap+emulation, it is 100% broken for TDISP. > > > > > > > And I would not promote that. > > > > > > > > > > > > > > > But you know this, don't you? This is why you mentioned > > > > > > > > encrypting > > > device. > > > > > > > > Maybe that works. It just does not work *as is*. > > > > > > > It works as_is. But current infrastructure does not block > > > > > > > the future > > > work. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Such encryption is not possible, with the > > > > > > > > > > > trap+emulation method, where HV > > > > > > > > > > will have to decrypt the data coming over MMIO writes. > > > > > > > > > > > > > > > > > > > > I don't how what trap+emulation has to do with it. Do > > > > > > > > > > you refer to the shadow vq thing? > > > > > > > > > > > > > > > > > > The method proposed here does not hinder any TDISP > direction. > > > > > > > > > > > > > > > > direction? No, why would it. we can always add more > > > > > > > > commands that are safe for TDISP. commands you propose > > > > > > > > here are unsafe for > > > TDISP. > > > > > > > > > > > > > > > > > Without my proposal, do you have a method that does not > > > > > > > > > involve hypervisor > > > > > > > > intervention for virtio common and device config space, > > > > > > > > cvq and shadow > > > > > > vq? > > > > > > > > > If so, I would like to hear that as well because that > > > > > > > > > will align with > > > TDISP. > > > > > > > > > > > > > > > > I really did not give it much thought. I suspect for > > > > > > > > TDISP it just might be cleaner to have guest agent migrate > device. > > > > > > > > Certainly removes all > > > > > > the messy questions. > > > > > > > > That, to me impliest there needs to be a way to send > > > > > > > > migration commands through VF itself. Does this "involve > > > > > > > > hypervisor intervention"? No one should care I think. > > > > > > > Too far of the future to envision. May be yes. When such > > > > > > > platform is built, for sure whoever migrates need migrate > > > > > > > its device side > > > too. > > > > > > > Some knowledge of migration driver is needed. > > > > > > > > > > > > So TDISP migration is so far in the future you do not need to > > > > > > bother > > > about it. > > > > > > Fine. Then don't bring it up pls. > > > > > > > > > > > As long as we are aligned to the requirement that a virtio > > > > > member device is > > > mapped to the guest VM without mediating the virtio interface, I am > good. > > > > > Again, other variants are fine, but above listed mapped variant > > > > > is the > > > minimum variant needed. > > > > > > > > I think it's worth supporting this. I wouldn't call this minimum > > > > there are other approaches. And I am not so sure it's worth > > > > trying to support this in all kind of systems such as IOMMU > > > > without dirty bit support. If some old systems will need > > > > mediation, this is kind of like legacy interface. Not a big deal. > > > > > > > > > > +1 > > > > There are users with the recent cpus that may not have the IOMMU dirty > page tracking support. > > So I donât fully agree. > > There are setups that don't have SR-IOV or even PCI. > Which are those? Please expand those transports that need it. > Let's have a unified standard.
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]