OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration


On Fri, Nov 17, 2023 at 11:09âPM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Nov 17, 2023 at 02:51:04PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Friday, November 17, 2023 7:24 PM
> > >
> > > On Fri, Nov 17, 2023 at 12:46:21PM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Friday, November 17, 2023 6:00 PM
> > > > > To: Parav Pandit <parav@nvidia.com>
> > > > >
> > > > > On Fri, Nov 17, 2023 at 12:02:49PM +0000, Parav Pandit wrote:
> > > > > >
> > > > > >
> > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > Sent: Friday, November 17, 2023 5:13 PM
> > > > > > >
> > > > > > > On Fri, Nov 17, 2023 at 11:20:14AM +0000, Parav Pandit wrote:
> > > > > > > >
> > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > > Sent: Friday, November 17, 2023 4:41 PM
> > > > > > > > >
> > > > > > > > > On Fri, Nov 17, 2023 at 10:20:45AM +0000, Parav Pandit wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > > > > Sent: Friday, November 17, 2023 3:38 PM
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Nov 15, 2023 at 05:39:43PM +0000, Parav Pandit wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Additionally, if hypervisor has put the trap on
> > > > > > > > > > > > > > virtio config, and because the memory device
> > > > > > > > > > > > > > already has the interface for virtio config,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hypervisor can directly write/read from the
> > > > > > > > > > > > > > virtual config to the member's
> > > > > > > > > > > > > config space, without going through the device context, right?
> > > > > > > > > > > > >
> > > > > > > > > > > > > If it can do it or it can choose to not. I don't see
> > > > > > > > > > > > > how it is related to the discussion here.
> > > > > > > > > > > > >
> > > > > > > > > > > > It is. I donât see a point of hypervisor not using the
> > > > > > > > > > > > native interface provided
> > > > > > > > > > > by the member device.
> > > > > > > > > > >
> > > > > > > > > > > So for example, it seems reasonable to a member
> > > > > > > > > > > supporting both existing pci register interface for
> > > > > > > > > > > compatibility and the future DMA based one for scale. In
> > > > > > > > > > > such a case, it seems possible that DMA will expose more
> > > > > > > > > > > features than pci. And then a hypervisor might decide to
> > > > > > > > > > > use
> > > > > > > > > that in preference to pci registers.
> > > > > > > > > >
> > > > > > > > > > We donât find it right to involve owner device for
> > > > > > > > > > mediating at current scale
> > > > > > > > >
> > > > > > > > > In this model, device will be its own owner. Should not be a problem.
> > > > > > > > >
> > > > > > > > I didnât understand above comment.
> > > > > > >
> > > > > > > We'd add a new group type "self". You can then send admin
> > > > > > > commands through VF itself not through PF.
> > > > > > >
> > > > > > How? The device is owned by the guest. FLR and device reset cannot
> > > > > > send the
> > > > > admin command reliably.
> > > > >
> > > > > It's of the "it hurts when I do this - don't do this then" category.
> > > > >
> > > > it is donât do medication category, yes due all this weirdness that has been
> > > asked.
> > > >
> > > > >
> > > > > > >
> > > > > > > > > > and to not break TDISP efforts in upcoming time by such design.
> > > > > > > > >
> > > > > > > > > Look you either stop mentioning TDISP as motivation or
> > > > > > > > > actually try to address it. Safe migration with TDISP is really hard.
> > > > > > > > But that is not an excuse to say that TDISP migration is not
> > > > > > > > present, hence
> > > > > > > involve the owner device for config space access.
> > > > > > > > This is another hurdle added that further blocks us away from TDISP.
> > > > > > > > Hence, we donât want to take the route of involving owner
> > > > > > > > device for any
> > > > > > > config access.
> > > > > > >
> > > > > > > This "blocks" is all just wild hunches. hypervisor controls some
> > > > > > > aspects of TDISP devices for sure - maybe we actually should use
> > > > > > > pci config space as that is generally hypervisor controlled.
> > > > > > Even bad to do hypercalls.
> > > > > > I showed you last time the role of the PCI config space snippet from the
> > > spec.
> > > > >
> > > > > Yes I remember. This is just an example though. My point is maybe it
> > > > > is solvable maybe it is not.
> > > > >
> > > > > > Do you see we are repeating the discussion again?
> > > > >
> > > > > One of the reasons is that people bring up irrelevances. TDISP is
> > > > > important but has to be addressed or deferred not vaguely referred to.
> > > >
> > > > So lets continue to follow the current TDISP direction of not involving
> > > hypervisor for virtio common and device config.
> >
> > If you disagree to it, please speak now, so that we donât debate on this again in next 3 days.
> > Because this is the fundamental design considerations it relied on.
> > There is no point going forward if you want to disagree to it.
> > Other variants are fine, but other variants cannot be the only choice.
> >
> > > >
> > > > >
> > > > > > >
> > > > > > > > > For example, your current patches are clearly broken for TDISP:
> > > > > > > > > owner can control queue state at any time making device
> > > > > > > > > modify memory in any way it wants.
> > > > > > > > >
> > > > > > > > When TDISP migration is needed, the admin device can be
> > > > > > > > another TVM
> > > > > > > outside the HV scope.
> > > > > > > > Or an alternative would have device context encrypted not
> > > > > > > > visible to HV at
> > > > > all.
> > > > > > >
> > > > > > > Maybe. Fact remains your patches do conflict with TDISP and you
> > > > > > > seem to be fine with it because you have a hunch you can fix it.
> > > > > > > But we can't do development based on your hunches.
> > > > > > >
> > > > > > We have different view.
> > > > > > My patches do not conflict with TDISP because TDISP has clear
> > > > > > definition of
> > > > > not involving hypervisor for transport.
> > > > > > And that part is still preserved.
> > > > > > Delegating the migration to another TDISP or encrypting is yet to be
> > > defined.
> > > > > > And current patches will align to both the approaches in future.
> > > > > >
> > > > > > So you need to re-evaluate your judgment.
> > > > >
> > > > > If you like they do not "conflict".  But if used with TDISP they
> > > > > just make it insecure and thus completely worthless.  If hypervisor
> > > > > can change ring state to make device poke at random guest memory
> > > > > then it is game over and all the effort spent was security theater.
> > > > Not really, I proposed two options.
> > > > 1. delegate the task of LM to the TVM. (proposed by two cpu vendors).
> > > > In this case all the infra we build here, just works fine.
> > >
> > > I think modification will be needed: currently commands are sent through the
> > > PF, and that is under hypervisor control.
> > > You should not assign PF to TVM.

That's the point. And that's why it keeps people confused to believe
the current PF/adminq can work in the TDISP.

> > Yes, an admin virtio function will be there which will do the admin commands listed.
>
> So it can't be PF, so at least we need a new group type.
> I am inclined to then say, operate it through VF itself.

So it exactly matches the idea of transport virtqueue (a per VF/SF one).

But it still requires a PCI part to bootstrap.

>
>
> > >
> > > > It also does not require any hypervisor mediation for control plane.
> > > >
> > > > 2. Encrypt the owner device workload to be not seen by hypervisor
> > > >
> > > > Both methods does not affect the current direction.
> > > >
> > > > But if we force trap+emulation, it is 100% broken for TDISP.
> > > > And I would not promote that.
> > > >
> > > > > But you know this, don't you? This is why you mentioned encrypting device.
> > > > > Maybe that works. It just does not work *as is*.
> > > > It works as_is. But current infrastructure does not block the future work.
> > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > Such encryption is not possible, with the trap+emulation
> > > > > > > > method, where HV
> > > > > > > will have to decrypt the data coming over MMIO writes.
> > > > > > >
> > > > > > > I don't how what trap+emulation has to do with it. Do you refer
> > > > > > > to the shadow vq thing?
> > > > > >
> > > > > > The method proposed here does not hinder any TDISP direction.
> > > > >
> > > > > direction? No, why would it. we can always add more commands that
> > > > > are safe for TDISP. commands you propose here are unsafe for TDISP.
> > > > >
> > > > > > Without my proposal, do you have a method that does not involve
> > > > > > hypervisor
> > > > > intervention for virtio common and device config space, cvq and shadow
> > > vq?
> > > > > > If so, I would like to hear that as well because that will align with TDISP.
> > > > >
> > > > > I really did not give it much thought.  I suspect for TDISP it just
> > > > > might be cleaner to have guest agent migrate device. Certainly removes all
> > > the messy questions.
> > > > > That, to me impliest there needs to be a way to send migration
> > > > > commands through VF itself. Does this "involve hypervisor
> > > > > intervention"? No one should care I think.
> > > > Too far of the future to envision. May be yes. When such platform is
> > > > built, for sure whoever migrates need migrate its device side too.
> > > > Some knowledge of migration driver is needed.
> > >
> > > So TDISP migration is so far in the future you do not need to bother about it.
> > > Fine. Then don't bring it up pls.
> > >
> > As long as we are aligned to the requirement that a virtio member device is mapped to the guest VM without mediating the virtio interface, I am good.
> > Again, other variants are fine, but above listed mapped variant is the minimum variant needed.
>
> I think it's worth supporting this. I wouldn't call this minimum there
> are other approaches.  And I am not so sure it's worth trying
> to support this in all kind of systems such as IOMMU without
> dirty bit support. If some old systems will need mediation,
> this is kind of like legacy interface. Not a big deal.
>

+1

Thanks



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]