

Subject: RE: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration


> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, November 21, 2023 10:55 AM
> 
> On Fri, Nov 17, 2023 at 8:03 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Friday, November 17, 2023 5:13 PM
> > >
> > > On Fri, Nov 17, 2023 at 11:20:14AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Friday, November 17, 2023 4:41 PM
> > > > >
> > > > > On Fri, Nov 17, 2023 at 10:20:45AM +0000, Parav Pandit wrote:
> > > > > >
> > > > > >
> > > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > Sent: Friday, November 17, 2023 3:38 PM
> > > > > > >
> > > > > > > On Wed, Nov 15, 2023 at 05:39:43PM +0000, Parav Pandit wrote:
> > > > > > > > > >
> > > > > > > > > > Additionally, if the hypervisor has put the trap on virtio
> > > > > > > > > > config, and because the member device already has the
> > > > > > > > > > interface for virtio config,
> > > > > > > > > >
> > > > > > > > > > the hypervisor can directly write/read from the virtual
> > > > > > > > > > config to the member's config space, without going through
> > > > > > > > > > the device context, right?
> > > > > > > > >
> > > > > > > > > It can do it, or it can choose not to. I don't see how
> > > > > > > > > it is related to the discussion here.
> > > > > > > > >
> > > > > > > > It is. I don't see a point of hypervisor not using the
> > > > > > > > native interface provided
> > > > > > > by the member device.
> > > > > > >
> > > > > > > So for example, it seems reasonable for a member to support
> > > > > > > both the existing pci register interface for compatibility and
> > > > > > > a future DMA based one for scale. In such a case, it seems
> > > > > > > possible that DMA will expose more features than pci. And then
> > > > > > > a hypervisor might decide to use that in preference to pci
> > > > > > > registers.
> > > > > >
> > > > > > We don't find it right to involve owner device for mediating
> > > > > > at current scale
> > > > >
> > > > > In this model, device will be its own owner. Should not be a problem.
> > > > >
> > > > I didn't understand the above comment.
> > >
> > > We'd add a new group type "self". You can then send admin commands
> > > through VF itself not through PF.
> > >
> > How? The device is owned by the guest. Across FLR and device reset, the
> > admin command cannot be sent reliably.
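For clarity, what this would look like is roughly the sketch below. It reuses the existing general administration command layout; the SELF group type value is purely hypothetical and is not defined anywhere today:

    #include <stdint.h>

    /* Device-readable header of a general administration command
     * (multi-byte fields are little-endian on the wire). */
    struct virtio_admin_cmd_hdr {
            uint16_t opcode;
            uint16_t group_type;      /* 1 = SR-IOV group today */
            uint8_t  reserved1[12];
            uint64_t group_member_id; /* would be ignored for a "self" group */
    };
    /* ...followed by command-specific data, then a device-writable part
     * carrying status, status_qualifier and the command-specific result. */

    #define VIRTIO_ADMIN_GROUP_TYPE_SRIOV 1 /* existing */
    #define VIRTIO_ADMIN_GROUP_TYPE_SELF  2 /* hypothetical, not in the spec */

The open issue stays the same: such a command would have to be issued through the member's own admin queue, which the guest can take away at any time through FLR or device reset.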
> >
> > >
> > > > > > and to not break the upcoming TDISP efforts with such a design.
> > > > >
> > > > > Look you either stop mentioning TDISP as motivation or actually
> > > > > try to address it. Safe migration with TDISP is really hard.
> > > > But that is not an excuse to say that TDISP migration is not
> > > > present, hence
> > > involve the owner device for config space access.
> > > > This is another hurdle added that further blocks us away from TDISP.
> > > > Hence, we don't want to take the route of involving owner device
> > > > for any
> > > config access.
> > >
> > > This "blocks" is all just wild hunches. hypervisor controls some
> > > aspects of TDISP devices for sure - maybe we actually should use pci
> > > config space as that is generally hypervisor controlled.
> > It is even bad to do hypercalls.
> > I showed you last time the snippet from the spec about the role of the
> > PCI config space.
> > Do you see we are repeating the discussion again?
> >
> > >
> > > > > For example, your current patches are clearly broken for TDISP:
> > > > > owner can control queue state at any time making device modify
> > > > > memory in any way it wants.
> > > > >
> > > > When TDISP migration is needed, the admin device can be another
> > > > TVM
> > > outside the HV scope.
> > > > Or an alternative would be to have the device context encrypted, not
> > > > visible to the HV at all.
> > >
> > > Maybe. Fact remains your patches do conflict with TDISP and you seem
> > > to be fine with it because you have a hunch you can fix it. But we
> > > can't do development based on your hunches.
> > >
> > We have a different view.
> > My patches do not conflict with TDISP because TDISP has a clear
> > definition of not involving the hypervisor for transport.
> > And that part is still preserved.
> > Delegating the migration to another TDISP or encrypting is yet to be defined.
> > And current patches will align to both the approaches in future.
> >
> > So you need to re-evaluate your judgment.
> >
> > >
> > > > Such encryption is not possible with the trap+emulation method, where
> > > > the HV will have to decrypt the data coming over MMIO writes.
> > >
> > > I don't know what trap+emulation has to do with it. Do you refer to
> > > the shadow vq thing?
> >
> > The method proposed here does not hinder any TDISP direction.
> >
> > Without my proposal, do you have a method that does not involve
> hypervisor intervention for virtio common and device config space, cvq and
> shadow vq?
> > If so, I would like to hear that as well because that will align with TDISP.
> 
> So this is what you said:
> 
> 1) TDISP would not do mediation
> 2) registers don't scale
> 
> This is exactly what transport virtqueue did. Isn't it?
> 
No.
CVQ is doing both of them today, uniformly across PF and VF.
Future SIOV devices will be able to do this as well.
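For reference, the command format CVQ already uses, identically on a PF and a VF, is roughly the following (a sketch following the virtio-net control virtqueue layout; no registers and no owner-device mediation are involved):

    #include <stdint.h>

    /* Each control-queue buffer carries a small header, command-specific
     * data, and one device-writable ack byte. */
    struct virtio_net_ctrl_hdr {
            uint8_t class;    /* command class, e.g. RX mode, MAC, VLAN, MQ */
            uint8_t command;  /* command within that class */
    };
    /* ...device-readable command-specific data follows the header... */
    #define VIRTIO_NET_OK  0  /* values of the device-writable ack byte */
    #define VIRTIO_NET_ERR 1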

> >
> > > I am guessing modern platforms with TDISP support are likely to also
> > > support dirty bit in the IOMMU.
> > >
> > It will be, some day.
> 
> Dirty bit is far more realistic than TDISP in the short term.
> 
> >
> > >
> > > > > > And for future scale, having a new SIOV interface, which has its
> > > > > > own direct interface to the device, makes more sense.
> > > > > >
> > > > > > I finally captured all past discussions in the form of a FAQ at [1].
> > > > > >
> > > > > > [1] https://docs.google.com/document/d/1Iyn-l3Nm0yls3pZaul4lZiVj8x1s73Ed6rOsmn6LfXc/edit?usp=sharing
> > > > >
> > > > > Yea skimmed that, "Cons: None". Are you 100% sure? Anyway,
> > > > > discussion will take place on the mailing list please.
> > > >
> > > > We cannot keep discussing the register interface every week.
> > > > I remember we have discussed this many times already in the
> > > > following series.
> > > >
> > > > 1. legacy series
> 
> How can this be supported in TDISP then?
> 
> > > > 2. tvq v4 series
> > > > 3. dynamic vq creation series
> > > > 4. again during suspend series under tvq head
> > > > 5. right now
> > > > 6. Maybe more that I forgot.
> > > >
> > > > I captured all the directions and options in the doc. One can refer
> > > > to it when those questions arise.
> > > > If we don't work cohesively, repeating the same reasoning does not help.
> > >
> > > It's still the same too, doc or no doc. You want to build a device
> > > without registers, fine, but don't force it down everyone's throat.
> > I don't see any compelling reason for inventing a new method, really.
> 
> New requests/platforms come for sure, and virtio supports various transports.
> 
> For example, there's a request to support PCI endpoint devices.
What are PCI endpoint devices? Do they have a new device type?

> 
> > Nor continuing in register mode.
> 
> Most virtio devices are implemented in software. 
We see it differently in the field and in the virtio charter since 2021.

> And we have pure MMIO
> based transport now which is implemented in registers only.
> 
> > Virtio already has VQ.
> > If CVQ is so problematic, one should put everything on registers and not
> > apply double standards.
> 
> I don't think there's anyone who says CVQ is problematic.
> 
Ok, then let's stop this endless debate.
Everyone is using CVQ; let's continue to use it.

> >
> > I captured all the reasoning and thoughts. I don't have much to say in
> support of infinite register scale.
> >
> > People who want to push SIOV do not show a single performance reason
> > why SIOV should be done.
> > I have upstreamed SIOVs in Linux as SFs without PASID, and in all our scale
> tests, before the device chokes, the system chokes.
> >
> > So when someone pushes the SIOV series, I will be the first one interested in
> reading the performance numbers to proceed with patches.
> >
> > > And now, with 8 MBytes of on-device memory that's needed for migration
> > > and that's apparently fine, I am even less interested in saving 256
> > > bytes for config space.
> >
> > Again, not the right comparison.
> > When and how to use the 256 bytes matters.
> 
> Do you know how much the config has grown in the past years since 1.0?
> 
Very little, and there is no point in deviating from the design now, for device migration or otherwise.

> Virtio should be easy to implement in everything from:
> 
And it is already there.

> 1) software device to hardware device
> 2) embedded to server
> 
> You can't say that, e.g., migration is needed in all of the environments.
Which line in the patch said this?
I specifically asked not to build a transport vq because efficiency is needed on the PFs too.

> 
> Thanks
> 
> > I haven't come across any device that prefers infinite register scale.

