OASIS Mailing List Archives

virtio-comment message


Subject: RE: [virtio-comment] Re: [PATCH v1 3/8] device-context: Define the device context fields for device migration



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, October 17, 2023 7:41 AM
> 
> On Fri, Oct 13, 2023 at 2:40 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Friday, October 13, 2023 6:48 AM
> > >
> > > On Thu, Oct 12, 2023 at 7:37 PM Parav Pandit <parav@nvidia.com>
> > > wrote:
> > > > As Michael said, software based nesting is used..
> > >
> > > I've pointed out in another thread when hardware has less
> > > abstraction level than nesting, trap/emulation is a must.
> > >
> > > > See if actual hw based devices can implement it or not. Many
> > > > components of cpu cannot do N level nesting either, but may be
> > > > virtio can.
> > > > I don't know how yet.
> > >
> > > I would not repeat the lessons given by Gerald J. Popek and Robert P.
> > > Goldberg[1] in 1976, but I think you miss a lot of fundamental
> > > things in the methodology of virtualization.
> > Weekend is coming. I will read it.
> >
> > > For example, nesting is a very important criteria to examine whether
> > > an architecture is well designed for virtualization.
> > >
> >
> > In my reading of a leading OS vendor's documentation, I learnt that the OS
> > vendor does not recommend nested virtualization for production at [1].
> > Snippet:
> > "In addition, Red Hat does not recommend using nested virtualization in
> > production user environments, due to various limitations in functionality.
> > Instead, nested virtualization is primarily intended for development and
> > testing scenarios."
> >
> > [1]
> > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_virtualization/creating-nested-virtual-machines_configuring-and-managing-virtualization
> >
> > A 2nd leading hypervisor lists nested virtualization as not to be used for
> > "performance sensitive applications".
> 
> Another concept shift.
> 
> I'm not going to comment on the choice for individual distros. But the points are
> whether we can deploy a nesting virtualization easily under a specific hardware
> architecture. In this regard, the above is a good example.
>
And most such nesting seems to be for non-production use, helpful for debugging and the like.

And as far as I understand, outside of virtio, nesting beyond 2 levels does not work without trap + emulation.
Intel PML is an example. How many levels of nesting does the hw do for PML?
 
> Again, just a simple google will tell you the instances that support nesting have
> been available for almost all the major cloud vendors for a while.
> 
From cpu data sheets, it does not appear that hw is able to do such nesting.

> >
> > I want to repeat and emphasize that I am not ignoring the nested case.
> >
> > An extension for nesting would be that the VF presented to the guest, itself
> > with SR-IOV capability, can work as-is as proposed here.
> 
> How can a VF have the SR-IOV capability?
>
One option is trap + emulation.
The second is to have it actually on the VF, which would follow the true definition of nesting.
 
> > Michael presented the idea of the dummy PF, which is to represent the VF as
> > dummy PF which can do the SR-IOV with one VF.
> 
> Why do we need the complicated SR-IOV emulation at the nesting level?
You have to complicate it one way or the other.
And here it does not look complicated, because it uses only existing, already defined constructs available at the VMM and GVM level.
It follows both of the principles listed in the paper you cited, i.e. (a) efficiency and (b) the equivalence property.

> How can you make sure such a design can result in a live migration to be done at
> any levels?
>
I will propose a design that is practical and has a use case.
I will not propose theoretical work that no one will implement.
 
> E.g in LN, you had a PF and a VF. How to migrate the PF to this level?
> You want two PFs in the L(N-1) level?
> 
Likely yes, as a dummy PF with emulated caps.

> > You need the support from the platform too, I guess TC can extend it.
> > Maybe a different interface is more suitable for the nested case, which does
> > not have performance needs.
> 
> I disagree, it's about if the performance can satisfy the requirement at N level.
> 
> >
> > How about a nested user to have AQ located on the VF so that mediation sw
> > can operate admin commands over self?
> 
> I would go with such complicated architecture.
>
You likely meant you wouldn't, right?

Also, following your paper, which clearly highlights the case where "execution of privileged instruction in vm occurs, which would have effect of changing machine resources":
In the passthrough case it is not a privileged instruction, because the resource is not composed by the machine; it is already done by the device.
Hence, for such a cvq operation, a trap is not to be done for a member virtio device.

It would make sense to trap the cvq for a non virtio device, where the cvq is composed as part of the machine resource.
 
> > Device mode commands will not be applicable there, instead some other
> > things to be done.
> > So non passthrough mode software possibly can make use of it?
> 
> It would be a great burden if you
> 
> 1) use passthrough in L0
> 2) use trap/emulation in L(N+1)
>
How is this different from Intel PML hw?
 
> >
> > > That is to say for any CPU/hypervisor vendors, the architecture
> > > should be designed to run any levels of nesting instead of just an
> > > awkward 2 levels (but what you proposed can not work for even 2).
> > Huh, some missing text for a corner case, used to make a claim of
> > _not_working_, is not a healthy discussion.
> >
> > > For x86 and KVM, any level of
> > > nesting has been done for about 10 years ago.
> > >
> > I didn't find hw for PML support in x86 for N or 3 level nesting. Did I miss?
> > I didn't find hw for nested page tables up to N level walking on the PCIe
> > read/writes in any cpu. Did I miss?
> 
> You need first asking why it is a must to achieve nested virtualization. All of
> those obstacles come only if you want to use "passthrough" for any levels.
> 
> > Have you seen nesting in hw works at N level?
> 
> Again, hardware can't have endless resources for endless levels. 
Can you please list two or three features that are implemented in hw for more than 2 levels?

> Trap and
> emulation is a must for achieving nesting virtualization. If you try to invent a
> passthrough method that can work for any level, you will probably fail

It at least follows the design principle of the paper you suggested.
I don't see the point of designing something for N-level nesting in the first go when the rest of the ecosystem is not there to support it at the hw level.

