Subject: Re: [virtio-comment] Re: [PATCH v1 3/8] device-context: Define the device context fields for device migration




On 10/11/2023 6:54 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Wednesday, October 11, 2023 3:38 PM

The system admin can choose to pass through only some of the devices to nested guests, so passing the PF through to the L1 guest is not a good idea, because there can be many devices still working for the host or L1.
Possible. One size does not fit all.
What I expressed are the most common scenarios that users care about.
Don't block existing use cases, don't break userspace; nesting is common.
Nothing is broken, as the virtio spec does not have any single construct to support migration.
If nesting is common, can you share the performance numbers with a real virtio device with and without 2-level nesting?
I frankly don't know what they look like.
virtio devices support nesting; I mean don't break this use case.
And end users accept the performance overhead of nesting; this is not related to this topic.


In the second use case, where one wants to bind only one member device to one VM, I think the same plumbing can be extended to have another VF take the role of the migration device instead of the owner device.
I don't see a good way to do passthrough and also do in-band migration without a lot of device-specific trap and emulation.
I also don't know the CPU performance numbers with 3 levels of nested page table translation, which to my understanding cannot be accelerated by the current CPUs.
host_PA->L1_QEMU_VA->L1_Guest_PA->L1_QEMU_VA->L2_Guest_PA and so on; there can be performance overhead, but it can be done.
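To make the cost being debated concrete, here is a minimal illustrative sketch (not from either mail or the spec; all names are invented) of a software walk that composes one lookup per nesting level in the chain above. Nothing in it is virtio-specific; it only shows why each extra nesting level adds another emulated translation step.

/* Hypothetical sketch: one translation callback per stage of the chain
 * above; each call stands for one emulated page-table/IOMMU walk. */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t addr_t;
typedef addr_t (*xlate_fn)(addr_t in);   /* one emulated translation stage */

/* Resolve a nested guest address by walking every level in turn; the cost
 * grows with the number of nesting levels once there is no hardware assist
 * beyond two-level translation. */
static addr_t resolve_nested(addr_t guest_addr, const xlate_fn *levels, size_t n)
{
    addr_t a = guest_addr;
    for (size_t i = 0; i < n; i++)
        a = levels[i](a);
    return a;   /* final host physical address */
}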

So admin vq migration still doesn't work for nesting; this is surely a blocker.
In the specific case where member devices are located at different nesting levels, it does not.
So you got the point; this series should not be merged.
What prevents you from having a peer VF take the role of the migration driver?
Basically, what I am proposing is to connect two VFs to the L1 guest. One VF is the migration driver, and the other VF is passed through to the L2 guest.
And the same scheme works.
A peer VF? A management VF? That still breaks the existing use case. And how do you transfer ownership of the L2 VF from the PF to the L1 VF?
A peer management VF which services admin commands (like the PF).
Ownership of the admin commands is delegated to the management VF.
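For illustration only, a rough sketch of what such a delegation might look like as an admin command; the command and every field in it are hypothetical and are not defined anywhere in the virtio spec.

/* Hypothetical, not in the virtio spec: an admin command by which the owner
 * (PF) would hand servicing of admin commands for a set of member VFs to a
 * peer "management" VF inside the L1 guest. */
#include <stdint.h>

#define VIRTIO_ADMIN_CMD_DELEGATE_MGMT_HYPOTHETICAL 0xff00 /* invented opcode */

struct virtio_admin_cmd_delegate_mgmt_hypothetical {
    uint16_t group_type;       /* group being delegated, e.g. an SR-IOV group */
    uint16_t mgmt_member_id;   /* member VF that becomes the management VF */
    uint64_t member_id_mask;   /* members it may manage (invented encoding) */
};

Whether such a delegation can be made secure, and how the groups stay isolated, are exactly the open questions raised below.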
Interesting, do you plan to cook a patch implementing this?
Does it really make sense?

How do you transfer the ownership?
How do you maintain a different group?
How do you isolate the groups?
How do you keep the guest or host secure?
How do you manage the overlaps?
How do you implement the hardware to support that?
How do you change the PCI routing?

On the other hand, many parts of the CPU subsystem, such as PML and the page tables, do not have N-level nesting support either.
Page tables can be emulated, as shown to you before: just PA to VA, nested PA to nested VA.
They all work on top of emulation and pay the price of emulation when nesting is done.
Maybe that is the first version for virtio too.
There is performance overhead, but it can be done.
I frankly feel that nesting support requires industry-level ecosystem support, not just in virtio.
Virtio attempting to focus on nesting while having nearly the same level of performance as bare metal seems far-fetched.
Maybe I am wrong, as we have not seen such a high-performance nested environment even with a software-based device.
What can possibly be done is:
1. Which admin commands from this series can be useful for nesting?
2. Which admin commands from the current series need extension for nesting?
3. Which admin commands do not work at all for nesting and hence need new commands?
If we can focus on those, maybe we can find a common approach that caters to both.
virtio supports nesting now; don't let your admin vq LM break this.
A new spec addition does not break existing virtio implementations in software.
Don't break nesting, again.
The new spec additions of owner and member devices do not apply to non-member and non-owner devices.
If so, with no member and no owner there is no admin vq, so this proposal doesn't make any sense?

Do you know how it works for Intel x86_64?
Can it do more than 2 levels of nested page tables? If not, what are the performance characteristics to expect?
Of course that can be done. Page tables are not a problem; there are soft MMU emulation and vIOMMU, though with performance overhead.
Due to the performance overheads, I really doubt any cloud operator would use a passthrough virtio device for any sensible workload.
But you may already know what nested performance looks like and whether it is acceptable to users.
Many tenants run their nested clusters. Don't break this.
How did a new spec addition such as the crypto device addition break the net device?
Or how does net vq interrupt moderation break existing software?
It does not.
They are driven through their own feature bits and admin command capabilities.
They do not break any existing deployments.
We are talking about nesting; don't break nesting.


