Subject: Re: [virtio-comment] RE: [PATCH V2 2/6] virtio: introduce SUSPEND bit in device status
On 11/6/2023 6:52 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Monday, November 6, 2023 2:51 PM
On 11/6/2023 12:07 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Monday, November 6, 2023 9:00 AM
On 11/3/2023 11:54 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Friday, November 3, 2023 8:25 PM
On 11/3/2023 7:35 PM, Parav Pandit wrote:
From: Zhu Lingshan <lingshan.zhu@intel.com>
Sent: Friday, November 3, 2023 4:05 PM
This patch introduces a new status bit in the device status: SUSPEND. This SUSPEND bit can be used by the driver to suspend a device, in order to stabilize the device states and virtqueue states. Its main use case is live migration.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
You constantly complained that whatever was proposed using admin commands method in [1] must work for passthrough and non-passthrough.
And halfway in the discussion you propose a method after learning all the limitations of in-band, you propose a solution that only works for non-passthrough mode.
You asked someone to have comprehensive proposal and when it comes to you following it, you just don't.
not sure what you are talking about.
And have most shallow commit message to not even mention it. Please be consistent in design approach. And if you don't want to be, stop asking others.
this SUSPEND/RESUME doesn't change since the RFC series, how can it not be inconsistent???
This is not the way TC collaboration works. I probably shouldn't even expect this from you.
Your proposal does not cover both the use cases of passthrough and non-passthrough.
Yet you kept demanding them for others. This is just wrong. I am aware that both models have technical pros and cons.
Why this doesn't work?
the device status byte has been working for many years, and do you know when guest freeze, the hypervisor owns the device????
When the guest is not frozen and during the pre-copy phase, hypervisor needs to access the device (context, dirty pages).
How does it work if the guest owns the device?
Have you seen PASID there?
PASID does not help because as explained virtio common config space and device specific config space is owned by the guest driver. Secondly PASID space is also owned by the guest driver.
hypervisor sets a PASID to isolate the cap.
as explained before, PCI and CPU supports atomic read/write. Please refer to PCI spec and CPU ISA.
[1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00472.html
Please don't be so emotional and please be professional. Why this solution can not work for pass-through? Do you know the device ownership will be transferred to the hypervisor when guest suspended in live migration?
I explained 5 reasons why it does not work in previous reply. As the word indicates "live migration", the hypervisor needs to access the device when it is "live" (not just after).
Hence, passthrough mode must be able to capture the state of the device and dirty pages database when it's live (and after the source is suspended).
No, the hypervisor should only collect dirty pages when the device alive.
It is needed during both the times. When the device and guest is live during pre-copy phase. And after the device is frozen, to get the final round of pages.
With PASID, dirty page tracking facility can be isolated from the guest, means the hypervisor owns this facility. So the hypervisor can collect the dirty pages. When the device suspended, it should report the last round of dirty pages through dirty page tracking facility as expected. This can work, right?
Unfortunately no, as non atomic bitmap cannot reside in the host memory,
And whatever is in the device gets reset on device reset and/or FLR. So the dirty map detail is lost. Similarly the device context is also lost on these two events triggered by guest.
we explained before, when reset, the device should clear everything.
As you can see, the dirty page tracking facility has a PASID for isolation. But still, the question is, we should better use platform dirty page tracking
Nothing to do with PASID, as PASID is owned by the guest.
It looks you don't know how PASID work. Host can setup PASID to isolate some facilities, right?
There are few limitations with PASID.
a. All platforms do not have PASID and
As we have explained many times, this is a basic facility, and the implementation is transport-specific. We gave an example of a PCI implementation, and PCI supports PASID, right?
With a PASID, a cap can be considered to be placed in another logical address space, which is not accessible to the guest.
b. I explained above PASID do not work always as PASID only bifurcates DMA not the device _functionality_.
c. PASID to be available to guest as_is what is present on the device
host hypervisor sets the PASID, transparent to the guest.
Then suspend the device after guest freeze, to stabilize the device status, then read the status. How can you say this does not work???
I explained above.
see above
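For readers following the "suspend after guest freeze, then read the status" sequence argued over above, here is a minimal C sketch of that flow. It is an illustration under assumptions, not the patch's definition: the `VIRTIO_STATUS_SUSPEND` value is a placeholder, `struct mock_dev` and its accessors are hypothetical stand-ins for a real transport, and the re-read-until-set polling is one plausible reading of "then read the status".

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: placeholder bit value chosen for illustration only;
 * the value assigned by the patch is not quoted in this thread. */
#define VIRTIO_STATUS_SUSPEND 0x10

/* Hypothetical mock device standing in for a transport's status register. */
struct mock_dev {
    uint8_t status;      /* value the device currently reports */
    uint8_t pending;     /* bits written by the driver, not yet reflected */
    int reads_until_ack; /* status reads before pending bits are reflected */
};

/* Read the device status; the mock reflects pending bits after a delay. */
static uint8_t dev_read_status(struct mock_dev *d)
{
    if (d->pending && d->reads_until_ack-- <= 0) {
        d->status |= d->pending;
        d->pending = 0;
    }
    return d->status;
}

/* Write the device status; newly set bits become pending in the mock. */
static void dev_write_status(struct mock_dev *d, uint8_t v)
{
    d->pending = (uint8_t)(v & ~d->status);
}

/* Set SUSPEND, then re-read the status until the device reports it.
 * Returns false if the bit is not observed within max_polls reads. */
static bool suspend_device(struct mock_dev *d, int max_polls)
{
    uint8_t s = dev_read_status(d);

    dev_write_status(d, s | VIRTIO_STATUS_SUSPEND);
    for (int i = 0; i < max_polls; i++) {
        if (dev_read_status(d) & VIRTIO_STATUS_SUSPEND)
            return true;
    }
    return false;
}
```

The sketch only covers the post-freeze step. It says nothing about who owns the status register while the guest is still running, which is the point in dispute in this thread.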