virtio-comment message

Subject: Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES

From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
To: Parav Pandit <parav@nvidia.com>
Date: Mon, 15 Jan 2024 08:20:45 +0000

On 2024/1/15 15:55, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 1:18 PM
>>
>> On 2024/1/15 15:37, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 1:03 PM
>>>>
>>>> On 2024/1/12 17:44, Parav Pandit wrote:
>>>>>
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Friday, January 12, 2024 2:55 PM
>>>>>>
>>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>>>>>
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>>>>>> Hi Jiqian,
>>>>>>>>>
>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>> Sorry to reply late.
>>>>>>>>>> I don't know if you still remember this problem, let me briefly
>>>>>>>>>> descript
>>>>>> it.
>>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>>>>>> Currently on Xen, if we start a guest through qemu with
>>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
>>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
>>>>>>>>>> It just shown a black
>>>>>>>> screen.
>>>>>>>>>> That is because during suspending, guest called into qemu and
>>>>>>>>>> qemu destroyed all display resources and reset renderer. This
>>>>>>>>>> made the display gone after guest resumed.
>>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>>>>>> suspending, it will notify qemu and set
>>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
>>>>>>>>>> resources, and when resuming, guest will notify qemu to set
>>>>>>>>>> parameter to 0, and then qemu will keep the normal actions.
>>>>>>>>>> That can
>>>>>> help guest's display come back.
>>>>>>>>>> When I upstream above implementation, Parav and MST suggest me
>>>> to
>>>>>>>> use
>>>>>>>>>> the PM capability to fix this problem instead of adding a new
>>>>>>>>>> command or state bit.
>>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be
>>>>>>>>>> used to solve this problem.
>>>>>>>>>> The reason is:
>>>>>>>>>> during guest suspending, it will write D3 state through PM cap,
>>>>>>>>>> then I can save the resources of virtio-gpu on Qemu side(set
>>>>>>>>>> preserver_resource to 1), but during the process of resuming,
>>>>>>>>>> the state of PM cap will be cleared by qemu
>>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that
>>>>>>>>>> when guest reads state from PM cap, it will find the virtio-gpu
>>>>>>>>>> has already been D0 state,
>>>>>>>>> This behavior needs to be fixed. As the spec listed out, " This
>>>>>>>>> 2-bit field is
>>>>>>>> used both to determine the current power state of a Function"
>>>>>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset in
>>>>>>>> current qemu code? Why?
>>>>>>> Because PM is implementing the support from D3->d0 transition and
>>>>>>> if it
>>>>>> device in D3, the device needs to respond that it is in D3 to match
>>>>>> the PCI spec.
>>>>>>>
>>>>>>>> Shouldn't all device states, including PM registers, be reset
>>>>>>>> during the process of virtio-gpu reset?
>>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
>>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
>>>>>>> other
>>>>>> things listed in pci spec.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So device needs to return D3 in the PowerState field.  This is must.
>>>>>>>> But current behavior is a kind of soft reset, I
>>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu
>> is
>>>>>>>> reasonable.
>>>>>>> What you described is, you need to do No_Soft_Reset=1, so please
>>>>>>> set the
>>>>>> cap accordingly to achieve the restore.
>>>>>>>
>>>>>>> Also in your case the if the QEMU knows that it will have to
>>>>>>> resume the
>>>>>> device soon.
>>>>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
>>>>>>> In such case, the extra register wouldnât be necessary.
>>>>>>> Having PM control register is anyway needed to drive the device
>> properly.
>>>>>> Even as you said, the reset behavior of PM in the current QEMU code
>>>>>> is incorrect, and even if it is modified, it also cannot fix this problem with
>> S3.
>>>>> You meant D3?
>>>>>
>>>>>> Because there is twice device reset during resuming.
>>>>>> First is when we trigger a resume command to qemu, qemu will reset
>>>>>> device(qemu_system_wakeup-> qemu_devices_reset->
>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
>>>>> A device implementation that supports PM should not reset the device.
>>>> Please fix it.
>>>> Do you mean the devices_reset behavior of qemu shouldn't happen when
>>>> a device has PM cap?
>>> Right. Device reset should not happen on PM capable device that offers
>> no_soft_reset==1.
>> But current qemu code, no_soft_reset==0.
>>
> So please fix the qemu when you are adding the functionality of restoring the device context on PM.
My patch didn't add the functionality of restoring the device context on PM.
Is no_soft_reset must set 1 when a device has PM cap? It should also be allowed for situations that are not equal to 1, right?
There should be devices that has PM cap but no_soft_reset is 0 because that devices can't store and restore the device context on d3->d0 transition, like current qemu code.
So the fix should be feasible for all situations that whether the no_soft_reset is 1 or not.
If use PM cap, not every situation can solve current problem (display resources of GPU will be destroyed).

> 
>>>
>>>> Or just the state of PM cap shouldn't be reset?
>>>>
>>>>>
>>>>>> After the first time, then second time happens in guest kernel,
>>>>>> virtio_device_restore-> virtio_reset_device. But the PM state
>>>>>> changes
>>>>>> (D3 to
>>>>>> D0) happens between those two, so the display resources will be
>>>>>> still destroyed by the second time of device reset.
>>>>> The driver that understands that device supports PM, will skip the
>>>>> reset
>>>> steps.
>>>> But it happens for now.
>>> This should be enhanced in the guest driver.
>>>
>>>> In current qemu codes, the bit PCI_PM_CTRL_NO_SOFT_RESET of
>>>> virtio-pci devices isn't set, is it a bug? It means the no_soft_reset==0.
>>>>
>>> It should be set when/if the device can store and restore the device context
>> on d3->d0 transition.
>> I don't know why it doesn't be set in current qemu code.
> May be because QEMU is not restoring the context on PM commands.
> 
>> What do you mean "the device can store and restore the device context on
>> d3->d0 transition"? The real physical device or the simulated devices on qemu
>> side?
> Does not matter, whichever implementation of device is doing the PM should set it.
> In your case it is qemu.
> 
>> How do I confirm it?
> If you mean "I" means guest driver, it will just work as long as it refers to the PM capabilities.
> 
>>
>>>
>>>>> Hence the 2nd reset will also not occur.
>>>>> Therefore, device state will be restored.
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

Follow-Ups:
- RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>

References:
- Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
- RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>
- Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
- RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>
- Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
- RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>
- Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
- RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>
- Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: "Chen, Jiqian" <Jiqian.Chen@amd.com>
- RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  - From: Parav Pandit <parav@nvidia.com>