
virtio-dev message



Subject: Re: [virtio-dev] RE: [PATCH v2] Add device reset timeout field



On 2021/10/15 12:36, Parav Pandit wrote:

From: Michael S. Tsirkin <mst@redhat.com>
Sent: Friday, October 15, 2021 3:59 AM

On Thu, Oct 14, 2021 at 05:35:37PM +0000, Parav Pandit wrote:
Hi Michael, Cornelia,

From: Parav Pandit
Sent: Tuesday, October 12, 2021 2:42 PM

From: Michael S. Tsirkin <mst@redhat.com>
Sent: Tuesday, October 12, 2021 2:32 PM

On Tue, Oct 12, 2021 at 08:51:34AM +0000, Parav Pandit wrote:

From: Michael S. Tsirkin <mst@redhat.com>
Sent: Monday, October 11, 2021 9:30 PM

On Mon, Oct 11, 2021 at 03:44:14PM +0000, Parav Pandit wrote:
This is unlikely to work until the reset is completed, because a real device
implementing this would prefer to do it in fw for the 1000 virtio devices
sitting on the physical card.
And it is very much driven by such implementation at the device level.
So it cannot update the counter value if reset is not completed for the device.
I think a read-only device reset timeout is the most elegant option during the
device initialization phase; it eliminates the infinite loop of today.

Why can't a driver just go ahead and do a timeout regardless?
OK, let's consider this thought exercise. What timeout value will the driver
choose if the device doesn't specify one?
I explained in the previous thread, and you acked, that an actual fw-based
device may take longer to initialize than a pure sw backend implementation.
In a second example, a pre-boot device can take an even longer initialization
time.
An SR-IOV VF device may initialize a lot faster.
Instead of the driver having such transport- and device-specific checks (or
some very short or very long timeout), we propose to let the device state such
a timeout value.

Parav, I think you are conflating reset with initialization time.
Initialization is just for host boot, which takes seconds
anyway, but no, minutes is not reasonable there, either.
Reset affects guest boot. This needs to complete in milliseconds.

I cannot promise, but with newer-generation devices functionality usually
improves.
Enforcing milliseconds doesn't look practical for these types of devices.
Some block devices may need to establish TCP connections in the backend.
It is more useful to wait a few more seconds for the device to initialize after
powering on the system than to give up booting the server completely.
For example, an NVMe block device starts with a minimum timeout of 500 msec.
Yes, I agree with your point that a device given to a guest VM will likely have
a very short reset time that should complete in milliseconds.
This conflation is IMHO one of the problems with this proposal.
Device initialization consists of device reset, per spec section 3.1.1.
It does. But maybe we need to create a way for the driver to
distinguish between the two. When under reset, use a driver-supplied
timeout.
This makes sense because, as we discussed, when a device undergoes a reset
with active DMA, the driver still cannot clean up after the timeout expires.
So this can be a short, driver-decided value, as a longer timeout is not useful.

When powering up, use a longer device supplied one.
In v0 and v1 I initially considered only the power-up case of device
initialization. There was text around that.
In v2 I removed the initialization text, and I totally missed the
above case with active DMA.
This should work.
We should word this part of the spec accordingly.
Are the changes below good for v3?
1. The driver should use the device reset timeout during the initialization stage.
How does driver identify this though?
Existence of device_reset_timeout field in struct virtio_pci_common_cfg indicates that this field exists.
If the device supports it, it will place a non-zero value there, and the driver knows that this field should be used.
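
The split discussed above (a short driver-chosen timeout for a runtime reset, a longer device-supplied one at power-up, with zero meaning "not supported") could be sketched roughly as follows. This is only an illustration: the struct is a stand-in and not the real virtio_pci_common_cfg layout, the millisecond unit, the 100 ms driver default, and all function and enum names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: fallback chosen by the driver when the device gives no value. */
#define DRIVER_DEFAULT_RESET_TIMEOUT_MS 100u

/* Stand-in for the two fields under discussion; NOT the real register layout. */
struct common_cfg_sketch {
    uint8_t  device_status;        /* reads back 0 once reset has completed */
    uint16_t device_reset_timeout; /* proposed field; 0 = not supported.
                                      Unit assumed to be milliseconds. */
};

/* Hypothetical distinction between the two cases raised in the thread. */
enum reset_phase { PHASE_POWER_UP, PHASE_RUNTIME_RESET };

/* Use the device-supplied value only at power-up and only if it is non-zero;
 * otherwise fall back to the short driver-decided value. */
uint32_t pick_reset_timeout_ms(const struct common_cfg_sketch *cfg,
                               enum reset_phase phase)
{
    if (phase == PHASE_POWER_UP && cfg->device_reset_timeout != 0)
        return cfg->device_reset_timeout;
    return DRIVER_DEFAULT_RESET_TIMEOUT_MS;
}

/* Poll the status byte roughly once per millisecond instead of looping
 * forever. delay_1ms is a caller-provided sleep hook and may be NULL.
 * Returns true if the status read back as 0 before the timeout expired. */
bool wait_for_reset(const volatile uint8_t *status, uint32_t timeout_ms,
                    void (*delay_1ms)(void))
{
    for (uint32_t elapsed = 0; ; elapsed++) {
        if (*status == 0)
            return true;
        if (elapsed >= timeout_ms)
            return false;
        if (delay_1ms)
            delay_1ms();
    }
}
```

A driver following this sketch would call pick_reset_timeout_ms() once per reset and pass the result to wait_for_reset(); what it does when the wait fails is exactly the open question in this thread and is not shown here.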

2. remove feature bit as feature bits are only readable after reset is
completed
3. device reset timeout field of zero indicates that device doesn't
support it.

I'm not sure about 3. I think each transport will need its own way to do it.

For PCI, a value of zero indicates it isn't supported.
For MMIO, DeviceResetTimeout at offset 0x04c indicates the same.
Currently only these 2 transports have a use for it.

So I propose: maybe a capability like this, with a timeout field?
Do you mean a new capability, say VIRTIO_PCI_DEVICE_TIMEOUT, similar to VIRTIO_PCI_CAP_COMMON_CFG?
Will this contain one or more timeouts? For example, with this proposal it contains only the device reset timeout.
Will the same capability later be extended to contain a command timeout too?

And within VMs, we can just do without, since it got out of reset once it will
surely get out of reset again...
Yes, VM might not need it. It is really the HV's choice to implement and not part of the virtio spec.


Well, this will break the migration between HW virtio and SW virtio.

Thanks


Our internal cloud passes a PF through to the VM.
It is probably better to let the HV choose whether they want to do ctrl+c or have a timeout.
