virtio-dev message



Subject: RE: [PATCH v2] Add device reset timeout field



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, October 8, 2021 4:48 PM
> 
> On Fri, Oct 08, 2021 at 10:51:02AM +0000, Parav Pandit wrote:
> > > That's why I ask: why do we bother? What's wrong with just waiting
> > > forever or until user gets tired of this and cancels with CTRL-C?
> >
> > Today, device removal of the device gets stuck for the device which didn't
> > finish the reset, because its waiting for ever.
> >
> > modprobe to my knowledge cannot be Ctrl-C. In another scenario, device
> > probing of hot plug device occurs by hotplug driver in a workqueue context.
> 
> Frankly I'm not sure we need to worry about esoterica like hotplug working
> when device can't get out of reset. 

A Linux virtio driver cannot even pass a developer's basic test in which 30 devices are hotplugged and unplugged.
This is because the driver waits indefinitely for the device to finish the reset, and the device has no way to tell the driver that its reset timeout is, say, 1 sec.

> 
> So if it's device removal you are after to fix, then the proposed spec won't be
> enough I suspect, since there's no specific time when we can be sure DMA
> won't happen anymore. Just giving up on device isn't possible, if you do you risk
> corrupting guest memory which seems scarier than just blocking hotplug.
> 
You are right: if the device is running when the reset is initiated and the timeout expires, it is better to leave the device in the state it is in.

However, as explained in the other examples, when the device is undergoing its first reset during the spec-defined initialization sequence, there is no reason for the driver to wait indefinitely: DMA/interrupts have never been initiated and feature negotiation has not even finished.
Right?
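
To make the distinction concrete, here is a minimal sketch in plain C (not kernel code). Everything in it is hypothetical and only for illustration: the simulated device `sim_dev`, the `SIM_RESET_BUSY` readback value, the result codes, and the use of a poll budget in place of a real time-based timeout. Only the DRIVER_OK status bit value (4) comes from the spec. The idea is that the driver polls the status for a bounded number of reads; on timeout it gives up only if the device was never started, and otherwise keeps the device as it is because DMA may still be in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bit from the virtio spec: set once the driver has started the device. */
#define VIRTIO_STATUS_DRIVER_OK 0x04

/* Hypothetical nonzero value the simulated device reads back while reset is in progress. */
#define SIM_RESET_BUSY 0xFF

/* Hypothetical outcomes for this sketch. */
enum reset_result {
    RESET_DONE,                /* status read back as 0 within the budget */
    RESET_TIMEOUT_GIVE_UP,     /* timed out, device never started: safe to abandon it */
    RESET_TIMEOUT_KEEP_DEVICE  /* timed out, device was running: DMA may still happen */
};

/* Hypothetical simulated device: reset completes after this many status reads. */
struct sim_dev {
    unsigned polls_until_done;
};

/* Read the simulated status register; 0 means the reset has completed. */
uint8_t sim_read_status(struct sim_dev *d)
{
    if (d->polls_until_done > 0) {
        d->polls_until_done--;
        return SIM_RESET_BUSY;
    }
    return 0;
}

/*
 * Poll for reset completion with a bounded budget instead of forever.
 * old_status is the status the driver had written before requesting the reset;
 * max_polls stands in for a timeout (a real driver would sleep between polls).
 */
enum reset_result reset_with_timeout(struct sim_dev *d, uint8_t old_status,
                                     unsigned max_polls)
{
    assert(d != NULL);

    for (unsigned i = 0; i < max_polls; i++) {
        if (sim_read_status(d) == 0)
            return RESET_DONE;
    }

    /* Timed out: the safe action depends on whether the device was ever started. */
    if (old_status & VIRTIO_STATUS_DRIVER_OK)
        return RESET_TIMEOUT_KEEP_DEVICE;
    return RESET_TIMEOUT_GIVE_UP;
}
```

With this split, a device stuck in its first reset during probing lets the probe fail after the budget expires, while a stuck device that was already running is not abandoned.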

> I guess with this in mind, what would be needed is a more fine-grained
> approach. E.g. driver writes 0 to reset, device returns 1 to indicate reset in
> progress, at that point it can promise that DMA/interrupts won't happen any
> longer, so driver can go away.
> 
How does this help? Isn't this what happens today?
But it doesn't indicate that DMA/interrupts have stopped.
I must be missing something.
I thought the device would set DEVICE_NEEDS_RESET if it encountered an error, during reset or otherwise.

