virtio-dev message



Subject: Re: [PATCH v2] Add device reset timeout field


On Thu, Oct 14, 2021 at 05:35:37PM +0000, Parav Pandit wrote:
> Hi Michael, Cornelia,	
> 
> > From: Parav Pandit
> > Sent: Tuesday, October 12, 2021 2:42 PM
> > 
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, October 12, 2021 2:32 PM
> > >
> > > On Tue, Oct 12, 2021 at 08:51:34AM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Monday, October 11, 2021 9:30 PM
> > > > >
> > > > > On Mon, Oct 11, 2021 at 03:44:14PM +0000, Parav Pandit wrote:
> > > > > > > > > This is unlikely to work the reset is completed. Because a real device implementing this would prefer to do this in fw for 1000 virtio devices sitting on the physical card.
> > > > > > > > > And it is very much driven by such implementation at device devel.
> > > > > > > > > So it cannot update the counter value if reset is not completed for the device.
> > > > > > > > >
> > > > > > > > > I think read only device reset timeout is most elegant option during device initialization phase that eliminates infinite loop of today.
> > > > > > >
> > > > > > > Why can't a driver just go ahead and do a timeout regardless?
> > > > > > > OK, let's consider this thought exercise. What timeout value will the driver choose if the device doesn't specify one?
> > > > > > > I explained in the previous thread, and you acked, that an actual fw based device may take longer to initialize than a pure sw implementation backend.
> > > > > > > In a second example, a pre-boot device can take an even longer initialization time.
> > > > > > > An SR-IOV VF device may initialize a lot faster.
> > > > > > > Instead of the driver having such transport- and device-specific checks (or some very short or very long timeout), we propose to let the device specify such a timeout value.
> > > > >
> > > > > Parav I think you are conflating reset with initialization time.
> > > > > initialization is just for host boot which takes seconds anyway -
> > > > > > but no, minutes is not reasonable there, either.
> > > > > reset affects guest boot. This needs to complete in milliseconds.
> > > > >
> > > > I cannot promise, but with newer generation devices usually functionality improves.
> > > > Enforcing in milliseconds doesn't look practical for type of devices.
> > > > Some of the block devices may need to establish TCP connections in the backend.
> > > > It is more useful to wait for few more seconds to initialize device after power on the system, instead of giving up booting the server completely.
> > > > For example, a nvme block device starts with a minimum timeout of 500msec.
> > > >
> > > > Yes, I agree to your point that a device given to a guest VM will likely have very short reset time that should complete in milliseconds.
> > > >
> > > > > This conflation is IMHO one of the problems with this proposal.
> > > >
> > > > Device initialization includes a device reset, per spec section 3.1.1.
> > >
> > > It does. But maybe we need to create a way for driver to distinguish
> > > between the two. When under reset, use a driver supplied timeout.
> > This makes sense because, as we discussed, when a device undergoes a reset with
> > active DMA, the driver still cannot clean up after the timeout expires.
> > So this can be a short, driver-decided value, as a longer timeout is not useful.
> > 
> > > When powering up, use a longer device supplied one.
> > In v0 and v1 I initially considered only the power-up case of device
> > initialization. There was text around that.
> > In v2 I removed the initialization text, and I totally missed the above case with
> > active DMA.
> > This should work.
> > We should word this part of the spec accordingly.
> 
> Are the changes below good for v3?
> 1. The driver should use the device reset timeout during the initialization stage

How does driver identify this though?

> 2. Remove the feature bit, as feature bits are only readable after reset is completed
> 3. A device reset timeout field of zero indicates that the device doesn't support it.

I'm not sure about 3. I think each transport will need its own way to do it.

So I propose: maybe a capability like this, with a timeout field?
And within VMs, we can just do without, since it got out of reset once
it will surely get out of reset again...
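
To make the split discussed above concrete, here is a minimal sketch in C of a driver that polls for reset completion with a bounded wait: a device-supplied timeout at power-up and a short driver-chosen one for a runtime reset. Everything in it is an assumption made for illustration: the `fake_dev` model, the `reset_phase` flag (how a driver would actually tell the two cases apart is the open question above), the 100 ms driver default and the 10 ms poll interval. It is not proposed spec text or an existing driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model: status reads back 0 once reset has completed. */
struct fake_dev {
    uint8_t  status;            /* device status register */
    uint32_t polls_until_ready; /* simulated reset latency, in poll intervals */
    uint32_t reset_timeout_ms;  /* device-supplied timeout; 0 = not provided */
};

/* Assumed distinction between the two cases discussed in the thread. */
enum reset_phase { PHASE_POWER_UP, PHASE_RUNTIME_RESET };

#define DRIVER_RESET_TIMEOUT_MS 100u /* assumed short driver-chosen value */
#define POLL_INTERVAL_MS         10u /* assumed polling granularity */

/* Simulated status read: stays non-zero until the modeled latency elapses. */
static uint8_t fake_read_status(struct fake_dev *dev)
{
    if (dev->polls_until_ready > 0) {
        dev->polls_until_ready--;
        return dev->status;
    }
    dev->status = 0;
    return 0;
}

/* Power-up uses the device-supplied value when present; otherwise the
 * short driver default applies (also used for every runtime reset). */
static uint32_t pick_timeout_ms(const struct fake_dev *dev,
                                enum reset_phase phase)
{
    if (phase == PHASE_POWER_UP && dev->reset_timeout_ms != 0)
        return dev->reset_timeout_ms;
    return DRIVER_RESET_TIMEOUT_MS;
}

/* Returns true if the status read back 0 within the chosen budget,
 * false if the driver gave up instead of looping forever. */
static bool reset_with_timeout(struct fake_dev *dev, enum reset_phase phase)
{
    assert(dev);
    uint32_t budget_ms = pick_timeout_ms(dev, phase);
    uint32_t waited_ms = 0;

    for (;;) {
        if (fake_read_status(dev) == 0)
            return true;
        if (waited_ms >= budget_ms)
            return false;
        /* A real driver would sleep for POLL_INTERVAL_MS here. */
        waited_ms += POLL_INTERVAL_MS;
    }
}
```

In this model a device that needs 50 poll intervals (about 500 ms) and advertises a 1000 ms timeout completes a power-up reset, while the same latency exceeds the 100 ms driver budget for a runtime reset and the driver gives up rather than spinning.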

-- 
MST


