virtio-dev message

Subject: RE: [PATCH v2] Add device reset timeout field


Hi Michael, Cornelia,	

> From: Parav Pandit
> Sent: Tuesday, October 12, 2021 2:42 PM
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, October 12, 2021 2:32 PM
> >
> > On Tue, Oct 12, 2021 at 08:51:34AM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, October 11, 2021 9:30 PM
> > > >
> > > > On Mon, Oct 11, 2021 at 03:44:14PM +0000, Parav Pandit wrote:
> > > > > > > > This is unlikely to work the reset is completed. Because a
> > > > > > > > real device implementing this would prefer to do this in
> > > > > > > > fw for 1000 virtio devices sitting on the physical card.
> > > > > > > > And it is very much driven by such implementation at
> > > > > > > > device devel.
> > > > > > > > So it cannot update the counter value if reset is not
> > > > > > > > completed for the device.
> > > > > > > >
> > > > > > > > I think read only device reset timeout is most elegant
> > > > > > > > option during device initialization phase that eliminates
> > > > > > > > infinite loop of today.
> > > > > >
> > > > > > Why can't a driver just go ahead and do a timeout regardless?
> > > > > > OK, let's consider this thought exercise. What timeout value
> > > > > > will the driver choose if the device doesn't specify one?
> > > > > > I explained in the previous thread, and you acked, that an
> > > > > > actual fw based device may take longer to initialize than a
> > > > > > pure sw implementation backend.
> > > > > > In a second example, a pre-boot device can take an even
> > > > > > longer initialization time.
> > > > > > An SR-IOV VF device may initialize a lot faster.
> > > > > > Instead of the driver having such transport and device
> > > > > > specific checks (or some very short or very long timeout),
> > > > > > we propose to let the device report such a timeout value.
> > > >
> > > > Parav I think you are conflating reset with initialization time.
> > > > initialization is just for host boot which takes seconds anyway -
> > > > > but no, minutes is not reasonable there, either.
> > > > reset affects guest boot. This needs to complete in milliseconds.
> > > >
> > > > I cannot promise, but with newer generation devices usually
> > > > functionality improves.
> > > > Enforcing in milliseconds doesn't look practical for type of devices.
> > > > Some of the block devices may need to establish TCP connections in
> > > > the backend.
> > > > It is more useful to wait for few more seconds to initialize device
> > > > after power on the system, instead of giving up booting the server
> > > > completely.
> > > > For example, a nvme block device starts with a minimum timeout of
> > > > 500msec.
> > > >
> > > > Yes, I agree to your point that a device given to a guest VM will
> > > > likely have very short reset time that should complete in
> > > > milliseconds.
> > >
> > > > This conflation is IMHO one of the problems with this proposal.
> > >
> > > > Device initialization includes a device reset, per spec section 3.1.1.
> >
> > It does. But maybe we need to create a way for driver to distinguish
> > between the two. When under reset, use a driver supplied timeout.
> This makes sense because, as we discussed, when a device undergoes a reset
> with active DMA, the driver still cannot clean up after the timeout expires.
> So this can be a short, driver-decided value, as a longer timeout is not useful.
> 
> > When powering up, use a longer device supplied one.
> In v0 and v1 I initially considered only the power-up case of device
> initialization. There was text around that.
> In v2 I removed the initialization text, and I totally missed the above case
> with active DMA.
> This should work.
> We should word this part of the spec accordingly.

Are the changes below good for v3?
1. The driver should use the device reset timeout during the initialization stage.
2. Remove the feature bit, as feature bits are only readable after reset is completed.
3. A device reset timeout field of zero indicates that the device doesn't support it.
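
To make the three points above concrete, here is a rough driver-side sketch in C. It is only an illustration under assumptions, not text from the patch: all names (pick_reset_timeout_ms, wait_for_reset, the fake_dev helpers), the millisecond unit, and both constant values are hypothetical. In particular, what the driver does when the field reads zero is not settled in this thread; the sketch simply assumes a driver-chosen fallback. It relies on the existing behaviour that the driver waits for the device status to read back as 0 after a reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; neither number comes from the spec or the patch. */
#define DRIVER_RESET_TIMEOUT_MS 100u   /* short driver-chosen runtime value */
#define DRIVER_INIT_FALLBACK_MS 1000u  /* used when the device reports 0 */

enum reset_phase {
    RESET_PHASE_INIT,    /* reset done as part of device initialization */
    RESET_PHASE_RUNTIME  /* reset of a running device (e.g. active DMA) */
};

/* Pick the timeout: device-supplied during initialization,
 * driver-supplied otherwise. A device value of 0 means "not supported". */
uint32_t pick_reset_timeout_ms(enum reset_phase phase,
                               uint32_t device_timeout_ms)
{
    if (phase == RESET_PHASE_RUNTIME)
        return DRIVER_RESET_TIMEOUT_MS;
    if (device_timeout_ms == 0)
        return DRIVER_INIT_FALLBACK_MS;  /* assumption: driver fallback */
    return device_timeout_ms;
}

/* Transport access is abstracted behind callbacks so the sketch stays
 * independent of PCI/MMIO details. */
typedef uint8_t (*read_status_fn)(void *ctx);
typedef void (*sleep_ms_fn)(void *ctx, uint32_t ms);

/* Poll the device status until it reads 0 (reset complete) or the
 * timeout expires. Returns true if the reset completed in time. */
bool wait_for_reset(read_status_fn read_status, sleep_ms_fn sleep_ms,
                    void *ctx, uint32_t timeout_ms)
{
    uint32_t waited_ms = 0;

    for (;;) {
        if (read_status(ctx) == 0)
            return true;
        if (waited_ms >= timeout_ms)
            return false;  /* bounded wait instead of an infinite loop */
        sleep_ms(ctx, 1);
        waited_ms += 1;
    }
}

/* Tiny simulated device for trying the loop out: the status reads
 * non-zero until ready_after_ms of simulated time has passed. */
struct fake_dev {
    uint32_t ready_after_ms;
    uint32_t elapsed_ms;
};

uint8_t fake_read_status(void *ctx)
{
    struct fake_dev *dev = ctx;
    return dev->elapsed_ms >= dev->ready_after_ms ? 0 : 0x40;
}

void fake_sleep_ms(void *ctx, uint32_t ms)
{
    struct fake_dev *dev = ctx;
    dev->elapsed_ms += ms;  /* advance simulated time only */
}
```

With this split, a reset during initialization waits for as long as the device asks, while a runtime reset gives up quickly, since a longer wait does not help the driver clean up anyway.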

