virtio-dev message



Subject: Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits


On Fri, Jun 15, 2018 at 04:21:32PM +0200, Halil Pasic wrote:
> 
> 
> On 06/15/2018 03:39 PM, Tiwei Bie wrote:
> > On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> > > On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > > > 
> > > > > 
> > > > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > > > ---
> > > > > > v2:
> > > > > > - Refine the wording (Cornelia);
> > > > > > 
> > > > > > v3:
> > > > > > - Refine the wording (MST);
> > > > > > 
> > > > > >     content.tex | 7 +++++++
> > > > > >     1 file changed, 7 insertions(+)
> > > > > > 
> > > > > > diff --git a/content.tex b/content.tex
> > > > > > index f996fad..3c7d67d 100644
> > > > > > --- a/content.tex
> > > > > > +++ b/content.tex
> > > > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > > > >     of features the driver accepts, otherwise it MUST fail to set the
> > > > > >     FEATURES_OK \field{device status} bit when the driver writes it.
> > > > > > +If a device has successfully negotiated a set of features
> > > > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > > > +status} bit during device initialization), then it SHOULD
> > > > > > +NOT fail re-negotiation of the same set of features after
> > > > > > +a device or system reset.  Failure to do so would interfere
> > > > > > +with resuming from suspend and error recovery.
> > > > > > +
> > > > > 
> > > > > 
> > > > > Sorry people but I don't get it. I mean it is kind of reasonable
> > > > > to assume that with a given device and a given driver (given, i.e.
> > > > > nothing changes) the two will always negotiate the same features
> > > > > (including the extremal case where the negotiation fails).
> > > > > 
> > > > > Either the device or a driver rolling a dice to make feature negotiation
> > > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > > bothering to soft prohibit here.
> > > > > 
> > > > > So the interesting scenario seems to be when stuff changes. When
> > > > > migrating the implementation of the device could change. Or something
> > > > > changes regarding the resources used to provide the virtual device.
> > > > > 
> > > > > But then, if the device really can not support the set of features
> > > > > > it used to be able to, I guess the SHOULD does not take effect (I guess
> > > > > that is the difference compared to MUST).
> > > > > 
> > > > > Bottom line is: I tried to figure out what is this about, but I failed.
> > > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
> > > > > it did not click. I would appreciate some assistance.
> > > > 
> > > > It's exactly what it says. Let's say you negotiated a feature and then
> > > > device sets DEVICE_NEEDS_RESET.  Driver must now reset the device and put it
> > > > back in the same state it had before the reset, then resubmit
> > > > requests that were available but never used.
> > > > 
> > > > What if any of the features changed? Device suddenly
> > > > needs to check for requests which do not match the
> > > > features.
> > > > 
> > > > Suspend is similar: guests tend to assume hardware
> > > > does not change across suspend/resume, any changes
> > > > tend to make resume fail.
> > > > 
> > > 
> > > Thank you very much! But it still does not answer why a device would
> > > want to do that (fail to negotiate a feature that it was able
> > > to negotiate before). So I'm still in the dark about what we are
> > > trading for what.
> > 
> > Hi Halil,
> > 
> > Just as you said, normally there is no reason
> > for a device to fail to negotiate a feature that it
> > was able to negotiate before. But the spec doesn't
> > forbid devices from doing this, i.e. the spec allows a
> > device to fail to negotiate a feature that it was
> > able to negotiate before, which could cause problems
> > in some cases. Although everything works fine in
> > practice because no device would really do
> > this, it would be better for the spec to explicitly
> > forbid devices from doing this in the necessary cases.
> > 
> > Best regards,
> > Tiwei Bie
> > 
> 
> I think we have most of it already covered with 'The device SHOULD
> accept any valid subset of features the driver accepts'.
> 
> IMHO what we add with your proposed normative statement is that
> if the device used to offer a feature bit it SHOULD keep offering it.
> That's clearly not covered by what I've cited.
> 
> But it's kind of covered by a non-normative statement 'Each virtio
> device offers all the features it understands.'

Well one has to squint very hard to understand it.
And note that "understands" is not the same as "supports". Device can
still fail to set FEATURES_OK.


> This seems most relevant in case of migration. That is device
> implementation S(ource) and device implementation T(arget) are
> migration compatible. But hey, features that are present
> in S and not present in T are of concern for migration compatibility. AFAIK
> the VIRTIO specification does not make claims about migration
> compatibility.
> 
> So if I think QEMU, and somebody (maintainer) is deciding to remove support
> for a certain feature bit of a certain device in the next version,
> he better think hard about how this could break migration. I don't think
> the proposed normative statement with its SHOULD would make the
> guy more careful.
> 
> What is even more interesting is the scenario where the new version of
> the device does not remove support for a feature, but adds support for
> one, let's call it F_N.
> 
> The scenario is the following: we have systems O(ld) and N(ew). We
> start on O, then we migrate to N. There, some reset of concern happens.
> Features get re-negotiated and we start exploiting F_N. In my reading
> of your addition, this is legit. But then we migrate back from N to O.
> No re-negotiation happens (because it is not obligatory), and things
> explode (hopefully, just migration fails, and not guest dies) because
> O does not have support for F_N. Your normative statement was nowhere
> violated as far as I can tell.

Oops I shouldn't even have started about migration.  Let's forget
migration. It's a simple question on what we can assume after we reset
device.

Some people want to be able to change features dynamically.
Is that OK? This text clarifies that no, it isn't.

> Bottom line is, I still don't know what benefit this addition
> to the standard has for the implementer of the standard.

A question was asked. On suspend we save features and try to
restore them. Should driver handle device not offering some of these
features after resume? What this offers is a simple answer: don't
worry about it too much, devices have been warned that it's not a
good idea.
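
To make the suspend/resume case concrete, here is a minimal sketch in C of a
driver restoring its saved features after a reset. The status bit values are
the ones the spec defines for the device status field; everything named toy_*
is hypothetical and exists only for illustration, it is not code from any real
driver or device. The sketch assumes a device that clears FEATURES_OK when the
driver asks for a feature it does not currently offer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio spec. */
#define VIRTIO_STATUS_ACKNOWLEDGE 1u
#define VIRTIO_STATUS_DRIVER      2u
#define VIRTIO_STATUS_FEATURES_OK 8u

/* Hypothetical toy device model, for illustration only. */
struct toy_device {
    uint64_t offered;         /* feature bits the device currently offers */
    uint64_t driver_features; /* feature bits written by the driver */
    uint8_t  status;          /* device status field */
};

/* A reset loses the negotiated state but not what the device offers. */
static void toy_reset(struct toy_device *d)
{
    d->driver_features = 0;
    d->status = 0;
}

/*
 * Device side of a status write: refuse FEATURES_OK if the driver
 * accepted a feature that is not offered.
 */
static void toy_write_status(struct toy_device *d, uint8_t status)
{
    bool wants_ok = (status & VIRTIO_STATUS_FEATURES_OK) != 0;
    bool subset = (d->driver_features & ~d->offered) == 0;

    if (wants_ok && !subset)
        status &= (uint8_t)~VIRTIO_STATUS_FEATURES_OK;
    d->status = status;
}

/*
 * Driver side: reset, write the wanted features, set FEATURES_OK and
 * re-read the status to see whether the device accepted them.
 */
static bool toy_negotiate(struct toy_device *d, uint64_t wanted)
{
    toy_reset(d);
    toy_write_status(d, VIRTIO_STATUS_ACKNOWLEDGE | VIRTIO_STATUS_DRIVER);
    d->driver_features = wanted;
    toy_write_status(d, (uint8_t)(d->status | VIRTIO_STATUS_FEATURES_OK));
    return (d->status & VIRTIO_STATUS_FEATURES_OK) != 0;
}
```

With offered = 0x7, calling toy_negotiate(&d, 0x5) succeeds both before and
after a reset, which is the behaviour the proposed text asks devices to keep.
If the device changed offered to 0x3 in between, the same call would fail and
the driver would have nothing sensible to do on resume.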



> In my opinion
> it's just another chunk of text that is hard to figure out. It's hard
> to tell what is the device

Most people know this I think

> and what is before

Sorry before what?


>, what is system reset.

I think many people do know what a system reset is.
It's an attempt to cover suspend to disk. How would you put it?


> If
> we were to make the spec complete with spelling out every 'don't make
> anything stupid' I'm under the impression there is a lot of work to
> do. I had a discussion here on the completeness of this spec, and
> completeness does not seem to be a primary goal. I'm still not
> sold on this one.
> 
> Regards,
> Halil

Yea, it's just that it's not clear that changing feature
bits when device is reset is all that stupid, since it
does after all lose its state.



> > > 
> > > Is there somewhere a patch that fixes such a bug? Maybe that would
> > > help me understand what can be done at the device to avoid the
> > > problem.
> > > 
> > > Regards,
> > > Halil
> > > 
> > > 
> > > > > 
> > > > > >     \subsection{Legacy Interface: A Note on Feature
> > > > > >     Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > > > > >     Bits / Legacy Interface: A Note on Feature Bits}
> > > > > > 
> > > > 
> > > > 
> > > 
> > 

