Subject: Re: [virtio] [PATCH v2] initialization: add extra device status handshake
On Wed, Sep 25, 2013 at 11:54:07AM +0930, Rusty Russell wrote: > "Michael S. Tsirkin" <email@example.com> writes: > > In the past we sometimes wanted an ability to make driver > > gracefully fail early in initialization if it does not > > recognize/support a feature. One such example we are introducing > > right now is the new VERSION_1 feature. > > > > Another example where the driver configuration > > is inconsistent with what the device can support. > > > > This modifies the initialization to allow > > graceful failure in such cases: > > 1. require a driver to first ack all features it recognizes > > (linux does this already) > > No it doesn't, but yes it should :) > > > 2. require driver to set status after configuration > > and re-check status afterwards > > I like this, but I prefer the simpler approach: I'll re-read carefully a bit later, overall this looks okay I think. > 1) Call the new status bit FEATURES_OK. This makes it fairly clear that > it's the end of feature negotiation. > > 2) The driver SHOULD re-read the status to check FEATURES_OK has > "stuck". > > How's this? > > diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt > index b4298bb..83042c4 100644 > --- a/virtio-v1.0-wd01-part1-specification.txt > +++ b/virtio-v1.0-wd01-part1-specification.txt > @@ -106,6 +106,9 @@ This field is 0 upon reset, otherwise at least one bit should be set: > may be a significant (or infinite) delay before setting this > bit. > > + FEATURES_OK (8) Indicates that the driver has acknowledged all the > + features it understands, and feature negotiation is complete. > + > DRIVER_OK (4) Indicates that the driver is set up and ready to > drive the device. > > @@ -444,23 +447,44 @@ how to communicate with the specific device. > > 3. The DRIVER status bit is set: we know how to drive the device. > > -4. Device-specific setup, including reading the device feature > - bits, discovery of virtqueues for the device, optional per-bus > - setup, and reading and possibly writing the device's virtio > - configuration space. > - > -5. The subset of device feature bits understood by the driver is > - written to the device. > - > -6. The DRIVER_OK status bit is set. > - > -7. The device can now be used (ie. buffers added to the > - virtqueues) > +4. Device feature bits are read, and the the subset of feature bits > + understood by the OS and driver is written to the device. > + > +5. The FEATURES_OK status bit is set. > + > +6. The status byte is re-read to ensure the FEATURES_OK bit is still > + set: otherwise, the device does not support our subset of features > + and the device is unusable. > + > +7. Device-specific setup, including discovery of virtqueues for the > + device, optional per-bus setup, reading and possibly writing the > + device's virtio configuration space, and population of virtqueues. > + The device is now "live". > + > +8. The DRIVER_OK status bit is set. This is simply informational, > + as the device may already be active at this point. So we have a separate issue I think: there's lots of device specific configuration that really would benefit from "OK now go" bit. For example, we register network device and we really don't want e.g. config interrupts before that is done. Also qemu relied on DRIVER_OK to kick off vhost net and vhost scsi. Finally when we say it's "simply informational" someone might think this means "not mandatory" :) Is it reasonable to change the "simply informational" part and require that devices not consume buffers and not send config interrupts before they get DRIVER_OK? > > If any of these steps go irrecoverably wrong, the guest should > set the FAILED status bit to indicate that it has given up on the > device (it can reset the device later to restart if desired). > > +Devices should support all valid combinations of features, but we know > +that implementations may well make assuptions that they will only be > +used by fully-optimized drivers. The resetting of the FEATURES_OK flag > +provides a semi-graceful failure mode for this case. > + > +184.108.40.206. Legacy Interface: Device Initialization > +--------------------------- > +Legacy devices do not support the FEATURES_OK status bit, and thus did > +not have a graceful way for the device to indicate unsupported feature > +combinations. It also did not provide a clear mechanism to end > +feature negotiation, which meant that devices finalized features on > +first-use, and no features could be introduced which radically changed > +the initial operation of the device. > + > +The result was the steps 5 and 6 were omitted, and steps 4 and 7 > +were conflated. > + > 2.2.2. Device Operation > ---------------------- > > @@ -2863,10 +2887,6 @@ the non-PCI implementations (currently lguest and S/390). > >  The actual value within this range is ignored > > - Historically, drivers have used the device before steps 5 and 6. > -This is only allowed if the driver does not use any features > -which would alter this early use of the device. > - >  ie. once you enable MSI-X on the device, the other fields move. > If you turn it off again, they move back! > > > Cheers, > Rusty.