virtio-comment message

Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP

From: Halil Pasic <pasic@linux.ibm.com>
To: Jason Wang <jasowang@redhat.com>
Date: Tue, 29 Dec 2020 14:20:58 +0100

On Mon, 28 Dec 2020 14:47:21 +0800
Jason Wang <jasowang@redhat.com> wrote:

> 
[..]
> > Right. The reason why we have to wait for the completion of the channel
> > program is exactly that. Furthermore if we have to fail setting
> > FEATURES_OK we fail the channel program, that was supposed to do the
> > operation with an unit check. So at the point where the set_status() is
> > done, we already know if it worked or not.
> 
> 
> The only difference is, for FEATURES_OK, driver know it won't work any 
> more. But for DEVICE_STOPPED, driver know it might work sometime in the 
> future.
> 

If the channel program fails with a command reject, that that ccw is not
allowed to cause further side effects. I.e. we can not fail the channel
program, but then make the operation succeed later.

> 
> > Of course the device can
> > change it's status any time.
> >
> > Since we have a synchronous API setting DEVICE_STOPPED would also have to
> > block until all in-flight requests are completed. But this does not see
> > to be what you want. I understood that
> > virtio_add_status(dev, DEVICE_STOPPED) would return immediately, and so would
> > a subsequent config->get_status(dev) without the DEVICE_STOPPED bit set,
> > if there are still outstanding requests.
> 
> 
> Yes.
> 
> 
> >
> > I don't think defining virtio_add_status(dev, FEATURES_OK) (followed by
> > a readback) as synchronous and virtio_add_status(dev, DEVICE_STOPPED)
> > (followed by a readback) as asynchronous is a good idea.
> 
> 
> The main reason is that DEVICE_STOPPED is a one of the possible state. 
> We'd better do this by extending the current device status state 
> machine. Otherwise it would be more complicated (e.g using another 
> interface).
> 
> It looks to me what you don't like is the driver API design? If yes, how 
> about introduce something like virtio_try_add_status()? The difference 
> is that it doesn'tÂ require a immediate result.
>

Yes a new (Linux) driver API operation sounds good -- at least on the
surface. I would rather call it virtio_add_status_async() than
virtio_try_add_status() because for me 'try' tells that something may
or may not work, but the outcome is decided when the function
returns. E.g. try_lock(l) may or may not succeed the lock l, but when it
returns you know if it did take l or not.

But how would you implement that new operation? I guess it would still
have to boil down to config->set_status(dev), and to set_status(s),
being asynchronous when s & DEVICE_STOPPED and synchronous otherwise (and
in particular FEATURES_OK is added to the status).

Ultimately a set_status(b) boils down to a iowrite8(b) for the PCI
transport, and to a channel program that references a single byte buffer
with the value of b for Channel IO transport. Making that synchroneaous
or asynchronous depending on the value of b does not sound right.

Maybe Conny can chime in on this later as well.

[..]
> >   AFAIK x86 uses normal
> > memory access instructions. I suppose when a store instruction does not
> > fail, then the store must happen. Or is that only true for RAM and for
> > PCI memory not?
> 
> 
> AFAIK, PCI have non-posted write which can fail but the spec doesn't 
> requires that.
> 

So for a posted write a PCI device can just silently discard writes it
does not 'like' (i.e. pretend there was no such transaction in the first
place)?

[..]
> 
> So did PCI:
> 
>  ÂÂÂ case VIRTIO_PCI_STATUS:
>  ÂÂÂÂÂÂÂ if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
>  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_stop_ioeventfd(proxy);
>  ÂÂÂÂÂÂÂ }
> 
>  ÂÂÂÂÂÂÂ virtio_set_status(vdev, val & 0xFF);
> 
>  ÂÂÂÂÂÂÂ if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
>  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_start_ioeventfd(proxy);
>  ÂÂÂÂÂÂÂ }
> 
>  ÂÂÂÂÂÂÂ if (vdev->status == 0) {
>  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_reset(DEVICE(proxy));
>  ÂÂÂÂÂÂÂ }
> 
> As you said, it's a implementation property since qemu implements a 
> synchronous virtio_set_status(). But it might not be true for real 
> hardware which may require more time or even hard to drain a queue in 
> short time. That's why PCI spec mandate the driver to poll for the 
> status after writing zero to device status.
> 

That is a PCI specific thing, and unfortunately my PCI skills are very
limited.

Please compare 

static void vp_reset(struct virtio_device *vdev)                                
{                                                                               
	vp_iowrite8(0, &vp_dev->common->device_status);
[..]
        while (vp_ioread8(&vp_dev->common->device_status))                      
                msleep(1);

With 
int virtio_finalize_features(struct virtio_device *dev)                         
{                                                                               
[..]
        virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);                    
        status = dev->config->get_status(dev);                                  
        if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {  

The latter clearly assumes that the first get_status() will
yield the final outcome, where the former clearly assumes that
the effect of set_status() may be delayed beyond the first
subsequent get_status().

Btw, if the device were to to fail resetting itself, what would
you expect to happen? I would expect the device to present a
status DEVICE_NEEDS_RESET, but that would effectively make
vp_reset() getting stuck in that loop. Or does the spec guarantee
that a reset must work eventually?

[..]

> >> PCI can use new filed. But I wonder how can that help for CCW.
> > Using a new mechanism to request the device to stop would help with
> > the synchronous vs asynchronous stuff. I.e. we could keep the FEATURES_OK
> > synchronous, and make REQUEST_STOP asynchronous.
> 
> 
> Right, but as said before, it looks more like a part of device status 
> state machine, using new mechanism might bring extra complexity and 
> confusion.
> 
> Looking at CCW implementation of device status, I don't see any blocker 
> for implementing it synchronously if we just wait for the stop of the 
> device and then return?
> 

I don't see a problem on the qemu side, but I do see a problem on
the guest side. There is a reason why you want set DEVICE_STOPPED to
be asynchronous.

Regards,
Halil

[..]

Follow-Ups:
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>

References:
- [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Halil Pasic <pasic@linux.ibm.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Halil Pasic <pasic@linux.ibm.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Cornelia Huck <cohuck@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Cornelia Huck <cohuck@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Halil Pasic <pasic@linux.ibm.com>
- Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
  - From: Jason Wang <jasowang@redhat.com>