[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
On 2020/12/29 äå9:20, Halil Pasic wrote:
On Mon, 28 Dec 2020 14:47:21 +0800 Jason Wang <jasowang@redhat.com> wrote: [..]Right. The reason why we have to wait for the completion of the channel program is exactly that. Furthermore if we have to fail setting FEATURES_OK we fail the channel program, that was supposed to do the operation with an unit check. So at the point where the set_status() is done, we already know if it worked or not.The only difference is, for FEATURES_OK, driver know it won't work any more. But for DEVICE_STOPPED, driver know it might work sometime in the future.If the channel program fails with a command reject, that that ccw is not allowed to cause further side effects. I.e. we can not fail the channel program, but then make the operation succeed later.Of course the device can change it's status any time. Since we have a synchronous API setting DEVICE_STOPPED would also have to block until all in-flight requests are completed. But this does not see to be what you want. I understood that virtio_add_status(dev, DEVICE_STOPPED) would return immediately, and so would a subsequent config->get_status(dev) without the DEVICE_STOPPED bit set, if there are still outstanding requests.Yes.I don't think defining virtio_add_status(dev, FEATURES_OK) (followed by a readback) as synchronous and virtio_add_status(dev, DEVICE_STOPPED) (followed by a readback) as asynchronous is a good idea.The main reason is that DEVICE_STOPPED is a one of the possible state. We'd better do this by extending the current device status state machine. Otherwise it would be more complicated (e.g using another interface). It looks to me what you don't like is the driver API design? If yes, how about introduce something like virtio_try_add_status()? The difference is that it doesn't require a immediate result.Yes a new (Linux) driver API operation sounds good -- at least on the surface. I would rather call it virtio_add_status_async() than virtio_try_add_status() because for me 'try' tells that something may or may not work, but the outcome is decided when the function returns. E.g. try_lock(l) may or may not succeed the lock l, but when it returns you know if it did take l or not.
I agree.
But how would you implement that new operation? I guess it would still have to boil down to config->set_status(dev), and to set_status(s), being asynchronous when s & DEVICE_STOPPED and synchronous otherwise (and in particular FEATURES_OK is added to the status).
Or it could be done via a dedicated config ops like how vp_reset() did.
Ultimately a set_status(b) boils down to a iowrite8(b) for the PCI transport, and to a channel program that references a single byte buffer with the value of b for Channel IO transport. Making that synchroneaous or asynchronous depending on the value of b does not sound right.
Just to make sure we're in the same page. I meant e.g in CCW_CMD_WRITE_STATUS, qemu just propagate the request of device stop to device and return success. And it's the device that decide when it can set the DEVICE_STOPPED bit and then the next CCW_CMD_READ_STATUS could read the bit.
Maybe Conny can chime in on this later as well. [..]AFAIK x86 uses normal memory access instructions. I suppose when a store instruction does not fail, then the store must happen. Or is that only true for RAM and for PCI memory not?AFAIK, PCI have non-posted write which can fail but the spec doesn't requires that.So for a posted write a PCI device can just silently discard writes it does not 'like' (i.e. pretend there was no such transaction in the first place)?
I might be wrong, but it's my understanding as well.
[..]So did PCI: ÂÂÂ case VIRTIO_PCI_STATUS: ÂÂÂÂÂÂÂ if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) { ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_stop_ioeventfd(proxy); ÂÂÂÂÂÂÂ } ÂÂÂÂÂÂÂ virtio_set_status(vdev, val & 0xFF); ÂÂÂÂÂÂÂ if (val & VIRTIO_CONFIG_S_DRIVER_OK) { ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_start_ioeventfd(proxy); ÂÂÂÂÂÂÂ } ÂÂÂÂÂÂÂ if (vdev->status == 0) { ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_reset(DEVICE(proxy)); ÂÂÂÂÂÂÂ } As you said, it's a implementation property since qemu implements a synchronous virtio_set_status(). But it might not be true for real hardware which may require more time or even hard to drain a queue in short time. That's why PCI spec mandate the driver to poll for the status after writing zero to device status.That is a PCI specific thing, and unfortunately my PCI skills are very limited. Please compare static void vp_reset(struct virtio_device *vdev) { vp_iowrite8(0, &vp_dev->common->device_status); [..] while (vp_ioread8(&vp_dev->common->device_status)) msleep(1); With int virtio_finalize_features(struct virtio_device *dev) { [..] virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); status = dev->config->get_status(dev); if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { The latter clearly assumes that the first get_status() will yield the final outcome, where the former clearly assumes that the effect of set_status() may be delayed beyond the first subsequent get_status().
I think the reason is that. For PCI, the spec forces the device to either set or clear the FEATURE_OK during write. But for reset, device may choose to delay setting status to 0 after the write.
Btw, if the device were to to fail resetting itself, what would you expect to happen? I would expect the device to present a status DEVICE_NEEDS_RESET, but that would effectively make vp_reset() getting stuck in that loop.
Yes, then it looks like a recursion here.
Or does the spec guarantee that a reset must work eventually?
My understanding is that the spec implies it must succeed eventually.
[..]PCI can use new filed. But I wonder how can that help for CCW.Using a new mechanism to request the device to stop would help with the synchronous vs asynchronous stuff. I.e. we could keep the FEATURES_OK synchronous, and make REQUEST_STOP asynchronous.Right, but as said before, it looks more like a part of device status state machine, using new mechanism might bring extra complexity and confusion. Looking at CCW implementation of device status, I don't see any blocker for implementing it synchronously if we just wait for the stop of the device and then return?I don't see a problem on the qemu side, but I do see a problem on the guest side. There is a reason why you want set DEVICE_STOPPED to be asynchronous.
So my understanding is that. at least for current kernel virtio drivers, it doesn't need DEVICE_STOPPED since there's no actual use cases. So no need for inventing new helpers.
It would be only need for the vDPA drivers that is used by userspace via vhost-vDPA. In this case, we just provide the mechanism to write and read status, vDPA driver don't need to enforce any policy in the middle (e.g polling or not).
Thanks
Regards, Halil [..]
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]