OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP



On 2020/12/29 äå9:20, Halil Pasic wrote:
On Mon, 28 Dec 2020 14:47:21 +0800
Jason Wang <jasowang@redhat.com> wrote:

[..]
Right. The reason why we have to wait for the completion of the channel
program is exactly that. Furthermore if we have to fail setting
FEATURES_OK we fail the channel program, that was supposed to do the
operation with an unit check. So at the point where the set_status() is
done, we already know if it worked or not.

The only difference is, for FEATURES_OK, driver know it won't work any
more. But for DEVICE_STOPPED, driver know it might work sometime in the
future.

If the channel program fails with a command reject, that that ccw is not
allowed to cause further side effects. I.e. we can not fail the channel
program, but then make the operation succeed later.

Of course the device can
change it's status any time.

Since we have a synchronous API setting DEVICE_STOPPED would also have to
block until all in-flight requests are completed. But this does not see
to be what you want. I understood that
virtio_add_status(dev, DEVICE_STOPPED) would return immediately, and so would
a subsequent config->get_status(dev) without the DEVICE_STOPPED bit set,
if there are still outstanding requests.

Yes.


I don't think defining virtio_add_status(dev, FEATURES_OK) (followed by
a readback) as synchronous and virtio_add_status(dev, DEVICE_STOPPED)
(followed by a readback) as asynchronous is a good idea.

The main reason is that DEVICE_STOPPED is a one of the possible state.
We'd better do this by extending the current device status state
machine. Otherwise it would be more complicated (e.g using another
interface).

It looks to me what you don't like is the driver API design? If yes, how
about introduce something like virtio_try_add_status()? The difference
is that it doesn't require a immediate result.

Yes a new (Linux) driver API operation sounds good -- at least on the
surface. I would rather call it virtio_add_status_async() than
virtio_try_add_status() because for me 'try' tells that something may
or may not work, but the outcome is decided when the function
returns. E.g. try_lock(l) may or may not succeed the lock l, but when it
returns you know if it did take l or not.


I agree.



But how would you implement that new operation? I guess it would still
have to boil down to config->set_status(dev), and to set_status(s),
being asynchronous when s & DEVICE_STOPPED and synchronous otherwise (and
in particular FEATURES_OK is added to the status).


Or it could be done via a dedicated config ops like how vp_reset() did.



Ultimately a set_status(b) boils down to a iowrite8(b) for the PCI
transport, and to a channel program that references a single byte buffer
with the value of b for Channel IO transport. Making that synchroneaous
or asynchronous depending on the value of b does not sound right.


Just to make sure we're in the same page. I meant e.g in CCW_CMD_WRITE_STATUS, qemu just propagate the request of device stop to device and return success. And it's the device that decide when it can set the DEVICE_STOPPED bit and then the next CCW_CMD_READ_STATUS could read the bit.



Maybe Conny can chime in on this later as well.

[..]
   AFAIK x86 uses normal
memory access instructions. I suppose when a store instruction does not
fail, then the store must happen. Or is that only true for RAM and for
PCI memory not?

AFAIK, PCI have non-posted write which can fail but the spec doesn't
requires that.

So for a posted write a PCI device can just silently discard writes it
does not 'like' (i.e. pretend there was no such transaction in the first
place)?


I might be wrong, but it's my understanding as well.



[..]
So did PCI:

  ÂÂÂ case VIRTIO_PCI_STATUS:
  ÂÂÂÂÂÂÂ if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_stop_ioeventfd(proxy);
  ÂÂÂÂÂÂÂ }

  ÂÂÂÂÂÂÂ virtio_set_status(vdev, val & 0xFF);

  ÂÂÂÂÂÂÂ if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_start_ioeventfd(proxy);
  ÂÂÂÂÂÂÂ }

  ÂÂÂÂÂÂÂ if (vdev->status == 0) {
  ÂÂÂÂÂÂÂÂÂÂÂ virtio_pci_reset(DEVICE(proxy));
  ÂÂÂÂÂÂÂ }

As you said, it's a implementation property since qemu implements a
synchronous virtio_set_status(). But it might not be true for real
hardware which may require more time or even hard to drain a queue in
short time. That's why PCI spec mandate the driver to poll for the
status after writing zero to device status.

That is a PCI specific thing, and unfortunately my PCI skills are very
limited.

Please compare

static void vp_reset(struct virtio_device *vdev)
{
	vp_iowrite8(0, &vp_dev->common->device_status);
[..]
         while (vp_ioread8(&vp_dev->common->device_status))
                 msleep(1);

With
int virtio_finalize_features(struct virtio_device *dev)
{
[..]
         virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
         status = dev->config->get_status(dev);
         if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {

The latter clearly assumes that the first get_status() will
yield the final outcome, where the former clearly assumes that
the effect of set_status() may be delayed beyond the first
subsequent get_status().


I think the reason is that. For PCI, the spec forces the device to either set or clear the FEATURE_OK during write. But for reset, device may choose to delay setting status to 0 after the write.



Btw, if the device were to to fail resetting itself, what would
you expect to happen? I would expect the device to present a
status DEVICE_NEEDS_RESET, but that would effectively make
vp_reset() getting stuck in that loop.


Yes, then it looks like a recursion here.


  Or does the spec guarantee
that a reset must work eventually?


My understanding is that the spec implies it must succeed eventually.



[..]

PCI can use new filed. But I wonder how can that help for CCW.
Using a new mechanism to request the device to stop would help with
the synchronous vs asynchronous stuff. I.e. we could keep the FEATURES_OK
synchronous, and make REQUEST_STOP asynchronous.

Right, but as said before, it looks more like a part of device status
state machine, using new mechanism might bring extra complexity and
confusion.

Looking at CCW implementation of device status, I don't see any blocker
for implementing it synchronously if we just wait for the stop of the
device and then return?

I don't see a problem on the qemu side, but I do see a problem on
the guest side. There is a reason why you want set DEVICE_STOPPED to
be asynchronous.


So my understanding is that. at least for current kernel virtio drivers, it doesn't need DEVICE_STOPPED since there's no actual use cases. So no need for inventing new helpers.

It would be only need for the vDPA drivers that is used by userspace via vhost-vDPA. In this case, we just provide the mechanism to write and read status, vDPA driver don't need to enforce any policy in the middle (e.g polling or not).

Thanks



Regards,
Halil

[..]




[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]