
virtio-comment message


Subject: Re: [virtio-comment] [PATCH V2] virtio-net: introduce admin control virtqueue

On 2021/1/25 6:22 PM, Haozhong Zhang wrote:
On 01/25/21 15:48, Jason Wang wrote:
On 2021/1/25 3:05 PM, Haozhong Zhang wrote:
On 01/25/21 14:45, Jason Wang wrote:
On 2021/1/25 2:31 PM, Haozhong Zhang wrote:
On 01/25/21 13:52, Jason Wang wrote:
When implementing virtual devices like SR-IOV or sub-function, we're
suffering from several issues:

- There's no management interface for a management device to
     configure features and attributes of a virtual device
- A per-virtual-device control virtqueue could be very expensive, as the
     number of virtual devices could be very large
- Virtualizing a per-virtual-device control virtqueue could be very
     challenging, as we need the support of DMA isolation at queue level

So this patch introduces the feature of VIRTIO_NET_CTRL_ADMIN_VQ. This
allows the device to implement a single admin control virtqueue to
manage the features and attributes for a specific virtual device.

The idea is simple: a new virtual device id is introduced on top of
the existing virtio_net_ctrl structure. This id is transport or device
specific to uniquely identify a management or virtual device.

With this, we get a way of using management device (PF) to configure
per virtual device features and attributes. And since the admin
control virtqueue belongs to management device (PF), the DMA is
naturally isolated at device level instead of the queue level for per
virtual device control vq.

When the admin cvq is offered by the management device and the normal cvq
is offered by the virtual device, a new command class is introduced to
decide whether or not to accept commands from the normal cvq for a virtual
device.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Changes from V1:
- use 'virtual device' instead of 'function'
- introduce trust command
- clarify that the admin cvq could be used to configure management
     device itself
    content.tex | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++---
    1 file changed, 56 insertions(+), 3 deletions(-)

diff --git a/content.tex b/content.tex
index 620c0e2..989b4f6 100644
--- a/content.tex
+++ b/content.tex
@@ -2940,6 +2940,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
    \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
+\item[VIRTIO_NET_F_ADMIN_CTRL_VQ(56)] Admin control channel is
+    available.
    \item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
        value and a type of calculated hash.
@@ -3840,11 +3843,12 @@ \subsubsection{Processing of Incoming Packets}\label{sec:Device Types / Network
    \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Device Operation / Control Virtqueue}
    The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
-negotiated) to send commands to manipulate various features of
-the device which would not easily map into the configuration
+negotiated but VIRTIO_NET_F_ADMIN_CTRL_VQ is not negotiated) to send
+commands to manipulate various features of the device which would not
+easily map into the configuration space.
    All commands are of the following form:
+is not negotiated:
    struct virtio_net_ctrl {
@@ -3864,6 +3868,29 @@ \subsubsection{Control Virtqueue}\label{sec:Device Types / Network Device / Devi
    do except issue a diagnostic if \field{ack} is not
+negotiated, the driver can use the admin control virtqueue of the
+management device to manipulate features of individual virtual devices
+where the control virtqueue is not easily implemented. The definition
+of management device and virtual device is transport or device
+specific. E.g in the case of PCI SR-IOV, the management device is
+implemented via the physical function (PF), then the virtual device is
+the virtual function (VF) in this case.
+All commands are of the following form:
+struct virtio_net_admin_ctrl {
+        u32 virtual_device_id;
+        struct virtio_net_ctrl ctrl;
+The \field{virtual_device_id} is an unique transport or device specific
+identifier for a virtual device or management device. E.g for the case
+of PCI SR-IOV, it's the PCI function id. Management device MUST
+reserve 0 for \field{virtual_device_id} to identify itself.
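
Editorial sketch, not part of the patch: the excerpt above only shows that virtio_net_admin_ctrl prefixes the existing virtio_net_ctrl command with a u32 virtual_device_id, and that id 0 is reserved for the management device. The helper below illustrates how a driver might lay out the driver-readable part of such a command. The helper name admin_cmd_encode, the macro ADMIN_VDEV_ID_SELF, and the little-endian encoding of the id are assumptions made for illustration; the class/command/data ordering follows the existing virtio_net_ctrl layout, and the device-writable ack byte is left out because it lives in a separate buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Id 0 is reserved by the patch for the management device itself.
 * The macro name is hypothetical. */
#define ADMIN_VDEV_ID_SELF 0u

/*
 * Hypothetical helper: serialize the driver-readable part of an admin
 * command: virtual_device_id, then class, command and command-specific data.
 * Little-endian encoding of the id is an assumption; the patch only says u32.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t admin_cmd_encode(uint8_t *buf, size_t buf_len,
                               uint32_t virtual_device_id,
                               uint8_t cls, uint8_t command,
                               const uint8_t *data, size_t data_len)
{
    size_t need = sizeof(uint32_t) + 2 + data_len;

    assert(buf != NULL);
    if (buf_len < need)
        return 0;

    /* virtual_device_id, least significant byte first */
    buf[0] = (uint8_t)(virtual_device_id & 0xff);
    buf[1] = (uint8_t)((virtual_device_id >> 8) & 0xff);
    buf[2] = (uint8_t)((virtual_device_id >> 16) & 0xff);
    buf[3] = (uint8_t)((virtual_device_id >> 24) & 0xff);

    /* embedded virtio_net_ctrl header: class, then command */
    buf[4] = cls;
    buf[5] = command;

    /* command-specific data, if any */
    if (data_len > 0)
        memcpy(buf + 6, data, data_len);

    return need;
}
```

A caller would pass ADMIN_VDEV_ID_SELF to configure the management device itself, or the transport-specific id of a virtual device (for PCI SR-IOV, the patch suggests the PCI function id).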
Hi Jason,

One question about the valid scope of virtual_device_id. Consider a
case, that

0. A VF with PCI address "xx:yy.z" exists
1. The virtio driver commits an admin command with virtual_device_id
      "xx:yy.z" to the admin queue
2. Before the device processes the command, the virtio driver disables
      PF SRIOV, then re-enables it, and then creates a VF with the same
      address as "xx:yy.z".

Now, should the command committed in step 1 still be considered
valid after step 2? If not, shall there be a flush or abort operation
to allow the driver to drop commands between step 1 and 2? Or should
the driver simply wait until all admin commands have been processed?
Good point. My understanding is:

  From device level: the effect of the command is tied to the lifecycle of
the device (e.g. VF). So if the virtual device (VF) is destroyed (e.g. SR-IOV
is disabled), the command is no longer effective, even if the device is
re-created later. And if the virtual device is destroyed while the admin
command is being processed, the management device (PF) must fail the command.

  From driver level: The driver should synchronize between the admin control
vq processing and virtual devices create/destroy to avoid unexpected

Does this make sense?

I think we can clarify this in the spec.
I guess the above virtual device destroy also includes the case that
the virtual device is already in NEEDS_RESET status.

We can provide that, but in your case the virtual device is going to be
destroyed soon, I'm not sure how much it can help.

If so, it all makes
sense. I think the driver can synchronize easily, for example by just
waiting until all commands targeted at that VF have failed.

Before disabling SR-IOV, PF driver needs to send the notification to VF
drivers to sync admin cvq commands with its shutdown.
I think this notification can be used by the driver as a "barrier":
- all commands for this VF before the barrier that have not been
   completed can be ignored (e.g., if a failure status is later
   returned for such a command, the driver will not need to do anything
   as the device has been destroyed)
- all commands for this VF after the barrier are for the new (or
   re-enabled) VF.

Then the driver would not need to busy wait for all commands before
the barrier to complete before it can perform other operations on that
VF (e.g., disable/re-enabling...).
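
Editorial sketch, not from the thread: one way a driver could implement the "barrier" described above is a per-VF epoch counter. Each admin command records the epoch of its target VF at submission, raising the barrier bumps the epoch, and a completion whose recorded epoch is stale can be ignored without busy waiting. All names here (vf_state, admin_cmd, vf_barrier, MAX_VFS) are hypothetical, and the sketch ignores locking, which a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VFS 8 /* arbitrary limit for the sketch */

/* Per-VF bookkeeping kept by the (hypothetical) PF driver. */
struct vf_state {
    uint32_t epoch; /* bumped each time the barrier is raised for this VF */
};

/* Driver-side record of an in-flight admin command. */
struct admin_cmd {
    uint32_t vf;    /* index of the target VF */
    uint32_t epoch; /* epoch of the target VF at submission time */
};

static struct vf_state vfs[MAX_VFS];

/* Record the current epoch when a command for this VF is queued. */
static struct admin_cmd admin_cmd_submit(uint32_t vf)
{
    assert(vf < MAX_VFS);
    struct admin_cmd cmd = { .vf = vf, .epoch = vfs[vf].epoch };
    return cmd;
}

/* Raise the barrier: the VF is about to be destroyed or re-created. */
static void vf_barrier(uint32_t vf)
{
    assert(vf < MAX_VFS);
    vfs[vf].epoch++;
}

/* A completion only matters if no barrier was raised since submission. */
static bool admin_cmd_completion_is_current(const struct admin_cmd *cmd)
{
    return cmd->epoch == vfs[cmd->vf].epoch;
}
```

With this scheme, a failure status returned for a pre-barrier command is simply dropped, and commands submitted after the barrier are attributed to the new (or re-enabled) VF.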

Just to make sure we are on the same page: by "notification", I meant a software approach to sync SRIOV disabling with VF driver shutdown.

For sync with the VF: in the Linux kernel, pci_disable_sriov() shuts down the VF drivers before disabling SRIOV via config space. So for the VF it works as before: the driver needs to use a subsystem-specific way to sync between device shutdown and cvq commands. E.g. in virtio-net, we sync through the RTNL lock (see virtnet_remove). So it is guaranteed that the VF is not destroyed while we're sending a command and waiting for its result.

For sync of SRIOV enable/disable with the admin cvq: it's the responsibility of the driver to deal with the synchronization correctly (e.g. in Linux and virtio-net, RTNL is a good candidate). At the device level, it's not guaranteed that the driver is properly destroyed. So when the device receives an admin cvq command for a virtual device that does not exist, the device must fail this command.
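
Editorial sketch, not from the thread: the device-side rule above (fail any admin command whose target virtual device does not exist) could look roughly like the check below. VIRTIO_NET_OK and VIRTIO_NET_ERR are the existing control-virtqueue ack values; the presence table, SKETCH_MAX_VDEVS, and admin_cmd_handle are hypothetical names, and a real device would execute the embedded virtio_net_ctrl command where the comment indicates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ack values already used by the virtio-net control virtqueue */
#define VIRTIO_NET_OK  0
#define VIRTIO_NET_ERR 1

#define SKETCH_MAX_VDEVS 4 /* arbitrary limit for the sketch */

/* Hypothetical presence table; id 0 is the management device itself. */
static bool vdev_present[SKETCH_MAX_VDEVS] = { true };

/* Hypothetical device-side entry point: returns the ack for one command. */
static uint8_t admin_cmd_handle(uint32_t virtual_device_id)
{
    /* Unknown or destroyed virtual device: the command must fail. */
    if (virtual_device_id >= SKETCH_MAX_VDEVS ||
        !vdev_present[virtual_device_id])
        return VIRTIO_NET_ERR;

    /* A real device would process the embedded virtio_net_ctrl here. */
    return VIRTIO_NET_OK;
}
```

Because the check depends only on the device's own view of which virtual devices exist, it holds even if the driver failed to synchronize SRIOV disabling with in-flight admin commands.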

So my understanding is that for the case of Linux virtio-net, the RTNL lock is the "barrier" that you want.


I'm not sure which way
is easier; either of your proposals should work:

1) busy wait for the completion of the command
2) let the PF to deal with the possible failure




- Haozhong Zhang
