virtio-dev message

Subject: Re: [virtio-dev] RE: [Qemu-devel] virtio-net: configurable TX queue size


On Wed, May 10, 2017 at 05:52:23PM +0800, Wei Wang wrote:
> On 05/07/2017 12:39 PM, Wang, Wei W wrote:
> > On 05/06/2017 04:37 AM, Michael S. Tsirkin wrote:
> > > On Fri, May 05, 2017 at 10:27:13AM +0800, Jason Wang wrote:
> > > > 
> > > > On 05/04/2017 18:58, Wang, Wei W wrote:
> > > > > Hi,
> > > > > 
> > > > > I want to re-open the discussion that was left off a long time ago:
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg06194.html
> > > > > , and discuss the possibility of changing the hardcoded (256) TX
> > > > > queue size to be configurable between 256 and 1024.
> > > > Yes, I think we probably need this.
> > > > 
> > > > > The reason for this request is that severe packet drops in the TX
> > > > > direction were observed with the existing hardcoded queue size of
> > > > > 256, which hurts packet-drop-sensitive guest applications that
> > > > > cannot use indirect descriptor tables. The issue goes away with a
> > > > > 1K queue size.
> > > > Do we need even more? What if we find that even 1K is not sufficient
> > > > in the future? Modern NICs have ring sizes up to ~8192.
> > > > 
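For reference, a minimal self-contained sketch of the validation a
configurable TX queue size would need, assuming the 256..1024 range proposed
above. The names and the standalone main() are illustrative only, not the
actual QEMU code; a split virtqueue size must also be a power of two:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define TX_QUEUE_MIN_SIZE 256   /* current hardcoded size */
    #define TX_QUEUE_MAX_SIZE 1024  /* proposed upper bound   */

    /* Illustrative only: a split virtqueue size must be a power of two
     * and, per the proposal, stay within [256, 1024]. */
    static bool tx_queue_size_is_valid(uint32_t size)
    {
        return size >= TX_QUEUE_MIN_SIZE &&
               size <= TX_QUEUE_MAX_SIZE &&
               (size & (size - 1)) == 0;
    }

    int main(void)
    {
        uint32_t candidates[] = { 128, 256, 512, 768, 1024, 2048 };

        for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
            printf("%u -> %s\n", (unsigned)candidates[i],
                   tx_queue_size_is_valid(candidates[i]) ? "ok" : "rejected");
        }
        return 0;
    }
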
> > > > > The concern mentioned in the previous discussion (please check the
> > > > > link
> > > > > above) is that the number of chained descriptors would exceed
> > > > > UIO_MAXIOV (1024) supported by Linux.
> > > > We could try to address this limitation but probably need a new
> > > > feature bit to allow more than UIO_MAXIOV sgs.
> > > I'd say we should split the queue size and the sg size.
> > > 
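To put a number on that concern: the chained descriptors of one TX request
are handed to the backend as a single iovec array, and the kernel caps such
an array at UIO_MAXIOV (1024 on Linux), visible from userspace via
sysconf(_SC_IOV_MAX). A trivial check, just to confirm the limit on a given
host:

    #include <stdio.h>
    #include <unistd.h>   /* sysconf(), _SC_IOV_MAX */

    int main(void)
    {
        /* Each chained descriptor of a TX request becomes one iovec entry
         * in the writev() to the tap fd, so a chain longer than this
         * cannot be submitted in one call. */
        printf("sysconf(_SC_IOV_MAX) = %ld\n", sysconf(_SC_IOV_MAX));
        return 0;
    }
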
> 
> I'm still investigating this. One question (or issue) I found in the
> implementation is that the virtio-net device changes the message
> layout when the vnet_hdr needs an endianness swap
> (i.e. virtio_needs_swap()). This change adds one more iov to the
> iov[]s passed from the driver.
> 
> To be more precise, the message from the driver could be in one
> of the two following layouts:
> Layout1:
> iov[0]: vnet_hdr + data
> 
> Layout2:
> iov[0]: vnet_hdr
> iov[1]: data
> 
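(To make the two layouts concrete, here is a minimal standalone sketch of how
they look as iovec arrays; the header and payload sizes are made up for
illustration:)

    #include <stdio.h>
    #include <sys/uio.h>   /* struct iovec */

    #define HDR_LEN  12    /* illustrative vnet_hdr length */
    #define DATA_LEN 1500  /* illustrative payload length  */

    int main(void)
    {
        static char hdr_and_data[HDR_LEN + DATA_LEN];  /* Layout1 buffer  */
        static char hdr[HDR_LEN], data[DATA_LEN];      /* Layout2 buffers */

        /* Layout1: one element carrying vnet_hdr followed by data. */
        struct iovec layout1[] = {
            { .iov_base = hdr_and_data, .iov_len = sizeof(hdr_and_data) },
        };

        /* Layout2: vnet_hdr and data in separate elements. */
        struct iovec layout2[] = {
            { .iov_base = hdr,  .iov_len = sizeof(hdr)  },
            { .iov_base = data, .iov_len = sizeof(data) },
        };

        printf("Layout1: %zu element(s)\n", sizeof(layout1) / sizeof(layout1[0]));
        printf("Layout2: %zu element(s)\n", sizeof(layout2) / sizeof(layout2[0]));
        return 0;
    }
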
> If the driver passes the message in Layout1, the following code in
> the device changes the message from Layout1 to Layout2:
> 
> if (n->needs_vnet_hdr_swap) {
>     virtio_net_hdr_swap(vdev, (void *) &mhdr);
>     sg2[0].iov_base = &mhdr;
>     sg2[0].iov_len = n->guest_hdr_len;
>     out_num = iov_copy(&sg2[1], ARRAY_SIZE(sg2) - 1,
>                        out_sg, out_num,
>                        n->guest_hdr_len, -1);
>     if (out_num == VIRTQUEUE_MAX_SIZE) {
>         goto drop;
>     }
>     out_num += 1;
>     out_sg = sg2;
> }
> 
> sg2[0] is the extra element, which potentially causes the off-by-one
> issue. I didn't find any other possibility that could cause it.
> 
> Could we keep the original layout by just copying the swapped
> "mhdr" back to original out_sg[0].iov_base?
> 
> Best,
> Wei

We can't, because that data should be read-only to the host.
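
To spell out the counting behind the off-by-one concern: the out buffers come
from guest memory and must be treated as read-only by the device, so the
swapped header has to live in a separate element (sg2[0]), and in the worst
case the array handed to the backend grows one element beyond the queue size.
A small back-of-the-envelope sketch using the constants discussed in this
thread (not the actual QEMU path):

    #include <stdio.h>

    #define PROPOSED_TX_QUEUE_SIZE 1024  /* queue size discussed above      */
    #define KERNEL_UIO_MAXIOV      1024  /* kernel limit on one iovec array */

    int main(void)
    {
        /* Worst case: the driver fills the whole ring with one chain and
         * the device prepends an element for the byte-swapped header. */
        int out_num_from_driver = PROPOSED_TX_QUEUE_SIZE;
        int out_num_after_swap  = out_num_from_driver + 1;  /* + sg2[0] */

        printf("iovec elements handed to the backend: %d (limit %d)\n",
               out_num_after_swap, KERNEL_UIO_MAXIOV);
        if (out_num_after_swap > KERNEL_UIO_MAXIOV) {
            printf("exceeds UIO_MAXIOV -> cannot be sent in one call\n");
        }
        return 0;
    }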


