Subject: Re: [virtio-comment] Re: [PATCH RESEND] virtio-net: support setting coalescing params for multiple vqs
On Thu, Dec 21, 2023 at 03:01:57PM +0800, Heng Qi wrote:
>
>
> On 2023/12/21 2:56 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 20, 2023 at 10:40:34PM +0800, Heng Qi wrote:
> > > Currently, each time the driver attempts to update the coalescing parameters
> > > for a vq, it needs to kick the device and wait for the ctrlq response to return.
> > But there's no fundamental reason for it to do so.
>
> Indeed, please look at the current upstream netdim code:
>
> static void virtnet_rx_dim_work(struct work_struct *work)
> {
>         struct dim *dim = container_of(work, struct dim, work);
>         struct receive_queue *rq = container_of(dim,
>                         struct receive_queue, dim);
>         struct virtnet_info *vi = rq->vq->vdev->priv;
>         struct net_device *dev = vi->dev;
>         struct dim_cq_moder update_moder;
>         int i, qnum, err;
>
>         if (!rtnl_trylock())
>                 return;
>
>         /* Each rxq's work is queued by "net_dim()->schedule_work()"
>          * in response to NAPI traffic changes. Note that dim->profile_ix
>          * for each rxq is updated prior to the queuing action.
>          * So we only need to traverse and update profiles for all rxqs
>          * in the work which is holding rtnl_lock.
>          */
>         for (i = 0; i < vi->curr_queue_pairs; i++) {            <-----------------
>                 rq = &vi->rq[i];
>                 dim = &rq->dim;
>                 qnum = rq - vi->rq;
>
>                 if (!rq->dim_enabled)
>                         continue;
>
>                 update_moder = net_dim_get_rx_moderation(dim->mode,
>                                                          dim->profile_ix);
>                 if (update_moder.usec != rq->intr_coal.max_usecs ||
>                     update_moder.pkts != rq->intr_coal.max_packets) {
>                         err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
>                                                                update_moder.usec,
>                                                                update_moder.pkts);
>                         if (err)
>                                 pr_debug("%s: Failed to send dim parameters on rxq%d\n",
>                                          dev->name, qnum);
>                         dim->state = DIM_START_MEASURE;
>                 }
>         }
>
>         rtnl_unlock();
> }
>
> > It can just as well submit multiple commands and then wait.
>
> Sending multiple commands and then waiting reduces the number of kicks,
> but it does not reduce the number of device DMAs. I already responded to
> this in a thread to Jason.
> Please check:
>
> https://lists.oasis-open.org/archives/virtio-comment/202312/msg00142.html
>
> "
> Overall, batch reqs are sufficient, because the current major overhead is
> the number of DMAs.
> For example, for a device with 256 queues:
>
> For the current upstream code, the overhead is 256 kicks + 256*8 DMAs.
> The overhead of batch cmds is 1 kick + 256*8 DMAs.
> The overhead of batch reqs is 1 kick + 8 DMAs.
>
> Here is the breakdown of the 8 DMAs:
> - Get the avail idx: 1 time
> - Pull the available ring information: 1 time
> - Pull the descs pointed to by the avail ring: 3 times
> - Pull the hdr and out bufs pointed to by the avail ring descs: 2 times
> - Write once to the status buf pointed to by the status desc: 1 time
> "
>
> Thanks.

So there's more DMA, but it's all slow path. Why do we need to micro-optimize it?
What is the overhead practically, in milliseconds?

--
MST
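For context, a minimal sketch of the "batch cmds" approach Michael suggests (queue several per-vq coalescing commands on the ctrl vq, kick once, then wait for all of them) might look like the code below. This is an illustration only, not code from the patch or the thread: the struct and helper names (coal_vq_cmd, virtnet_send_coal_vq_cmds_batched) and the caller-supplied cmds[] array are assumptions, while virtqueue_add_sgs(), virtqueue_kick() and virtqueue_get_buf() are the standard virtio APIs the driver already uses in virtnet_send_command().

/*
 * Illustrative sketch only (not from the thread): queue one
 * VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET command per vq on the ctrl vq,
 * kick once, then wait for every status byte.  The struct and helper
 * names are assumptions for this example; cmds[] must live in
 * DMA-safe memory (e.g. kmalloc'd), not on the stack.
 */
struct coal_vq_cmd {
        struct virtio_net_ctrl_hdr hdr;
        struct virtio_net_ctrl_coal_vq coal_vq;
        u8 status;
};

static int virtnet_send_coal_vq_cmds_batched(struct virtnet_info *vi,
                                             struct coal_vq_cmd *cmds, int n)
{
        struct scatterlist hdr_sg, body_sg, stat_sg, *sgs[3];
        unsigned int len;
        int i, err;

        for (i = 0; i < n; i++) {
                /* caller has already filled cmds[i].coal_vq.vqn and .coal */
                cmds[i].hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
                cmds[i].hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET;
                cmds[i].status = VIRTIO_NET_ERR;

                sg_init_one(&hdr_sg, &cmds[i].hdr, sizeof(cmds[i].hdr));
                sg_init_one(&body_sg, &cmds[i].coal_vq, sizeof(cmds[i].coal_vq));
                sg_init_one(&stat_sg, &cmds[i].status, sizeof(cmds[i].status));
                sgs[0] = &hdr_sg;
                sgs[1] = &body_sg;
                sgs[2] = &stat_sg;

                /* two out sgs (hdr + coal_vq), one in sg (status) */
                err = virtqueue_add_sgs(vi->cvq, sgs, 2, 1, &cmds[i], GFP_ATOMIC);
                if (err)
                        return err;
        }

        /* one kick for the whole batch instead of one kick per command */
        virtqueue_kick(vi->cvq);

        /* busy-poll for all completions, as virtnet_send_command() does;
         * a production version would want a timeout here
         */
        for (i = 0; i < n; i++)
                while (!virtqueue_get_buf(vi->cvq, &len))
                        cpu_relax();

        for (i = 0; i < n; i++)
                if (cmds[i].status != VIRTIO_NET_OK)
                        return -EINVAL;

        return 0;
}

Even with this batching, each command in the loop still consumes its own descriptors and buffers, which is why Heng Qi's numbers above put "batch cmds" at 1 kick + 256*8 DMAs, whereas the proposed "batch reqs" format packs all per-vq entries into a single request and costs roughly one request's worth of DMAs.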