Subject: RE: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs
> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, January 24, 2024 6:31 PM
>
> On 2024/1/22 1:03 PM, Parav Pandit wrote:
> >
> >> From: Heng Qi <hengqi@linux.alibaba.com>
> >> Sent: Monday, January 22, 2024 8:27 AM
> >>
> >> On 2024/1/20 5:59 PM, Parav Pandit wrote:
> >>>> From: Heng Qi <hengqi@linux.alibaba.com>
> >>>> Sent: Wednesday, January 17, 2024 10:22 AM
> >>>>
> >>>> On 2024/1/15 9:21 PM, Parav Pandit wrote:
> >>>>>> From: virtio-comment@lists.oasis-open.org
> >>>>>> <virtio-comment@lists.oasis-open.org> On Behalf Of Heng Qi
> >>>>>> Sent: Monday, January 15, 2024 6:36 PM
> >>>>>>
> >>>>>> Currently, each time the driver attempts to update the
> >>>>>> coalescing parameters for a vq, it needs to kick the device and
> >>>>>> wait for the ctrlq response to return.
> >>>>> It does not need to wait. This is a driver limitation: the driver
> >>>>> does not use the queue as a "queue".
> >>>>> Such a driver limitation should be removed in the driver. It does
> >>>>> not qualify as a device limitation.
> >>>>
> >>>> Yes, we don't have to wait.
> >>>>
> >>>> But in general, for user commands, it is necessary to obtain the
> >>>> final results synchronously.
> >>> Yes. A user-initiated command can enqueue the request to the cvq
> >>> and go to sleep for several micro- to milliseconds.
> >>>> The user command cannot return before the final result is obtained.
> >>>> And waiting is not the problem this patch solves.
> >>>>
> >>> By not holding the rtnl lock, the rest of the contexts that need to
> >>> enqueue requests can make progress, such as that of netdim.
> >>
> >> I would like to see the use of the rtnl lock changed.
> >>
> > Inside virtnet_rx_dim_work() there should be no rtnl lock call.
> > A virtio_device-level lock is to be used for the cvq.
> :)
>
> >> In addition, I have made the netdim commands batched and
> >> asynchronous; you can refer to this patch:
> >> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
> >>
> > The motivation listed in the above driver patch, "to optimize the CPU
> > overhead of the DIM worker caused by the guest being busy waiting for
> > the command response result",
> >
> > is not right,
> > because the guest is still busy waiting.
>
> There is no busy wait for guests, see get_cvq_work().
>
Ok, it is not always busy waiting, but sometimes it is.

virtnet_cvq_response() should not have a flag. Whoever gets the OS-global
rtnl lock calls virtnet_cvq_response(), checks the flag and releases the
lock. It shouldn't be done this way with try-lock etc. The rtnl lock is not
supposed to protect some low-level driver ctrlvq response flag.

> > Without batching, due to the rtnl lock, every VQ command is serialized
> > as one outstanding command at a time in virtnet_rx_dim_work().
> > Due to this, the device is unable to benefit from DMA batching at
> > large scale.
>
> Adding dim commands is now asynchronous, and the device will receive
> batches of commands.
>
I am comparing the code without the rtnl lock with the code you posted.
The reference code of patch [1] needs to begin without the OS-global rtnl
lock as the baseline.

[1] https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/

> >>>>> This will enable the driver to enqueue multiple cvq commands
> >>>>> without waiting for the previous one.
> >>>>> This will also enable the device to find natural coalescing done
> >>>>> on multiple commands.
> >>>>
> >>>> When batched user commands occur, ensuring synchronization is a
> >>>> concern.
> >>>>
> >>>>>> The following path is observed: 1. The driver kicks the device; 2.
> >>>>>> After the device receives the kick, CPU scheduling occurs and it
> >>>>>> DMAs multiple buffers multiple times; 3.
> >>>>>> The device completes processing and replies with a response.
> >>>>>>
> >>>>>> When large-queue devices issue multiple requests and kick the
> >>>>>> device frequently, this often interrupts the work of the
> >>>>>> device-side CPU.
> >>>>> When there are large devices and multiple driver notifications by
> >>>>> a CPU that is N times faster than the device-side CPU, the device
> >>>>> may find natural coalescing on the commands of a given cvq.
> >>>>
> >>>> First we have to solve batch-adding user (ethtool) commands to the
> >>>> ctrlq. Even if they are processed in a batched way on the device
> >>>> side, the number of kicks and the number of backend DMAs have not
> >>>> been reduced.
> >>> Driver notifications are PCI writes, so they should not hamper the
> >>> device side, which can ignore them when it does not care about them.
> >>
> >> Driver notifications need to be processed by the DPU, which
> >> interferes with the CPU on the DPU.
> >>
> > I was asking if there is any way for your DPU to ignore these
> > notifications while the previous one is pending.
> > From your above description, it seems there isn't.
>
> While the device is processing the request, additional kicks are ignored.
>
Ok, that is good. In that case the kicks are not interfering, right?