Subject: RE: [virtio-dev] RE: [virtio-comment] RE: [virtio-dev] RE: [virtio-comment] [PATCH v2] virtio-net: support setting coalescing params for multiple vqs


> From: Heng Qi <hengqi@linux.alibaba.com>
> Sent: Wednesday, January 24, 2024 6:31 PM
> 
> 
> On 2024/1/22 1:03, Parav Pandit wrote:
> >
> >> From: Heng Qi <hengqi@linux.alibaba.com>
> >> Sent: Monday, January 22, 2024 8:27 AM
> >>
> >> On 2024/1/20 5:59, Parav Pandit wrote:
> >>>> From: Heng Qi <hengqi@linux.alibaba.com>
> >>>> Sent: Wednesday, January 17, 2024 10:22 AM
> >>>>
> >>>> On 2024/1/15 9:21, Parav Pandit wrote:
> >>>>>> From: virtio-comment@lists.oasis-open.org
> >>>>>> <virtio-comment@lists.oasis-open.org> On Behalf Of Heng Qi
> >>>>>> Sent: Monday, January 15, 2024 6:36 PM
> >>>>>>
> >>>>>> Currently, each time the driver attempts to update the
> >>>>>> coalescing parameters for a vq, it needs to kick the device and
> >>>>>> wait for the ctrlq response to return.
> >>>>> It does not need to wait. This is a driver limitation: the driver does
> >>>>> not use the queue as a "queue".
> >>>>> Such a driver limitation should be removed in the driver. It does not
> >>>>> qualify as a limitation.
> >>>>
> >>>> Yes, we don't have to wait.
> >>>>
> >>>> But in general, for user commands, it is necessary to obtain the
> >>>> final results synchronously.
> >>> Yes. A user-initiated command can enqueue the request to the cvq and
> >>> go to sleep for several microseconds to milliseconds.
> >>>> The user command cannot return before the final result is obtained.
> >>>> And waiting is not the problem this patch solves.
> >>>>
> >>> By not holding the rtnl lock, the rest of the contexts that need to
> >>> enqueue requests, such as that of netdim, can progress.
> >>
> >> I would like to see the use of the rtnl lock changed.
> >>
> > Inside virtnet_rx_dim_work() there is an rtnl lock call.
> > A virtio_device-level lock should be used for the cvq instead. :)
> >
> >> In addition, I have made the netdim commands batched and asynchronous;
> >> you can refer to this patch:
> >> https://lore.kernel.org/all/1705410693-118895-4-git-send-email-
> >> hengqi@linux.alibaba.com/
> >>
> > In the driver patch listed above, the motivation "to optimize the CPU
> > overhead of the DIM worker caused by the guest being busy waiting for
> > the command response result"
> >
> > is not right, because the guest is still busy waiting.
> 
> There is no busy waiting for guests; see get_cvq_work().
> 
Ok. It is not always busy waiting, but sometimes it is. virtnet_cvq_response() should not have a flag.

Whoever gets the OS-global rtnl lock ends up calling virtnet_cvq_response(), checking the flag, and releasing the lock.
It shouldn't be done this way, with try-lock etc.
The rtnl lock is not supposed to protect some low-level driver ctrlvq response flag.
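
To make this concrete, below is a minimal sketch of the direction I am
suggesting (the struct and function names are hypothetical, not the
actual virtio_net code): a virtio_device-level mutex serializes cvq
submission, and the submitter sleeps on a per-command completion
instead of polling a response flag under rtnl.

#include <linux/mutex.h>
#include <linux/completion.h>
#include <linux/types.h>

/* Hypothetical per-command context. */
struct cvq_cmd {
	struct completion done;	/* completed from the cvq callback */
	u8 status;		/* VIRTIO_NET_OK / VIRTIO_NET_ERR */
};

/* Hypothetical cvq context in the device private data. */
struct virtnet_cvq {
	struct mutex lock;	/* virtio_device-level lock, not rtnl */
	/* virtqueue pointer, scatterlists, etc. */
};

static int virtnet_cvq_submit_and_wait(struct virtnet_cvq *cvq,
				       struct cvq_cmd *cmd)
{
	init_completion(&cmd->done);

	mutex_lock(&cvq->lock);
	/* virtqueue_add_sgs() + virtqueue_kick() would go here */
	mutex_unlock(&cvq->lock);

	/* Sleep until the device responds; no busy waiting, no rtnl. */
	wait_for_completion(&cmd->done);
	return cmd->status;
}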


> > Without batching, due to the rtnl lock, every VQ command is serialized as
> > one outstanding command at a time in virtnet_rx_dim_work().
> > Due to this, the device is unable to benefit from DMA batching at large
> > scale.
> 
> Adding dim commands is now asynchronous, and the device will receive
> batches of commands.
> 
I am comparing code that takes no rtnl lock with the code you posted.
The reference code of patch [1] needs to start without the OS-global rtnl lock as the baseline.

[1] https://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
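
As a reference point, that baseline looks roughly like the sketch below
(again with hypothetical names; queue_cmd_buf() and notify_cvq() are
illustrative placeholders, not existing functions): netdim contexts
enqueue their coalescing commands without rtnl, and the device sees the
whole batch behind a single notification.

/* Hypothetical helpers; not the actual driver code. */
static void virtnet_dim_enqueue_batch(struct virtnet_cvq *cvq,
				      struct cvq_cmd *cmds, int n)
{
	int i;

	mutex_lock(&cvq->lock);
	for (i = 0; i < n; i++)
		queue_cmd_buf(cvq, &cmds[i]);	/* add buffers, no kick yet */
	notify_cvq(cvq);	/* one notification for the whole batch */
	mutex_unlock(&cvq->lock);

	/*
	 * No context waits here; responses are consumed from the cvq
	 * callback, so a caller holding rtnl is never blocked on the cvq.
	 */
}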

> >
> >>>>> This will enable the driver to enqueue multiple cvq commands without
> >>>>> waiting for the previous one.
> >>>>> This will also enable the device to perform natural coalescing on
> >>>>> multiple commands.
> >>>>
> >>>> When user commands are batched, ensuring synchronization is a
> >>>> concern.
> >>>>
> >>>>>> The following path is observed: 1. The driver kicks the device; 2.
> >>>>>> After the device receives the kick, CPU scheduling occurs and the
> >>>>>> device DMAs multiple buffers multiple times; 3. The device completes
> >>>>>> processing and replies with a response.
> >>>>>> When large-queue devices issue multiple requests and kick the
> >>>>>> device frequently, this often interrupts the work of the device-side
> >>>>>> CPU.
> >>>>> When there are large devices and multiple driver notifications from a
> >>>>> CPU that is N times faster than the device-side CPU, the device may
> >>>>> find natural coalescing on the commands of a given cvq.
> >>>>
> >>>> First we have to solve batching the user (ethtool) commands added to
> >>>> the ctrlq.
> >>>> Even if they are processed in a batched way on the device side, the
> >>>> number of kicks and the number of backend DMAs is not reduced.
> >>> Driver notifications are PCI writes, so they should not hamper the
> >>> device side, which can ignore them when it does not care about them.
> >>
> >> Driver notifications need to be processed by the DPU, which
> >> interferes with the CPU on the DPU.
> >>
> > I was asking if there is any way for your DPU to ignore these
> > notifications while the previous one is pending.
> > From your above description, it seems there isn't.
> 
> While the device is processing the request, additional kicks are ignored.
> 
Ok, that is good. In that case the kicks are not interfering, right?

