Subject: Re: [virtio-comment] Re: [PATCH 0/2] Selective queue enabling


On Fri, Jun 09, 2023 at 12:27:58PM +0200, Eugenio Perez Martin wrote:
> On Fri, Jun 9, 2023 at 12:08 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jun 08, 2023 at 10:36:19AM +0200, Eugenio Perez Martin wrote:
> > > On Thu, Jun 8, 2023 at 9:19 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Jun 08, 2023 at 08:43:18AM +0200, Eugenio Perez Martin wrote:
> > > > > On Thu, Jun 8, 2023 at 8:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jun 08, 2023 at 08:44:41AM +0800, Jason Wang wrote:
> > > > > > > On Thu, Jun 8, 2023 at 4:27 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jun 07, 2023 at 11:41:39AM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Wed, Jun 7, 2023 at 10:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Jun 07, 2023 at 10:47:12AM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > > > On Wed, Jun 7, 2023 at 10:23 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 7 Jun 2023 07:35:58 +0200, Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > > > > > > > > > > > On Tue, Jun 6, 2023 at 9:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Jun 06, 2023 at 07:55:09PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > > > > > This series allows the driver to start the device (i.e. set DRIVER_OK) with only
> > > > > > > > > > > > > > > some queues enabled, and then enable other queues later.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is the current way to migrate net device state through the control
> > > > > > > > > > > > > > > virtqueue in a software-assisted framework with vDPA:
> > > > > > > > > > > > > > > * First, only net CVQ is enabled at DRIVER_OK
> > > > > > > > > > > > > > > * All the control commands (mac address, mq, etc.) needed for the device
> > > > > > > > > > > > > > > to behave the same as the migration source are sent
> > > > > > > > > > > > > > > * Finally, all the dataplane queues are enabled.
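
For illustration, that flow from the hypervisor side might look roughly like
the sketch below; the vdpa_* helpers and struct names here are hypothetical,
not an existing API.

    /* Sketch only: helper and struct names are made up for illustration. */
    static int restore_net_state(struct vdpa_dev *dev,
                                 const struct net_state *src)
    {
            /* 1. Enable only the control virtqueue, then set DRIVER_OK. */
            vdpa_queue_enable(dev, cvq_index(dev), 1);
            vdpa_set_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);

            /* 2. Replay the control commands so the device matches the source. */
            vdpa_send_ctrl_mac(dev, src->mac);
            vdpa_send_ctrl_mq(dev, src->curr_queue_pairs);

            /* 3. Only now enable the dataplane queues. */
            for (int i = 0; i < src->curr_queue_pairs * 2; i++)
                    vdpa_queue_enable(dev, i, 1);

            return 0;
    }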
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In my opinion, this is somewhat problematic. Specifically, currently
> > > > > > > > > > > > > > devices tend to deduce how many queues are needed by looking
> > > > > > > > > > > > > > at the state at DRIVER_OK time.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Question: what is wrong with enabling queues initially and then
> > > > > > > > > > > > > > doing a reset right after DRIVER_OK? You can even allocate
> > > > > > > > > > > > > > memory for just one queue (zeroing it out).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Granted, this looks kind of ugly, but it side-steps the problem with
> > > > > > > > > > > > > > no need for spec changes.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The problem is that the rx queues can start receiving, as the guest
> > > > > > > > > > > > > already has buffers there.
> > > > > > > > > > > >
> > > > > > > > > > > > Can we reset the vq before filling it with buffers?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > They are passthrough from the guest so there is a window where the
> > > > > > > > > > > device can process rx descriptors.
> > > > > > > > > >
> > > > > > > > > > Not if there are no descriptors there.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > But the migration is driven by the hypervisor, and it cannot control
> > > > > > > > > that. The guest will likely have rx descriptors available.
> > > > > > > >
> > > > > > > > Maybe I misunderstand. Is hypervisor driving cvq while guest is driving
> > > > > > > > rx queues?
> > > > > > >
> > > > > > > No, the hypervisor tries to restore virtqueue states via cvq before the guest can drive them.
> > > > > >
> > > > > > So cvq maps to hypervisor memory?
> > > > > >
> > > > >
> > > > > From the device POV, yes, the CVQ vring is in hypervisor memory, not
> > > > > in the guest's memory. That allows the hypervisor to send the CVQ
> > > > > commands to restore the device state in the destination, without
> > > > > guest intervention.
> > > > >
> > > > > Data vqs, on the other hand, are passthrough. The device talks
> > > > > directly to the guest's vrings.
> > > > >
> > > > > Currently, this is done by emulating CVQ in the host's kernel and then
> > > > > translating the commands in whatever way suits the vdpa device best,
> > > > > using its vendor vdpa driver in the host. Other methods like PASID are
> > > > > possible too.
> > > > >
> > > > > Thanks!
> > > >
> > > > OK. So my suggestion is simple: map data vrings to a zero page in
> > > > hypervisor memory initially. Later reset and map to guest.
> > > >
> > >
> > > The idea is interesting, but we lose the net configuration in a device
> > > reset, so we need to send it again.
> >
> > Ring reset, not device reset.
> >
> > > In the case of qemu+vDPA maybe it is possible with queue_reset, like:
> > > * Map all the guest pages as usual.
> > > * Map a new zero page and forbid the guest from writing to that page. Even
> > > in the vIOMMU case, we can send all the CVQ commands before allowing the
> > > guest to modify mappings. The guest has no way to write to that page
> > > through the device since, well, the dataplane is not initialized. It's the
> > > way DPDK shadow virtqueues worked, so it should be valid.
> > > * Reset the queues back to guest passthrough.
> > >
> > > I don't like the complexity of it, but I do like that it requires even
> > > fewer changes to the device / spec.
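
As a sketch, that sequence could look like the snippet below; set_vq_address(),
queue_reset(), queue_enable() and friends are placeholders standing in for
whatever the vdpa backend actually exposes, not a real interface.

    /* Sketch only: helper names are placeholders, not an existing API. */
    /* 1. Point every dataplane vring at a zeroed, guest-inaccessible page. */
    for (int i = 0; i < num_data_vqs; i++) {
            set_vq_address(dev, i, zero_iova, zero_iova, zero_iova);
            queue_enable(dev, i);
    }
    queue_enable(dev, cvq_idx);
    set_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);

    /* 2. Restore device state through CVQ while the data vrings are empty. */
    send_all_ctrl_commands(dev);

    /* 3. Per-queue reset, then re-point each vring at the guest's memory. */
    for (int i = 0; i < num_data_vqs; i++) {
            queue_reset(dev, i);
            set_vq_address(dev, i, guest_desc[i], guest_avail[i], guest_used[i]);
            queue_enable(dev, i);
    }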
> >
> > This is exactly what I meant.
> >
> 
> Ok, sorry for the misunderstanding.
> 
> Another drawback of that is the long time a device can spend
> reconfiguring its memory maps.

The map is already used for cvq though; I don't see why it would take
much longer with this.

> Taking that into account, does it make sense to send a new version
> with bit 1 on enable?
> 
> Thanks!

Current hardware won't support the extra bit though.

> > > >
> > > > If that does not work, then I am not sure this proposal is enough
> > > > since I think devices want to have a specific point in time
> > > > where they know which queues are going to be used.
> > >
> > > In the case of net, this should not be a problem since the spec
> > > mandates 2 if !cvq, 3 if cvq but !mq, and max_virtqueue_pairs*2+1 if
> > > cvq and mq. virtio-blk also has num_queues.
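
For reference, the implied virtqueue count for net can be written as the small
helper below; it just restates the rule above and is not spec text.

    #include <stdbool.h>
    #include <stdint.h>

    /* Virtqueue count a virtio-net device exposes, given negotiated features. */
    static unsigned int net_num_vqs(bool has_cvq, bool has_mq,
                                    uint16_t max_virtqueue_pairs)
    {
            if (!has_cvq)
                    return 2;                       /* one rx/tx pair */
            if (!has_mq)
                    return 3;                       /* one pair plus cvq */
            return max_virtqueue_pairs * 2 + 1;     /* all pairs plus cvq */
    }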
> >
> > This is max, not how many there are in practice.
> >
> > > Is it even valid to not enable some of the queues?
> >
> > Yes, and Linux will do that if max_virtqueue_pairs > #CPUs.
> >
> > > I've always felt that queue_enable has been redundant before
> > > queue_reset for this reason actually.
> > >
> > > > Maybe we could use e.g. bit 1 in queue_enable to signal that?
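
Purely as an illustration of one possible reading of that (nothing like this
is defined in the spec today): bit 1 could mark a queue as "will be enabled
after DRIVER_OK", e.g.:

    /* Hypothetical encoding, for illustration only -- not spec text. */
    #define VIRTQ_ENABLE        0x1   /* existing enable bit */
    #define VIRTQ_ENABLE_LATER  0x2   /* hypothetical: enabled after DRIVER_OK */

    /* Driver: enable only cvq now, mark dataplane queues as coming later. */
    write_queue_enable(dev, cvq_idx, VIRTQ_ENABLE);
    for (int i = 0; i < num_data_vqs; i++)
            write_queue_enable(dev, data_vq_idx(i), VIRTQ_ENABLE_LATER);
    set_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);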
> > > >
> > >
> > > I'm totally ok to go in that direction.
> > >
> > > Thanks!
> > >
> > > >
> > > > > >
> > > > > >
> > > > > > > So in this case if RX queues are enabled at the same time, the device
> > > > > > > may try to queue packets to queue 0.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > > How do you do this? They are DMA from the same VF, no?
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > > Apart from that, the back and forth
> > > > > > > > > > > > > introduces latencies.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Maybe a better angle is to start all the queues as if they're reset,
> > > > > > > > > > > > > write 1 just to CVQ, configure the device, and then write 1 to all
> > > > > > > > > > > > > dataplane vqs?
> > > > > > > > > > > >
> > > > > > > > > > > > write to what?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Sorry, I was unclear. I mean to enable the vqs by writing 1 to queue_enable.
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > > Thanks.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Eugenio Pérez (2):
> > > > > > > > > > > > > > >   virtio: introduce selective queue enabling
> > > > > > > > > > > > > > >   virtio: pci support virtqueue selective enabling
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >  content.tex       | 15 +++++++++++++--
> > > > > > > > > > > > > > >  transport-pci.tex |  4 ++++
> > > > > > > > > > > > > > >  2 files changed, 17 insertions(+), 2 deletions(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.31.1
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >


