OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [PATCH v3 6/8] admin: Add theory of operation for write recording commands


On Thu, Nov 16, 2023 at 1:37âAM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Monday, November 13, 2023 9:11 AM
> >
> > On Fri, Nov 10, 2023 at 2:46âPM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > > Hi Michael,
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Thursday, November 9, 2023 1:29 PM
> > >
> > > [..]
> > > > > Besides the issue of performance, it's also racy, assuming we are
> > > > > logging
> > > > IOVA.
> > > > >
> > > > > 0) device log IOVA
> > > > > 1) hypervisor fetches IOVA from log buffer
> > > > > 2) guest map IOVA to a new GPA
> > > > > 3) hypervisor traverse guest table to get IOVA to new GPA
> > > > >
> > > > > Then we lost the old GPA.
> > > >
> > > > Interesting and a good point. And by the way e.g. vhost has the same
> > > > issue.  You need to flush dirty tracking info when changing the
> > > > mappings somehow.  Parav what's the plan for this? Should be addressed in
> > the spec too.
> > > >
> > > As you listed the flush is needed for vhost or device-based DPT.
> >
> > What does DPT mean? Device Page Table? Let's not invent terminology which is
> > not known by others please.
> >
> Sorry for using the acronym. I meant dirty page tracking.
>
> > We have discussed it many times. You can't just depend on ATS or reinventing
> > wheels in virtio.
> The dependency is on the iommu which would have the mapping of GIOVA to GPA like any sw implementation.
> No dependency on ATS.
>
> >
> > What's more, please try not to give me the impression that the proposal is
> > optimized for a specific vendor (like device IOMMU stuffs).
> >
> You should stop calling this specific vendor thing.

Well, as you have explained, the confusion came from "DPT" ...

> One can equally say that suspend bit proposal is for the sw_vendor device who is forcing virtio hw device to only implement ioqueues + PASID + non_unified interface for PF, VF, SIOVs + non_TDISP based devices.
>
> > > The necessary plumbing is already covered for this in the query (read and
> > clear) command of this v3 proposal.
> >
> > The issue is logging via IOVA ... I don't see how "read and clear" can help.
> >
> Read and clear helps that ensures that all the dirty pages are reported, hence there is no mapping/unmapping race.

Reported as IOVA ...

> As everything is reported.
>
> > > It is listed in Device Write Records Read Command.
> >
> > Please explain how your proposal can solve the above race.
> >
> In below manner.
> 1. guest has GIOVA to GPA_1 mapping
> 2. RX packets occurred to GIOVA
> 3. device reported dirty page log for GIOVA (hypervisor is yet to read)
> 4. guest requested mapping change from GIOVA to GPA_2
> 4.1 During this IOTLB is invalidated and dirty page report is queried ensuring, it can change the mapping

It requires

1) hypervisor traps IOTLB invalidation, which doesn't work when
nesting could be offloaded (IOMMUFD has started the work to support
nesting)
2) query the device about the dirty page on each IOTLB invalidation which:
2.1) A huge round trip: guest IOTLB invalidation -> trapped by
hypervisor -> start the query from the device -> device return ->
hypervisor reports IOTLB invalidation is done -> let guest run. Have
you benchmarked the RTT in this case? There are just too many places
that cause the delay in the middle.
2.2) Guest triggerable behaviour, malicious guest can simply do
endless IOTLB invalidation to DOS the e.g admin virtqueue

>
> > >
> > > When the page write record is fully read, it is flushed.
> > > How/when to use, I think its hypervisor specific, so we probably better off not
> > documenting those details.
> >
> > Well, as the author of this proposal, at least you need to know how a hypervisor
> > can work with your proposal, no?
> >
> Likely yes, but it is not the scope of the spec to list those paths etc.

Fine, but as a reviewer I need to know if it can work with a hypervisor well.

>
> > > May be such read is needed in some other path too depending on how
> > hypervisor implemented.
> >
> > What do you mean by "May be ... some other path" here? You're inventing a
> > mechanism that you don't know how a hypervisor can use?
>
> No. I meant hypervisor may have more operations that map/unmap/flush where it may need to implement it.
> Some one may call it set_map(), some may say dma_map()...

Ok.

Thanks



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]