Subject: RE: [PATCH v3 6/8] admin: Add theory of operation for write recording commands
> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, November 8, 2023 9:59 AM
>
> On Tue, Nov 7, 2023 at 3:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Nov 07, 2023 at 12:04:29PM +0800, Jason Wang wrote:
> > > > > > Each virtio and non-virtio device that wants to report its
> > > > > > dirty pages will do it its own way.
> > > > > >
> > > > > > > 3) inventing it in the virtio layer will be deprecated in
> > > > > > > the future for sure, as the platform will provide much richer
> > > > > > > features for logging, e.g. it can do it per PASID etc. I don't
> > > > > > > see any reason virtio needs to compete with the features that
> > > > > > > will be provided by the platform
> > > > > > Can you bring the cpu vendors and their commitment to the virtio TC
> > > > > > with timelines, so that the virtio TC can omit it?
> > > > >
> > > > > Why do we need to bring CPU vendors into the virtio TC? Virtio
> > > > > needs to be built on top of the transport or platform. There's no
> > > > > need to duplicate their job, especially considering that virtio
> > > > > can't do better than them.
> > > > >
> > > > I wanted to see a strong commitment from the cpu vendors to support
> > > > dirty page tracking.
> > >
> > > The RFC of IOMMUFD support can go back to early 2022. Intel, AMD and
> > > ARM are all supporting that now.
> > >
> > > > And the work seems to have started for some platforms.
> > >
> > > Let me quote from the above link:
> > >
> > > """
> > > Today, AMD Milan (or more recent) supports it while ARM SMMUv3.2
> > > alongside VT-D rev3.x also do support.
> > > """
> > >
> > > > Without such platform commitment, virtio also skipping it would not work.
> > >
> > > Is the above sufficient? I'm a little bit more familiar with vtd;
> > > the hw feature has been there for years.
> >
> > Repeating myself - I'm not sure that will work well for all workloads.

I think this comment applies to this proposal as well.

> > Definitely KVM did not scan PTEs. It used pagefaults with a bit per
> > page and later, as VM size grew, switched to PLM. This interface is
> > analogous to PLM,
>
> I think you meant PML actually. And it doesn't work like PML. To behave
> like PML it needs:
>
> 1) log buffers organized as a queue with indices
> 2) the device to suspend (as a #vmexit in PML) if it runs out of buffers
> 3) the device to send a notification to the driver if it runs out of buffers
>
> I don't see any of the above in this proposal. If we did that, it would be
> less problematic than what is being proposed here.

In this proposal it works slightly differently from PML. The log is a set of
write records kept by the device: the device keeps recording writes, and the
owner driver queries the recorded pages. Internally the device can implement
PML or any other scheme it finds suitable.

> Even if we manage to do that, it doesn't mean we won't have issues.
>
> 1) For many reasons it can neither see nor log via GPA, so this requires a
> traversal of the vIOMMU mapping tables by the hypervisor afterwards; that
> would be expensive and need synchronization with the guest's modification of
> the IO page table, which looks very hard.
> 2) There are a lot of special or reserved IOVA ranges (for example the
> interrupt areas in x86) that need special care; this is architectural and
> beyond the scope or knowledge of the virtio device, but not of the platform
> IOMMU. Things would be more complicated when SVA is enabled. And there could
> be other architecture-specific knowledge (e.g. PAGE_SIZE) that might be
> needed. There's no easy way to deal with those cases.

Current and future iommufd and OS interfaces can likely report these reserved
ranges already. In the current proposal, multiple ranges are supplied to the
device, and the reserved ranges are not part of them.

> We wouldn't need to care about all of them if it is done at the platform
> IOMMU level.

I agree that when the platform IOMMU supports it, and if it is better, it
should be the hypervisor's first choice, mainly because the dirty (D) bit is
already in the page table entry, rather than in a special PML queue or a racy
bitmap like what was proposed in other series.
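To make the distinction discussed above concrete, here is a minimal C sketch of
the "device records writes, owner driver queries the records" flow described in
this reply, with the driver supplying the IOVA ranges to track so that reserved
ranges (such as the x86 interrupt area) are excluded. This is only an
illustrative model under stated assumptions; none of the names (track_range,
device_log, device_record_write, driver_query_records) come from the virtio
specification or this patch series.

```c
/*
 * Hypothetical sketch only: models the write-recording flow described in
 * this thread. Not an implementation of the proposed virtio commands.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORDS 64

/* An IOVA range the driver asks the device to track (reserved ranges excluded). */
struct track_range {
	uint64_t start;
	uint64_t len;
};

/* Toy "device" state: a bounded set of recorded written-page addresses. */
struct device_log {
	uint64_t pages[MAX_RECORDS];
	unsigned int count;
};

/* Device side: record a written page only if it falls inside a tracked range. */
static void device_record_write(struct device_log *log,
				const struct track_range *r, size_t nranges,
				uint64_t iova, uint64_t page_size)
{
	uint64_t page = iova & ~(page_size - 1);

	for (size_t i = 0; i < nranges; i++) {
		if (page >= r[i].start && page < r[i].start + r[i].len) {
			if (log->count < MAX_RECORDS)
				log->pages[log->count++] = page;
			return;
		}
	}
}

/* Driver side: query and drain the recorded pages, e.g. once per migration pass. */
static unsigned int driver_query_records(struct device_log *log,
					 uint64_t *out, unsigned int max)
{
	unsigned int n = log->count < max ? log->count : max;

	memcpy(out, log->pages, n * sizeof(*out));
	log->count = 0;		/* device starts a fresh recording round */
	return n;
}

int main(void)
{
	/* Two tracked ranges; the x86 interrupt area around 0xfee00000 is not covered. */
	struct track_range ranges[] = {
		{ .start = 0x0,         .len = 0xfec00000 },
		{ .start = 0x100000000, .len = 0x100000000 },
	};
	struct device_log log = { 0 };
	uint64_t dirty[MAX_RECORDS];

	device_record_write(&log, ranges, 2, 0x1234, 4096);      /* recorded */
	device_record_write(&log, ranges, 2, 0xfee00000, 4096);  /* reserved: ignored */

	unsigned int n = driver_query_records(&log, dirty, MAX_RECORDS);
	for (unsigned int i = 0; i < n; i++)
		printf("dirty page at 0x%llx\n", (unsigned long long)dirty[i]);
	return 0;
}
```

The contrast with a PML-style design, as raised in the quoted mail, is that
there is no queue of indices the device runs out of and no suspend/notify step:
in this toy model the device simply keeps recording and the owner driver polls
the records when it needs them.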