OASIS Mailing List Archives

virtio-comment message



Subject: Re: [virtio-comment] Re: [PATCH v3 6/8] admin: Add theory of operation for write recording commands


On Fri, Nov 17, 2023 at 09:57:52AM +0000, Parav Pandit wrote:
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-open.org> On Behalf Of Michael S. Tsirkin
> > Sent: Friday, November 17, 2023 3:21 PM
> > 
> > On Fri, Nov 17, 2023 at 09:41:40AM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Friday, November 17, 2023 3:08 PM
> > > >
> > > > On Fri, Nov 17, 2023 at 09:14:21AM +0000, Parav Pandit wrote:
> > > > >
> > > > >
> > > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Sent: Friday, November 17, 2023 2:16 PM
> > > > > >
> > > > > > In any case you can safely assume that many users will have
> > > > > > migration that takes seconds and minutes.
> > > > >
> > > > > Strange, but ok. I don't see any problem with current method.
> > > > > 8MB is used for a very large VM of 1TB, which takes minutes. Should be fine.
> > > >
> > > > The problem is simple: vendors selling devices have no idea how
> > > > large the VM will be. So you have to over-provision for the max VM size.
> > > > If there was a way to instead allocate that in host memory, that
> > > > would improve on this.
> > >
> > > Not sure what to over-provision for max VM size.
> > > Vendor does not know how many vcpus will be needed. It is not a different problem.
> > >
> > > When the VM migration is started, the individual tracking range is supplied by the hypervisor to the device.
> > > The device allocates the necessary memory on this instruction.
> > >
> > > When a VM of a certain size is provisioned, the member device can be provisioned for the VM size.
> > > And if it cannot be provisioned, possibly this may not be the right member device to use at that point in time.
> > 
> > For someone who keeps arguing against adding single bit registers "because it
> > does not scale" you seem very nonchalant about adding 8Mbytes.
> > 
> There is a fundamental difference in how/when a bit is used.
> One wants to use a bit for non-performance part and keep it always available vs data path.
> Not same comparison.
> 
> > I thought we have a nicely contained and orthogonal feature, so if it's optional
> > it's not a problem.
> It is optional as always.
> 
> > 
> > But with such costs and corner cases what exactly is the motivation for the
> > feature here?  
> New-generation DPUs have memory for device data path workloads but not for bits.
> 
> > Do you have a PoC showing how this works better than e.g.
> > shadow VQ?
> > 
> Not yet.
> But I don't think this can even be a criterion to consider, as a dependency on PASID is a nonstarter, along with other limitations.

You just need a dirty bit in the PTE; whether that is tied to PASID
depends very much on the platform.  For VT-d I think it is.  And if
shadow vq works as a fallback, it just might be reasonable not to do
any tracking in virtio.

> > Maybe IOMMU based and shadow VQ based tracking are the way to go initially,
> > and if there's a problem then we should add this later, on top.
> >
> CPUs that do not support IOMMU cannot shift to shadow VQ either.

I don't know what this means (no IOMMU at all?) but it looks like shadow
vq and similar approaches are in production with vdpa and have been
demonstrated for a while. All we are doing is supporting them in
virtio proper.

> > I really want us to finally make progress merging features and anything that
> > reduces scope initially is good for that.
> >
> Yes, if you prefer to split the last three patches, I am fine.
> Please let me know.

As there have not been any comments on 1-5, I don't think there's a
need to repost this just yet. I'll review 1-5 next week.
I think in the next version it might be wise to split this and post
as two series, yes.

-- 
MST
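
The sizing argument in the thread (8MB of device memory for a 1TB VM, and over-provisioning for the largest possible VM) can be sanity-checked with a little arithmetic. The sketch below assumes the device keeps a flat bitmap with one bit per tracked granule and reads "8MB" and "1TB" as binary units; the thread does not say how the device actually stores its write records, so both the layout and the helper names are hypothetical.

```python
# Hypothetical sizing sketch: flat bitmap, one bit per tracked granule.
# Neither the layout nor these helper names come from the patch series.

MIB = 1 << 20
TIB = 1 << 40


def bitmap_bytes(vm_bytes: int, granule_bytes: int) -> int:
    """Bytes needed to hold one bit per granule of guest memory."""
    granules = -(-vm_bytes // granule_bytes)  # ceiling division
    return -(-granules // 8)                  # 8 bits per byte, rounded up


def implied_granule(vm_bytes: int, tracking_bytes: int) -> int:
    """Granule size (bytes per bit) implied by a given tracking budget."""
    return vm_bytes // (tracking_bytes * 8)


if __name__ == "__main__":
    # 8 MiB of tracking memory for 1 TiB of guest memory
    print(implied_granule(TIB, 8 * MIB))      # bytes covered by each bit
    # Cost of tracking the same VM at 4 KiB page granularity
    print(bitmap_bytes(TIB, 4096) // MIB)     # MiB of bitmap
```

Under these assumptions the quoted figure corresponds to one bit per 16 KiB, and tracking at 4 KiB granularity would need four times as much memory. Because the cost grows linearly with guest memory, a device sized for the largest supported VM carries that cost even when smaller VMs are used.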


