Subject: Re: [virtio-dev] Dirty Page Tracking (DPT)


On Fri, Mar 06, 2020 at 10:40:13AM -0500, Rob Miller wrote:
> I understand that DPT isn't really on the forefront of the vDPA framework, but
> wanted to understand if there are any initial thoughts on how this would work...

And judging by the next few paragraphs, you are actually
talking about vhost-pci, right?

> In the migration framework, in its simplest form, (I gather) it's QEMU via KVM
> that is reading the dirty page table, converting bits to page numbers, then
> flushing remote VM/copying local page(s)->remote VM, etc.
> 
> While this is fine for a VM (say VM1) dirtying its own memory, where the
> accesses are trapped in the kernel and the log is updated, I'm not sure
> what happens in the situation of vhost, where a remote VM (say VM2) is
> dirtying up VM1's memory, since it can directly access it, during packet
> reception for example.
> Whatever technique is employed to catch this, how would this differ from a
> HW-based virtio device doing DMA directly into a VM's DDR, with respect to
> DPT? Is QEMU going to have a 2nd place to query the dirty logs - i.e. the
> vDPA layer?

I don't think anyone has a good handle on vhost-pci migration yet.
But I think a reasonable way to handle that would be to
activate dirty tracking in VM2's QEMU.

And then VM2's QEMU would periodically copy the bits to the log - does
this sound right?
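
Something like this, conceptually - a rough sketch with made-up names,
just to show the bitmap merge I have in mind (VM2's QEMU ORing its own
dirty bitmap for the shared region into the log VM1's migration code
reads):

#include <stdint.h>

/* OR VM2's dirty bitmap for the shared region into the vhost-style log
 * that VM1's migration code consumes.  vm1_log_bit_off is where the
 * shared region starts within VM1's log, npages is the region size in
 * pages.  Purely illustrative, not real QEMU code. */
void sync_shared_region_log(uint8_t *vm1_log, uint64_t vm1_log_bit_off,
                            const uint8_t *vm2_bitmap, uint64_t npages)
{
    for (uint64_t i = 0; i < npages; i++) {
        if (vm2_bitmap[i / 8] & (1u << (i % 8))) {
            uint64_t bit = vm1_log_bit_off + i;
            vm1_log[bit / 8] |= 1u << (bit % 8);
        }
    }
}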

> Further, I heard about a SW-based DPT within the vDPA framework for those
> devices that do not (yet) support DPT inherently in HW. How is this envisioned
> to work?

What I am aware of is simply switching to a software virtio
for the duration of migration. The software can be pretty simple
since the formats match: just copy available entries to the device ring,
and for used entries: when a used ring entry shows up, mark the page
dirty and then copy the used entry to the guest ring.
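
Something along these lines (just a sketch - the fetch_*/push_* helpers
and the element layout are made up, the only interesting bit is where
the dirty log gets set):

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12

struct elem { uint64_t gpa; uint32_t len; uint16_t id; };

/* Placeholder ring accessors - not a real API. */
extern bool fetch_guest_avail(struct elem *e);
extern void push_device_avail(const struct elem *e);
extern bool fetch_device_used(struct elem *e);
extern void push_guest_used(const struct elem *e);

static void mark_dirty(uint8_t *log, uint64_t gpa, uint32_t len)
{
    if (!len)
        return;
    for (uint64_t pfn = gpa >> PAGE_SHIFT;
         pfn <= (gpa + len - 1) >> PAGE_SHIFT; pfn++)
        log[pfn / 8] |= 1u << (pfn % 8);
}

/* One pass of the forwarding loop: shuffle entries between the guest
 * rings and the device rings, logging pages written by the device
 * (used entries) so migration resends them. */
void forward_once(uint8_t *dirty_log)
{
    struct elem e;

    while (fetch_guest_avail(&e))
        push_device_avail(&e);               /* nothing written to guest memory yet */

    while (fetch_device_used(&e)) {
        mark_dirty(dirty_log, e.gpa, e.len); /* device wrote this buffer */
        push_guest_used(&e);
    }
}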


Another approach that I proposed, and that Alex Duyck prototyped at some
point, is the guest driver touching the page in question before
processing it within the guest, e.g. by an atomic XOR with 0.
Sounds attractive but didn't perform all that well.
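
The guest-side touch is basically just this (illustrative only, not
Alex's actual patch):

#include <stdint.h>

/* Force a CPU write to the buffer before the guest reads it: XOR with 0
 * leaves the data intact but makes write-protection based dirty logging
 * on the host see the page as written. */
static inline void touch_buffer(void *buf)
{
    __atomic_fetch_xor((uint8_t *)buf, 0, __ATOMIC_RELAXED);
}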


> Finally, for those HW vendors that do support DPT in HW, a mapping of a bit ->
> page isn't really an option, since no one wants to do a byte-wide
> read-modify-write across the PCI bus; mapping a whole byte to a page is
> likely more desirable - the HW can just do non-posted writes to the dirty page
> table. If byte-wise, then the QEMU/vDPA layer has to either fix up the mapping
> (from byte -> bit) or have the capability to handle the granularity diffs.
> 
> Thoughts?
> 
> Rob Miller
> rob.miller@broadcom.com
> (919)721-3339

If using an IOMMU, DPT can also be done using either PRI or the dirty bit
in a PTE. PRI is an interrupt so it can kick off a thread to set bits in
the log I guess, but if it's the dirty bit then I don't think there's an
interrupt, and a polling thread does not sound attractive.  I guess
we'll need a new interface to notify vDPA that QEMU is looking for dirty
logs, and then vDPA can send them to QEMU in some way.  It will probably
be good enough to support vendor-specific logging interfaces, too.  I don't
actually have hardware which supports either, so actually coding it up is
not yet practical.
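
To give an idea of the shape of such an interface (completely made up -
none of this is an existing vhost/vDPA ioctl or structure):

#include <stdint.h>

/* Sketch of a dirty-log interface between QEMU and a vDPA backend:
 * start logging, sync the backend's dirty state into a bitmap, stop. */
struct vdpa_dirty_log {
    uint64_t iova;      /* start of the tracked region */
    uint64_t size;      /* region size in bytes */
    uint8_t *bitmap;    /* one bit per page, filled in by the backend */
};

struct vdpa_dpt_ops {
    int (*log_start)(void *dev, const struct vdpa_dirty_log *log);
    int (*log_sync)(void *dev, const struct vdpa_dirty_log *log);
    int (*log_stop)(void *dev);
};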

Further, at my KVM Forum presentation I proposed a virtio-specific
pagefault handling interface.  If there's a wish to standardize and
implement that, let me know and I will try to write this up in a more
formal way.


-- 
MST


