virtio-comment message



Subject: Re: [virtio-comment] Re: [PATCH V2 6/6] virtio-pci: implement dirty page tracking




On 11/6/2023 6:22 PM, Michael S. Tsirkin wrote:
On Mon, Nov 06, 2023 at 12:06:39PM +0800, Zhu, Lingshan wrote:
Intel production systems have worked with a similar bitmap-based
dirty page tracking solution for years.
and then VMs became bigger and PML was introduced.
So you agree we should track dirty pages through the platform facilities?
I am glad to hear that!
I just said that I thought there's no PML in platform facilities and that
might be a problem. Am I wrong?
That's true, some platforms may not have PML.
The thing is, most platforms have PML, and this is a tradeoff:
1) don't implement dirty page tracking in virtio, but use the platform. Then some platforms
cannot track dirty pages in HW.
2) implement dirty page tracking in virtio, but most platforms won't use it.

As you can see, I have posted this virtio dirty page tracking as a backup here,
so this should be your call anyway.

Otherwise the device should report PFNs, which is not very practical.
Why not?
Really? The device reports PFNs?
What happens if the device keeps writing to a small piece of memory?
then you just report the PFN once. Should work like PML really -
IOW the device maintains a bit per page internally and reports the
PFN when the bit is set.
Yes, it is like an a/d bit. With a bitmap, when the device keeps writing a small region of memory,
it only needs to mark the bits as dirty once.

When reporting PFNs, the device needs to repeatedly report the same bunch of PFNs (0x1234abcd..),
which is not very efficient.
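
As a rough sketch of why the bitmap update is cheap (hypothetical code,
not from the patch):

    /* Hypothetical sketch of the bitmap update: marking a page dirty is
     * idempotent, so a device that keeps writing the same small region
     * sets each bit once, while PFN reporting would emit the same PFN
     * on every write. */
    #include <stdint.h>

    #define PAGE_SHIFT 12

    static void mark_page_dirty(uint8_t *bitmap, uint64_t iova)
    {
        uint64_t pfn = iova >> PAGE_SHIFT;

        /* Already-set bits stay set; repeated writes are free. */
        bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
    }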

And we need to merge the device dirty pages into the QEMU dirty page bitmap anyway.
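
For illustration, the merge is just a bitwise OR (a sketch, assuming both
bitmaps use the same granularity and cover the same range; the names are
made up):

    #include <stddef.h>

    /* A page is dirty if either the device or QEMU marked it. */
    static void merge_dirty_bitmap(unsigned long *qemu_bmap,
                                   const unsigned long *dev_bmap,
                                   size_t nwords)
    {
        size_t i;

        for (i = 0; i < nwords; i++)
            qemu_bmap[i] |= dev_bmap[i];
    }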

      And the resolution is apparently 8 pages? You have just multiplied
      the migration bandwidth by a factor of 8.

No, as described in the comments, the tracking granularity is controlled by
\field{gra_power}; one bit represents a page with page_size = 2^(12 +
gra_power). This can also be used to reduce the size of the bitmap.
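
As a worked example (a sketch; only gra_power and the 2^(12 + gra_power)
formula come from the patch, the rest is illustrative):

    #include <stdint.h>

    /* Bitmap bytes needed to track mem_size bytes of guest memory at a
     * granularity of 2^(12 + gra_power) bytes per bit. */
    static uint64_t bitmap_bytes(uint64_t mem_size, unsigned gra_power)
    {
        uint64_t page_size = 1ull << (12 + gra_power);
        uint64_t pages = (mem_size + page_size - 1) / page_size;

        return (pages + 7) / 8; /* one bit per page */
    }

    /* For a 16 GiB guest: gra_power = 0 gives a 512 KiB bitmap at 4 KiB
     * per bit; gra_power = 3 shrinks it to 64 KiB, but each set bit now
     * reports 32 KiB as dirty. */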
.. at the cost of increasing migration bandwidth.
The device is very likely to write a neighbor page,
how likely? and e.g. with slab randomization too? please collect some
data and show it.
DMA writes to contiguous memory; take the example of DMA writing the ring buffer,
it is likely to write a neighbor page. This is called memory locality.

and this happens
everywhere, for example when a CPU reads 64-byte-aligned data.
CPUs don't need to send their cache across a bandwidth constrained
shared network.
This is not about cacheline size, just saying the data is 64-byte aligned.

This is a tradeoff
tradeoff between which two options?
1) small tracking granularity and a big bitmap
2) big tracking granularity and a smaller bitmap (relying on memory locality)

"To prevent a read-modify-write procedure, if a memory page is dirty,
optionally the device is permitted to set the entire byte, which encompasses the relevant bit, to 1."

This is optional, and since DMA is very likely to write a neighbor page, the device transmits a whole byte anyway
when a bit is dirty.
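
A sketch of that optional byte-granular update (the quoted sentence is
the normative text; the code and names here are illustrative):

    #include <stdint.h>

    /* Instead of a read-modify-write to set a single bit, the device may
     * store 0xFF over the byte containing that bit, marking the 8 pages
     * covered by that byte as dirty with one plain write. */
    static void mark_dirty_whole_byte(uint8_t *bitmap, uint64_t pfn)
    {
        bitmap[pfn / 8] = 0xFF;
    }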

How about we use the platform dirty page tracking facility rather than implementing this in virtio, as Jason suggested?

Without something like PML it likely won't scale either.
So that would be a platform issue which we don't need to take care of?
Intel VT-d can do this for sure.
Intel VT-d supports PML from the IOMMU? I didn't realize. Could you help
me find it in the doc please? Which hardware supports this in the field?
What about other vendors?
Please refer to the Intel VT-d spec: https://cdrdv2-public.intel.com/774206/vt-directed-io-spec%20.pdf

As far as I know, AMD and ARM support this too.

Anyway, as said above, this should be your call whether to implement dirty page tracking in virtio.



