OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] [PATCH v2 1/3] content: Document balloon feature free page hints


On Wed, May 20, 2020 at 1:24 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 19.05.20 23:00, Alexander Duyck wrote:
> > On Tue, May 19, 2020 at 9:09 AM David Hildenbrand <david@redhat.com> wrote:

[...]

> >>>>
> >>>> Let's think this through, what about this scenario:
> >>>>
> >>>> The device sets \field{free_page_hint_cmd_id} = X
> >>>> The driver starts reporting free pages (and reports all pages it has)
> >>>> 1. Sends X to start the windows
> >>>> 2. Sends all page hints (\field{free_page_hint_cmd_id} stays X)
> >>>> 3. Sends VIRTIO_BALLOON_CMD_ID_STOP to end the window
> >>>> The driver sets \field{free_page_hint_cmd_id} = DONE or STOP
> >>>>
> >>>> The guest can reuse the pages any time (triggered by the shrinker),
> >>>> especially, during 2, before the hypervisor even processed a hint
> >>>> request. It can happen that the guest reuses a page before the
> >>>> hypervisor processes the request and before
> >>>> \field{free_page_hint_cmd_id} changes.
> >>>>
> >>>> In QEMU, the double-bitmap magic makes sure that this is guaranteed to
> >>>> work IIRC.
> >>>>
> >>>> In that case, the page has to be migrated in that windows, the
> >>>> hypervisor must not modify the content.
> >>>
> >>> If by "reuse" you mean write to or reinitialize then that is correct.
> >>> All that is really happening is that any pages that are hinted have
> >>> the potential to be left behind with the original VM and not migrated
> >>> to the new one. We get the notification that the migration happened
> >>> when CMD_ID_DONE is passed to us. At that point the hinting is
> >>> complete and the device has no use for additional data.
> >>>
> >>> Instead of CMD_ID_STOP it probably would have made more sense to call
> >>> it something like CMD_ID_PAUSE or CMD_ID_HOLD as that is what it is
> >>> really doing. It is just temporarily holding the hints off while the
> >>> hypervisor synchronizes the dirty bits from the host.
> >>
> >> I think if migration fails, it will be left set to STOP. Guess we should
> >> specify that possibility somehow as well.
> >
> > Actually I think that is a bug in the QEMU implementation. We cannot
> > let that happen otherwise the guest is never going to let the memory
> > go. If we have to abort the migration we should be calling
> > virtio_balloon_free_page_done(), probably in response to something
> > like PRECOPY_NOTIFY_CLEANUP?
> >
> > The problem is I am not that familiar with the migration process
> > itself within QEMU so I am not sure which labels mean what in the
> > notifier, but we should be signaling DONE if we are either going to
> > abort the migration or if we completed it and are now on the new
> > system.
> >
>
> We are starting to see way too many issues in QEMU code that are hard to
> tackle.
>
> 1. The issue with hinted pages not being 0 or the old content when
> re-accessed, due to the way they are skipped during migration. MST
> called that a memory corruption and that this would be broken. Very hard
> to fix, maybe impossible.
>
> 2. Some cases (migration failing while we are only sending memory) not
> setting the status to DONE. The handling is just a mess.
>
> 3. Take a look at that asynchronous iothread handling.
> virtio_ballloon_get_free_page_hints(). I think you can trick it into
> doing nasty things while you are e.g., resetting the device. Also,
> virtio_ballloon_get_free_page_hints() will just busy-loop while waiting
> for requests. Just ugly. Also that block_iothread, just hard to grasp.
>
> Besides all the other issues I keep finding.
>
> At this point I'd just love to rip out the QEMU implement and let
> whoever wants to really have it try again. The Linux side seems to be in
> a better state.
>
> This feature is in an unmaintainable state in QEMU.
>
> @MST, is there something that speaks against ripping out the QEMU and
> Linux parts, documenting in the spec that this feature should not be used?

Have we mad a decision on this? I plan to resubmit my QEMU patch set
as I had a bug in patch 3 that needs to be addressed. I'm just
wondering if I should rebase on David's patches that fix free page
hinting, or if I should just replace the patch with one that removes
free page hinting?

Thanks.

- Alex


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]