OASIS Mailing List Archives

virtio-dev message

Subject: Re: [virtio] New virtio balloon...

"Michael S. Tsirkin" <mst@redhat.com> writes:
> Also copy virtio-dev since this is clearly implementation ...
> On Thu, Jan 30, 2014 at 07:34:30PM +1030, Rusty Russell wrote:
>> Hi,
>>         I tried to write a new balloon driver; it's completely untested
>> (as I need to write the device).  The protocol is basically two vqs, one
>> for the guest to send commands, one for the host to send commands.
>> Some interesting things come out:
>> 1) We do need to explicitly tell the host where the page is we want.
>>    This is required for compaction, for example.
>> 2) We need to be able to exceed the balloon target, especially for page
>>    migration.  Thus there's no mechanism for the device to refuse to
>>    give us the pages.
>> 3) The device can offer multiple page sizes, but the driver can only
>>    accept one.  I'm not sure if this is useful, as guests are either
>>    huge page backed or not, and returning sub-pages isn't useful.
>> Linux demo code follows.
>> Cheers,
>> Rusty.
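
[Editor's sketch, not the demo code mentioned in the quoted message, which is not included in this archive page.] The two-virtqueue layout and points 1-3 might look roughly like the following in C. All names, command codes and field layouts here are hypothetical; the thread does not define a wire format. Note that the guest command carries explicit page frame numbers (point 1), has no status field through which the device could refuse (point 2), and uses the single page size the driver accepted (point 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BALLOON_MAX_PAGES 8 /* hypothetical batch size */

/* Hypothetical commands sent on the guest->host virtqueue. */
enum balloon_guest_cmd_type {
    BALLOON_GUEST_GIVE_PAGES = 1, /* guest hands these pages to the host */
    BALLOON_GUEST_TAKE_PAGES = 2  /* guest takes these pages back */
};

/* Guest->host command: the guest names each page explicitly. */
struct balloon_guest_cmd {
    uint32_t type;
    uint32_t page_shift; /* log2 of the one page size the driver accepted */
    uint32_t npages;
    uint64_t pfn[BALLOON_MAX_PAGES];
};

/* Host->guest command: only a target; the host-led model needs little else. */
struct balloon_host_cmd {
    uint64_t target_pages;
};

static void balloon_cmd_init(struct balloon_guest_cmd *cmd,
                             uint32_t type, uint32_t page_shift)
{
    assert(page_shift < 64);
    cmd->type = type;
    cmd->page_shift = page_shift;
    cmd->npages = 0;
}

/* Add one page by guest-physical address.
 * Returns 0 on success, -1 if the batch is full or the address is
 * not aligned to the accepted page size. */
static int balloon_cmd_add_page(struct balloon_guest_cmd *cmd, uint64_t gpa)
{
    uint64_t mask = ((uint64_t)1 << cmd->page_shift) - 1;

    if (cmd->npages >= BALLOON_MAX_PAGES || (gpa & mask) != 0)
        return -1;
    cmd->pfn[cmd->npages++] = gpa >> cmd->page_shift;
    return 0;
}
```
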
> More comments:
> 	- for projects like auto-ballooning that Luiz works on,
> 	  it's not nice that to swap page 1 for page 2
> 	  you have to inflate then deflate
> 	  besides overhead this confuses the host:
> 	  imagine you tell QEMU to increase target,
> 	  meanwhile guest inflates temporarily,
> 	  QEMU thinks okay done, now you suddenly deflate.

I originally allowed the host to deny the deflate, which was why I
reversed it.  Then I realized that was a bad idea.  I can switch it back.

> 	- what's the status of page returned from balloon?
> 	  is it zeroed or can it have old data in there?
> 	  I think in practice Linux will sometimes map in a zero page,
> 	  so guest can save cycles and avoid zeroing it out.
> 	  I think we should tell this to guest when returning
> 	  pages.

QEMU may not know, since the kernel may not tell it.  We should assume
nothing, and let the guest zero if it needs to.  Seems like a premature

> 	- I am guessing EXTRA_MEM is for uses like the ones proposed by
> 	  Frank Swiderski from google that inflate/deflate balloon
>           whenever guest wants (look for "Add a page cache-backed balloon
> 	  device driver").
>           this is useful but - we need to distinguish pages
> 	  like this from regular inflate.
> 	  it's not just a counter, and host needs a way to know
> 	  that its target is reached

The driver needs to explicitly ask for pages in that region.

> 	- do we even want to allow guest not telling host when it wants
> 	  to reuse the page?
> 	  if yes, I think this should be per-page somehow: when balloon
> 	  is inflated guest should tell host whether it
> 	  expects to use this page.

I decided against it.  Making that optional got us into a mess, so now
it's compulsory.  That also fits better with the idea of a negative

> So I think we should accommodate these uses, and so we want the following flags:
> 	- WEAK_TARGET (that's the EXTRA_MEM but I think done in a better way)
>           flag that specifies pages do not count against target,
> 	  can be taken out of balloon.
> 	  EXTRA_MEM suggests there's an upper limit on balloon size
> 	  but IMHO that's just extra work for host: host does not care
> 	  I think, give it as much as you want.
> 	  set by guest, used by host

I think that Daniel really does want more memory than the guest starts
with.  And I think he still wants to use the balloon to control it.

> 	- TELL_HOST flag that specifies guest will tell host before using pages
> 	  at the moment, listed here for completeness)
> 	  set by guest, used by host


> 	  flag that specifies that page returned to guest
> 	  is zeroed
> 	  set by host, used by guest

I think that's silly.  Under Linux the guest doesn't need to know it's
zeroed or not, it just frees the page.

> Each of the flags can be just a feature flag, and then
> if we want a mix of them host can create multiple
> balloon devices with different flags, and guest looks for best
> balloon for its purposes.
> Alternatively flags can be set and reported per page.
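
[Editor's sketch.] The "multiple balloon devices, guest picks the best one" idea could be expressed along these lines. The flag names WEAK_TARGET and TELL_HOST come from the quoted proposal; the bit positions, the `balloon_dev` structure and the selection rule (a device must offer every wanted flag, and among those the one with the fewest unrequested flags wins) are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions are hypothetical; the thread assigns none. */
#define BALLOON_F_WEAK_TARGET (1u << 0)
#define BALLOON_F_TELL_HOST   (1u << 1)

/* Hypothetical view of one balloon device as seen by the guest. */
struct balloon_dev {
    int id;
    uint32_t features;
};

/* Count set bits without relying on compiler builtins. */
static unsigned int count_bits(uint32_t v)
{
    unsigned int n = 0;

    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Return the index of the device that offers every flag in `wanted`
 * with the fewest extra flags, or -1 if no device qualifies.
 * Ties go to the earlier device. */
static int pick_balloon(const struct balloon_dev *devs, size_t n,
                        uint32_t wanted)
{
    int best = -1;
    unsigned int best_extra = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int extra;

        if ((devs[i].features & wanted) != wanted)
            continue; /* missing something the guest needs */
        extra = count_bits(devs[i].features & ~wanted);
        if (best < 0 || extra < best_extra) {
            best = (int)i;
            best_extra = extra;
        }
    }
    return best;
}
```

Setting and reporting flags per page, the alternative mentioned above, would avoid this selection step at the cost of a wider per-page record.
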
> A couple of other suggestions:
> - how to accommodate memory pressure in guest?
>   Let's add a field telling host how hard do we
>   want our memory back

That's very hard to define across guests.  Should we be using stats for
that instead?  In fact, should we allow gratuitous stats sending,
instead of a simple NEED_MEM flag?

> - assume you want to over-commit host and start
>   inflating balloon.
>   If low on memory it might be better for guest to
>   wait a bit before inflating.
>   Also, if host asks for a lot of memory a ton of
>   allocations will slow guest significantly.
>   But for guest to do the right thing we need host to tell guest what
>   are its memory and time constraints.
>   Let's add a field telling guest how hard do we
>   want it to give us memory (e.g. time limit)

We can't have intelligence at both ends, I think.  We've chosen a
host-led model, so we should stick to that unless someone has an
implementation which proves it's worth doing otherwise.
