Subject: Re: [virtio] New virtio balloon...


Daniel Kiper <daniel.kiper@oracle.com> writes:
> On Tue, Feb 04, 2014 at 12:48:27PM +1030, Rusty Russell wrote:
>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>> > On Thu, Jan 30, 2014 at 07:34:30PM +1030, Rusty Russell wrote:
>> >> Hi,
>> >>
>> >>         I tried to write a new balloon driver; it's completely untested
>> >> (as I need to write the device).  The protocol is basically two vqs, one
>> >> for the guest to send commands, one for the host to send commands.
>> >>
>> >> Some interesting things come out:
>> >> 1) We do need to explicitly tell the host where the page we want is.
>> >>    This is required for compaction, for example.
>> >>
>> >> 2) We need to be able to exceed the balloon target, especially for page
>> >>    migration.  Thus there's no mechanism for the device to refuse to
>> >>    give us the pages.
>> >
>> > The admin should have a way to impose a memory limit on the guest.
>> > However, he/she should be able to change it in either direction (up and
>> > down) and even increase it above the limit established at guest boot
>> > (needed for memory hotplug). On the other hand, the guest should not be
>> > able to allocate more memory than the admin has granted it at a given time.
>>
>> Well, now that we have VIRTIO_BALLOON_GCMD_EXCHANGE_PAGES, one problem with a
>> strict limit is gone.  We still have the problem of a race of the host
>> lowering the target while the guest makes a request for more pages, but
>> perhaps we just allow a single such request?
>
> I do not see much of a problem here if the balloon is host-led. The host
> just does its job. The guest just gently asks, but the host is not forced
> to fulfill requests. However, we should consider direct rejects, which I
> mentioned in earlier emails.

It's fairly easy to implement rejecting VIRTIO_BALLOON_GCMD_GET_PAGES: we
could just have the device write back into the array only the pages it is
actually giving us.  The virtio protocol already tells us how many bytes
the device has written.
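
A rough sketch of how that could look from the driver side (the layout and
names below are made up for illustration; they are not in the draft):

	/* Guest -> device: "please give me these pages". */
	struct virtio_balloon_gcmd_get_pages {
		__le32 cmd;		/* VIRTIO_BALLOON_GCMD_GET_PAGES */
		__le32 reserved;
		__le64 pfns[];		/* requested PFNs; device-writable */
	};

	/*
	 * The device overwrites pfns[] with only the PFNs it actually
	 * grants.  The used length it reports for the buffer then tells
	 * the driver how many entries are valid; anything it left out is
	 * an implicit reject.
	 */
	granted = (used_len - offsetof(struct virtio_balloon_gcmd_get_pages, pfns))
		  / sizeof(__le64);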

>> >> 3) The device can offer multiple page sizes, but the driver can only
>> >>    accept one.  I'm not sure if this is useful, as guests are either
>> >>    huge page backed or not, and returning sub-pages isn't useful.
>> >
>> > Hmmm... I suppose that even if a guest is backed by huge pages, it
>> > internally uses standard page sizes (if not directed otherwise). So we
>> > have a problem here because I do not know what to do if a guest backed
>> > by 1 GiB pages would like to inflate the balloon with 4 KiB pages.
>> > Should we refuse that?
>>
>> Two choices: offer 1G pages to the guest.  If it can't handle that, it's
>> pretty useless anyway (and will fail initialization).  Otherwise, offer
>> both 1G and 4k pages, and it might accept 4k pages (you'd do this if you
>> have the ability to split 1G pages into 4k pages, I guess).
>
> Both make sense. However, the latter, if it is possible, could cause a
> performance hit. Additionally, it looks like in the Linux case hugepages
> are created at boot time and cannot be split into smaller chunks. Am I
> missing something?

Transparent huge pages will be assigned and split on demand, though I'm
completely ignorant of how that works with KVM.

>> >> +/* define the balloon_mapping->a_ops callback to allow balloon page migration */
>> >> +static const struct address_space_operations virtio_balloon_aops = {
>> >> +			.migratepage = virtballoon_migratepage,
>> >> +};
>> >> +#endif /* CONFIG_BALLOON_COMPACTION */
>> >
>> > Do we really need this feature on the guest?
>>
>> Well, it's really a Linux-specific thing, but yes, if you can't migrate
>> pages then page compaction really suffers.  Rafael Aquini
>> <aquini@redhat.com> added this.
>
> Why do we need high-order pages in guests?

Huge pages in the guest are as much of a win as they are in the host.
Fortunately we don't have drivers requiring huge pages, but userspace
will want them.

>> >> +/* This means the balloon can go negative (ie. add memory to system) */
>> >> +#define VIRTIO_BALLOON_F_EXTRA_MEM	0
>> >> +
>> >> +struct virtio_balloon_config_space {
>> >> +	/* Set by device: bits indicate what page sizes supported. */
>> >> +	__le64 pagesizes;
>> >> +	/* Set by driver: only a single bit is set! */
>> >> +	__le64 page_size;
>> >> +
>> >> +	/* These set by device if VIRTIO_BALLOON_F_EXTRA_MEM. */
>> >> +	__le64 extra_mem_start;
>> >> +	__le64 extra_mem_end;
>> >
>> > This cannot be part of the config space. The guest should be able to
>> > hotplug memory many times. Hence it should be part of a reply from the host.
>>
>> This was to specify the upper limits of where the extra mem is.  It was
>
> If so, then it should be the highest available address for a given guest architecture.
>
>> intended to represent one or more section sizes.
>
> The section size is needed by the host only when we assume that the host
> should establish the hotplugged memory placement. Otherwise the host does
> not need it.

Indeed, but I thought we wanted the host to specify the region which
could be used.

>> > Additionally, we should remember that memory is hotplugged in chunks
>> > known as sections. They are usually quite big and architecture-dependent
>> > (e.g. IIRC it is 128 MiB on x86_64). So maybe the guest should tell the
>> > host about its supported section size in config space. However, there
>> > should not be a requirement that the target must be a multiple of the
>> > section size in the memory hotplug case. It should be set as needed, and
>> > the balloon driver should reserve the relevant memory region indicated by
>> > the host (rounded up by the host to the nearest multiple of the section
>> > size) and later back the relevant pages with PFNs up to a given target.
>>
>> We could either have a special request (VIRTIO_BALLOON_GCMD_NEW_MEM)
>> where the guest specifies where it wants another chunk of memory.  Then
>> after that, it can ask for those pages' PFNs in
>> VIRTIO_BALLOON_GCMD_GET_PAGES.
>>
>> Or we could simply allow a guest to request (if the
>> VIRTIO_BALLOON_F_EXTRA_MEM feature is negotiated) any PFN it wants, and
>> let it handle its sections itself.
>>
>> The latter is simpler, but is it sufficient?
>
> Right, but we have some issues with the latter solution in Xen. Currently
> I think that the host should establish the region for hotplugged memory
> because it constructed the memory map at boot time. However, I am not sure
> that it always has knowledge about e.g. all IO regions and similar stuff.
> On the other hand, the guest may not have access to the boot memory map
> after boot. Hmmm... I still think that the host should establish the
> hotplugged memory placement. Am I wrong?

I don't know.  I tend to agree that it makes sense for the host to
establish the hotplug memory region.  I assume it would set this up in
advance, and then the guest would request PFNs in that range.
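
Roughly, the driver-side check then becomes trivial (this assumes
extra_mem_start/end in the config space above are byte addresses; the
helper itself is purely illustrative):

	/* Is this PFN inside the host-provided hotplug window? */
	static bool pfn_in_extra_mem(const struct virtio_balloon_config_space *c,
				     u64 pfn)
	{
		u64 addr = pfn << PAGE_SHIFT;

		return addr >= le64_to_cpu(c->extra_mem_start) &&
		       addr <  le64_to_cpu(c->extra_mem_end);
	}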

I'm not sure if we will know until implementations exist.  This will not
be established before then, so perhaps we should add this as an extra
feature after v1.0?

Cheers,
Rusty.


