OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio] New virtio balloon...


On Thu, Jan 30, 2014 at 07:34:30PM +1030, Rusty Russell wrote:
> Hi,
>
>         I tried to write a new balloon driver; it's completely untested
> (as I need to write the device).  The protocol is basically two vqs, one
> for the guest to send commands, one for the host to send commands.
>
> Some interesting things come out:
> 1) We do need to explicitly tell the host where the page is we want.
>    This is required for compaction, for example.
>
> 2) We need to be able to exceed the balloon target, especially for page
>    migration.  Thus there's no mechanism for the device to refuse to
>    give us the pages.

Admin should have a way to impose a mem limit on guest. However,
he/she should be able to change it in any direction (up and down) and
even increase it above limit established at guest boot (needed for
memory hotplug). On the other hand guest should not be able to allocate
more memory then it was requested by admin in a given time.

> 3) The device can offer multiple page sizes, but the driver can only
>    accept one.  I'm not sure if this is useful, as guests are either
>    huge page backed or not, and returning sub-pages isn't useful.

Hmmm... I suppose that even if guest is backed by huge pages then internaly
it uses standard page sizes (if not directed otherwise). So we have a
problem here because I do not know what to do if guest backed by 1 GiB
pages would like to inflate balloon with 4 KiB pages. Should we refuse that?

> Linux demo code follows.
>
> Cheers,
> Rusty.
>
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 9076635697bb..1dd45691b618 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,4 +1,4 @@
>  obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
>  obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
>  obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> -obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> +obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o virtio_balloon2.o
> diff --git a/drivers/virtio/virtio_balloon2.c b/drivers/virtio/virtio_balloon2.c
> new file mode 100644
> index 000000000000..93f13e7c561d
> --- /dev/null
> +++ b/drivers/virtio/virtio_balloon2.c
> +static const struct address_space_operations virtio_balloon_aops;

[...]

> +#ifdef CONFIG_BALLOON_COMPACTION
> +/*
> + * virtballoon_migratepage - perform the balloon page migration on behalf of
> + *			     a compation thread.     (called under page lock)
                               ^^^^^^^^^ -> compaction

> + * @mapping: the page->mapping which will be assigned to the new migrated page.
> + * @newpage: page that will replace the isolated page after migration finishes.
> + * @page   : the isolated (old) page that is about to be migrated to newpage.
> + * @mode   : compaction mode -- not used for balloon page migration.
> + *
> + * After a ballooned page gets isolated by compaction procedures, this is the
> + * function that performs the page migration on behalf of a compaction thread
> + * The page migration for virtio balloon is done in a simple swap fashion which
> + * follows these two macro steps:
> + *  1) insert newpage into vb->pages list and update the host about it;
> + *  2) update the host about the old page removed from vb->pages list;
> + *
> + * This function preforms the balloon page migration task.
> + * Called through balloon_mapping->a_ops->migratepage
> + */
> +static int virtballoon_migratepage(struct address_space *mapping,
> +		struct page *newpage, struct page *page, enum migrate_mode mode)
> +{
> +	struct balloon_dev_info *vb_dev_info = balloon_page_device(page);
> +	struct virtio_balloon *vb;
> +	unsigned long flags;
> +	int err;
> +
> +	BUG_ON(!vb_dev_info);
> +
> +	vb = vb_dev_info->balloon_device;
> +
> +	/*
> +	 * In order to avoid lock contention while migrating pages concurrently
> +	 * to leak_balloon() or fill_balloon() we just give up the balloon_lock
> +	 * this turn, as it is easier to retry the page migration later.
> +	 * This also prevents fill_balloon() getting stuck into a mutex
> +	 * recursion in the case it ends up triggering memory compaction
> +	 * while it is attempting to inflate the ballon.
> +	 */
> +	if (!mutex_trylock(&vb->lock))
> +		return -EAGAIN;
> +
> +	/* Try to get the page out of the balloon. */
> +	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
> +	vb->gcmd.get_pages.pages[0] = page_to_pfn(page) << PAGE_SHIFT;
> +	if (!send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[1]))) {
> +		err = -EIO;
> +		goto unlock;
> +	}
> +
> +	/* Now put newpage into balloon. */
> +	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
> +	vb->gcmd.give_pages.pages[0] = page_to_pfn(newpage) << PAGE_SHIFT;
> +	if (!send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[1]))) {
> +		/* We leak a page here, but only happens if balloon broken. */
> +		err = -EIO;
> +		goto unlock;
> +	}
> +
> +	spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
> +	balloon_page_insert(newpage, mapping, &vb_dev_info->pages);
> +	vb_dev_info->isolated_pages--;
> +	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
> +
> +	/*
> +	 * It's safe to delete page->lru here because this page is at
> +	 * an isolated migration list, and this step is expected to happen here
> +	 */
> +	balloon_page_delete(page);
> +	err = MIGRATEPAGE_BALLOON_SUCCESS;
> +
> +unlock:
> +	mutex_unlock(&vb->lock);
> +	return err;
> +}
> +
> +/* define the balloon_mapping->a_ops callback to allow balloon page migration */
> +static const struct address_space_operations virtio_balloon_aops = {
> +			.migratepage = virtballoon_migratepage,
> +};
> +#endif /* CONFIG_BALLOON_COMPACTION */

Do we really need this feature on guest?

[...]

> +/* This means the balloon can go negative (ie. add memory to system) */
> +#define VIRTIO_BALLOON_F_EXTRA_MEM	0
> +
> +struct virtio_balloon_config_space {
> +	/* Set by device: bits indicate what page sizes supported. */
> +	__le64 pagesizes;
> +	/* Set by driver: only a single bit is set! */
> +	__le64 page_size;
> +
> +	/* These set by device if VIRTIO_BALLOON_F_EXTRA_MEM. */
> +	__le64 extra_mem_start;
> +	__le64 extra_mem_end;

This cannot be a part of config space. Guest should be able to hotplug
memory many times. Hence it should be a part of reply from host. Additionally,
we should remember that memory is hotplugged in chunks known as section size.
They are usually quite big and architecture depended (e.g. IIRC it is 128 MiB
on x86_64). So maybe guest should tell host about its supported section size
in config space. However, there should not be a requirement that target must
be equal to multiple of section size in case of memory hotplug. It should be
set as needed and balloon driver should reserve relevant memory region told
by host (it should be rounded by host up to nearest multiple of section size)
and later back relevant pages with PFNs up to a given target.

Daniel


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]