virtio-dev message


Subject: Re: [virtio] New virtio balloon...


Also copying virtio-dev, since this is clearly implementation ...

On Thu, Jan 30, 2014 at 07:34:30PM +1030, Rusty Russell wrote:
> Hi,
> 
>         I tried to write a new balloon driver; it's completely untested
> (as I need to write the device).  The protocol is basically two vqs, one
> for the guest to send commands, one for the host to send commands.
> 
> Some interesting things come out:
> 1) We do need to explicitly tell the host where the page is we want.
>    This is required for compaction, for example.
> 
> 2) We need to be able to exceed the balloon target, especially for page
>    migration.  Thus there's no mechanism for the device to refuse to
>    give us the pages.
> 
> 3) The device can offer multiple page sizes, but the driver can only
>    accept one.  I'm not sure if this is useful, as guests are either
>    huge page backed or not, and returning sub-pages isn't useful.
> 
> Linux demo code follows.
> 
> Cheers,
> Rusty.
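
For readers following along: the message layout used by the quoted patch (an le64 command type followed by an array of le64 page addresses, each address being the PFN shifted by the page shift) can be sketched in userspace C as below. This is only an illustration under stated assumptions: the command values and the 256-entry limit are taken from the patch, while `SKETCH_PAGE_SHIFT`, `put_le64`, `get_le64` and `encode_page_cmd` are names invented for this sketch and are not part of any driver or spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command values and array size as they appear in the quoted patch. */
#define GCMD_GET_PAGES   0u
#define GCMD_GIVE_PAGES  1u
#define GCMD_MAX_PAGES   256u

/* Assumption for this sketch only: 4 KiB guest pages. */
#define SKETCH_PAGE_SHIFT 12

/* Store v at p in little-endian order, whatever the CPU byte order is. */
static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load a little-endian 64-bit value from p. */
static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/*
 * Encode "type, then n page addresses" into buf.
 * Returns the message length in bytes, or 0 if the request is empty,
 * exceeds the per-command limit, or does not fit in buf.
 */
static size_t encode_page_cmd(uint8_t *buf, size_t buflen, uint64_t type,
			      const uint64_t *pfns, size_t n)
{
	size_t len = 8 + 8 * n;

	assert(buf != NULL && pfns != NULL);
	if (n == 0 || n > GCMD_MAX_PAGES || len > buflen)
		return 0;

	put_le64(buf, type);
	for (size_t i = 0; i < n; i++)
		put_le64(buf + 8 + 8 * i, pfns[i] << SKETCH_PAGE_SHIFT);
	return len;
}
```

The returned length plays the role of `offsetof(struct gcmd_give_pages, pages[i])` in the patch: only the entries actually filled in are sent.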

More comments:
	- for projects like auto-ballooning that Luiz works on,
	  it's not nice that to swap page 1 for page 2
	  you have to inflate and then deflate.
	  Besides the overhead, this confuses the host:
	  imagine you tell QEMU to increase the target and
	  meanwhile the guest inflates temporarily.
	  QEMU thinks "okay, done", and then you suddenly deflate.


	- what's the status of a page returned from the balloon?
	  Is it zeroed, or can it have old data in there?
	  I think in practice Linux will sometimes map in a zero page,
	  so the guest can save cycles and avoid zeroing it out.
	  I think we should tell this to the guest when returning
	  pages.


	- I am guessing EXTRA_MEM is for uses like the ones proposed by
	  Frank Swiderski from Google that inflate/deflate the balloon
	  whenever the guest wants (look for "Add a page cache-backed balloon
	  device driver").

	  This is useful, but we need to distinguish pages
	  like this from regular inflate.
	  It's not just a counter, and the host needs a way to know
	  that its target is reached.


	- do we even want to allow the guest not to tell the host when it wants
	  to reuse the page?
	  If yes, I think this should be per-page somehow: when the balloon
	  is inflated, the guest should tell the host whether it
	  expects to use this page.
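
The first point above (the host being confused by a temporary inflate during a page swap) can be illustrated with a toy model of the host's bookkeeping. This is a sketch only: `struct host_view` and the `host_*` helpers are hypothetical names, and the single "swap" operation is an assumed command that the quoted patch does not define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of what the host tracks for one balloon (hypothetical). */
struct host_view {
	uint64_t target;     /* pages the host asked for */
	uint64_t in_balloon; /* pages the host believes it holds */
};

/* The host considers the request done once it holds enough pages. */
static bool target_reached(const struct host_view *h)
{
	return h->in_balloon >= h->target;
}

/* Guest gives n pages (inflate). */
static void host_give(struct host_view *h, uint64_t n)
{
	h->in_balloon += n;
}

/* Guest takes n pages back (deflate). */
static void host_get(struct host_view *h, uint64_t n)
{
	assert(n <= h->in_balloon);
	h->in_balloon -= n;
}

/*
 * Assumed single-step swap: one page leaves and another enters, so the
 * host never observes a transient change in balloon size.
 */
static void host_swap(struct host_view *h)
{
	assert(h->in_balloon > 0);
	/* in_balloon intentionally unchanged */
}
```

With a target of 101 and 100 pages held, a swap expressed as give-then-get makes `target_reached()` flip to true and back to false, which is the "okay, done" followed by a sudden deflate described above; with `host_swap()` it stays false throughout.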


So I think we should accommodate these uses, and so we want the following flags:

	- WEAK_TARGET (that's EXTRA_MEM, but I think done in a better way)
	  flag that specifies pages do not count against the target and
	  can be taken out of the balloon.
	  EXTRA_MEM suggests there's an upper limit on balloon size,
	  but IMHO that's just extra work for the host: the host does not care,
	  I think, give it as much as you want.
	  set by guest, used by host

	- TELL_HOST flag that specifies guest will tell host before using pages
	  (that's VIRTIO_BALLOON_F_MUST_TELL_HOST
	  at the moment, listed here for completeness)
	  set by guest, used by host

	- ZEROED
	  flag that specifies that page returned to guest
	  is zeroed
	  set by host, used by guest



Each of the flags can be just a feature flag, and then
if we want a mix of them the host can create multiple
balloon devices with different flags, and the guest looks for the best
balloon for its purposes.

Alternatively flags can be set and reported per page.
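
One way the per-page variant could look, as a sketch only: since each entry in the quoted patch is a page address (PFN shifted left by the page shift), its low bits are zero and could carry flags. The bit assignments, the `PAGE_F_*` names and the helper functions below are all hypothetical and appear nowhere in the patch; a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch only: 4 KiB pages, so 12 free low bits. */
#define SKETCH_PAGE_SHIFT  12
#define PAGE_F_MASK        ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)

/* Hypothetical bit assignments for the three flags discussed above. */
#define PAGE_F_WEAK_TARGET (UINT64_C(1) << 0) /* guest -> host */
#define PAGE_F_TELL_HOST   (UINT64_C(1) << 1) /* guest -> host */
#define PAGE_F_ZEROED      (UINT64_C(1) << 2) /* host -> guest */

/* Combine a PFN and per-page flags into one 64-bit entry. */
static uint64_t page_entry_pack(uint64_t pfn, uint64_t flags)
{
	assert((flags & ~PAGE_F_MASK) == 0);
	return (pfn << SKETCH_PAGE_SHIFT) | flags;
}

static uint64_t page_entry_pfn(uint64_t entry)
{
	return entry >> SKETCH_PAGE_SHIFT;
}

static uint64_t page_entry_flags(uint64_t entry)
{
	return entry & PAGE_F_MASK;
}

/* WEAK_TARGET pages do not count against the balloon target. */
static bool page_counts_against_target(uint64_t entry)
{
	return !(page_entry_flags(entry) & PAGE_F_WEAK_TARGET);
}
```

The host would then sum only the entries for which `page_counts_against_target()` is true when deciding whether its target is reached. The feature-flag variant needs none of this, at the cost of one device per flag combination.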


A couple of other suggestions:

- how to accommodate memory pressure in the guest?
  Let's add a field telling the host how badly we
  want our memory back.

- assume you want to over-commit the host and start
  inflating the balloon.
  If low on memory, it might be better for the guest to
  wait a bit before inflating.
  Also, if the host asks for a lot of memory, a ton of
  allocations will slow the guest significantly.
  But for the guest to do the right thing, we need the host to tell
  the guest what its memory and time constraints are.
  Let's add a field telling the guest how badly we
  want it to give us memory (e.g. a time limit).
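
As a rough illustration of how a guest might use such a time limit: it could spread the inflation evenly over the remaining time instead of allocating everything at once. This is a sketch under assumptions, not a proposal for the wire format: `inflate_batch`, its parameters and the idea of fixed-length rounds are invented here; only the 256-page per-command limit comes from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Per-command page limit, from the array size in the quoted patch. */
#define GCMD_MAX_PAGES 256u

/*
 * How many pages to give in this round so that `remaining` pages are
 * handed over within `ms_left` milliseconds, working in rounds of
 * `round_ms` milliseconds. If the deadline has passed or is closer than
 * one round, give as much as a single command allows.
 * Note: remaining + rounds must not overflow 64 bits.
 */
static uint64_t inflate_batch(uint64_t remaining, uint64_t ms_left,
			      uint64_t round_ms)
{
	uint64_t rounds, batch;

	assert(round_ms > 0);
	if (remaining == 0)
		return 0;

	rounds = ms_left / round_ms;
	if (rounds == 0)
		rounds = 1;

	/* Ceiling division so the last round does not fall short. */
	batch = (remaining + rounds - 1) / rounds;
	return batch < GCMD_MAX_PAGES ? batch : GCMD_MAX_PAGES;
}
```

For example, 1000 pages with 10 seconds left and 200 ms rounds gives 20 pages per round, which keeps the allocation rate low while still meeting the limit.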
  


> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 9076635697bb..1dd45691b618 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,4 +1,4 @@
>  obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
>  obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
>  obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> -obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> +obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o virtio_balloon2.o
> diff --git a/drivers/virtio/virtio_balloon2.c b/drivers/virtio/virtio_balloon2.c
> new file mode 100644
> index 000000000000..93f13e7c561d
> --- /dev/null
> +++ b/drivers/virtio/virtio_balloon2.c
> @@ -0,0 +1,566 @@
> +/*
> + * Virtio balloon implementation, inspired by Dor Laor and Marcelo
> + * Tosatti's implementations.
> + *
> + *  Copyright 2008, 2014 Rusty Russell IBM Corporation
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, write to the Free Software
> + *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> + */
> +
> +#include <linux/virtio.h>
> +#include <linux/virtio_balloon.h>
> +#include <linux/swap.h>
> +#include <linux/kthread.h>
> +#include <linux/freezer.h>
> +#include <linux/delay.h>
> +#include <linux/slab.h>
> +#include <linux/module.h>
> +#include <linux/balloon_compaction.h>
> +
> +struct gcmd_get_pages {
> +	__le64 type; /* VIRTIO_BALLOON_GCMD_GET_PAGES */
> +	__le64 pages[256];
> +};
> +
> +struct gcmd_give_pages {
> +	__le64 type; /* VIRTIO_BALLOON_GCMD_GIVE_PAGES */
> +	__le64 pages[256];
> +};
> +
> +struct gcmd_need_mem {
> +	__le64 type; /* VIRTIO_BALLOON_GCMD_NEEDMEM */
> +};
> +
> +struct gcmd_stats_reply {
> +	__le64 type; /* VIRTIO_BALLOON_GCMD_STATS_REPLY */
> +	struct virtio_balloon_statistic stats[VIRTIO_BALLOON_S_NR];
> +};
> +
> +struct hcmd_set_balloon {
> +	__le64 type; /* VIRTIO_BALLOON_HCMD_SET_BALLOON */
> +	__le64 target;
> +};
> +
> +struct hcmd_get_stats {
> +	__le64 type; /* VIRTIO_BALLOON_HCMD_GET_STATS */
> +};
> +
> +struct virtio_balloon {
> +	/* Protects contents of entire structure. */
> +	struct mutex lock;
> +
> +	struct virtio_device *vdev;
> +	struct virtqueue *gcmd_vq, *hcmd_vq;
> +
> +	/* The thread servicing the balloon. */
> +	struct task_struct *thread;
> +
> +	/* For interrupt/suspend to wake balloon thread. */
> +	wait_queue_head_t wait;
> +
> +	/* How many pages are we supposed to have in balloon? */
> +	s64 target;
> +
> +	/* How many do we have in the balloon? */
> +	u64 num_pages;
> +
> +	/* This reminds me of Eeyore. */
> +	bool broken;
> +
> +	/*
> +	 * The pages we've told the Host we're not using are enqueued
> +	 * at vb_dev_info->pages list.
> +	 */
> +	struct balloon_dev_info *vb_dev_info;
> +
> +	/* To avoid kmalloc, we use single hcmd and gcmd buffers. */
> +	union gcmd {
> +		__le64 type;
> +		struct gcmd_get_pages get_pages;
> +		struct gcmd_give_pages give_pages;
> +		struct gcmd_need_mem need_mem;
> +		struct gcmd_stats_reply stats_reply;
> +	} gcmd;
> +
> +	union hcmd {
> +		__le64 type;
> +		struct hcmd_set_balloon set_balloon;
> +		struct hcmd_get_stats get_stats;
> +	} hcmd;
> +};
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_MEMBALLOON, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static void wake_balloon(struct virtqueue *vq)
> +{
> +	struct virtio_balloon *vb = vq->vdev->priv;
> +
> +	wake_up(&vb->wait);
> +}
> +
> +/* Command is in vb->gcmd, lock is held. */
> +static bool send_gcmd(struct virtio_balloon *vb, size_t len)
> +{
> +	struct scatterlist sg;
> +
> +	BUG_ON(len > sizeof(vb->gcmd));
> +	sg_init_one(&sg, &vb->gcmd, len);
> +
> +	/*
> +	 * We should always be able to add one buffer to an empty queue.
> +	 * If not, it's a broken device.
> +	 */
> +	if (virtqueue_add_outbuf(vb->gcmd_vq, &sg, 1, vb, GFP_KERNEL) != 0
> +	    || virtqueue_kick(vb->gcmd_vq) != 0) {
> +		vb->broken = true;
> +		return false;
> +	}
> +
> +	/* When host has read buffer, this completes via wake_balloon */
> +	wait_event(vb->wait,
> +		   virtqueue_get_buf(vb->gcmd_vq, &len)
> +		   || (vb->broken = virtqueue_is_broken(vb->gcmd_vq)));
> +	return !vb->broken;
> +}
> +
> +static void give_to_balloon(struct virtio_balloon *vb, u64 num)
> +{
> +	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
> +	u64 i;
> +
> +	/* We can only do one array worth at a time. */
> +	num = min_t(u64, num, ARRAY_SIZE(vb->gcmd.give_pages.pages));
> +
> +	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
> +
> +	for (i = 0; i < num; i++) {
> +		struct page *page = balloon_page_enqueue(vb_dev_info);
> +
> +		if (!page) {
> +			dev_info_ratelimited(&vb->vdev->dev,
> +					     "Out of puff! Can't get page\n");
> +			/* Sleep for at least 1/5 of a second before retry. */
> +			msleep(200);
> +			break;
> +		}
> +
> +		vb->gcmd.give_pages.pages[i] = page_to_pfn(page) << PAGE_SHIFT;
> +		vb->num_pages++;
> +		adjust_managed_page_count(page, -1);
> +	}
> +
> +	/* Did we get any? */
> +	if (i)
> +		send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[i]));
> +}
> +
> +static void take_from_balloon(struct virtio_balloon *vb, u64 num)
> +{
> +	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
> +	size_t i;
> +
> +	/* We can only do one array worth at a time. */
> +	num = min_t(u64, num, ARRAY_SIZE(vb->gcmd.get_pages.pages));
> +
> +	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
> +
> +	for (i = 0; i < num; i++) {
> +		struct page *page = balloon_page_dequeue(vb_dev_info);
> +
> +		/* In case we ran out of pages (compaction) */
> +		if (!page)
> +			break;
> +
> +		vb->gcmd.get_pages.pages[i] = page_to_pfn(page) << PAGE_SHIFT;
> +		vb->num_pages--;
> +	}
> +	num = i;
> +	if (num)
> +		send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[num]));
> +
> +	/* Now release those pages. */
> +	for (i = 0; i < num; i++) {
> +		struct page *page;
> +
> +		page = pfn_to_page(vb->gcmd.get_pages.pages[i] >> PAGE_SHIFT);
> +		balloon_page_free(page);
> +		adjust_managed_page_count(page, 1);
> +	}
> +	/* vb->lock stays held here: the caller takes and drops it. */
> +}
> +
> +static inline void set_stat(struct gcmd_stats_reply *stats, int idx,
> +			    u64 tag, u64 val)
> +{
> +	BUG_ON(idx >= ARRAY_SIZE(stats->stats));
> +	stats->stats[idx].tag = cpu_to_le64(tag);
> +	stats->stats[idx].val = cpu_to_le64(val);
> +}
> +
> +#define pages_to_bytes(x) ((u64)(x) << PAGE_SHIFT)
> +
> +static void get_stats(struct gcmd_stats_reply *stats)
> +{
> +	unsigned long events[NR_VM_EVENT_ITEMS];
> +	struct sysinfo i;
> +	int idx = 0;
> +
> +	all_vm_events(events);
> +	si_meminfo(&i);
> +
> +	stats->type = cpu_to_le64(VIRTIO_BALLOON_GCMD_STATS_REPLY);
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_SWAP_IN,
> +		 pages_to_bytes(events[PSWPIN]));
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
> +		 pages_to_bytes(events[PSWPOUT]));
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MAJFLT,
> +		 events[PGMAJFAULT]);
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MINFLT,
> +		 events[PGFAULT]);
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MEMFREE,
> +		 pages_to_bytes(i.freeram));
> +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MEMTOT,
> +		 pages_to_bytes(i.totalram));
> +}
> +
> +static bool move_towards_target(struct virtio_balloon *vb)
> +{
> +	bool moved = false;
> +
> +	if (vb->broken)
> +		return false;
> +
> +	mutex_lock(&vb->lock);
> +	if (vb->num_pages < vb->target) {
> +		give_to_balloon(vb, vb->target - vb->num_pages);
> +		moved = true;
> +	} else if (vb->num_pages > vb->target) {
> +		take_from_balloon(vb, vb->num_pages - vb->target);
> +		moved = true;
> +	}
> +	mutex_unlock(&vb->lock);
> +	return moved;
> +}
> +
> +static bool process_hcmd(struct virtio_balloon *vb)
> +{
> +	union hcmd *hcmd = NULL;
> +	unsigned int cmdlen;
> +	struct scatterlist sg;
> +
> +	if (vb->broken)
> +		return false;
> +
> +	mutex_lock(&vb->lock);
> +	hcmd = virtqueue_get_buf(vb->hcmd_vq, &cmdlen);
> +	if (!hcmd) {
> +		mutex_unlock(&vb->lock);
> +		return false;
> +	}
> +
> +	switch (hcmd->type) {
> +	case cpu_to_le64(VIRTIO_BALLOON_HCMD_SET_BALLOON):
> +		vb->target = le64_to_cpu(hcmd->set_balloon.target);
> +		break;
> +	case cpu_to_le64(VIRTIO_BALLOON_HCMD_GET_STATS):
> +		get_stats(&vb->gcmd.stats_reply);
> +		send_gcmd(vb, sizeof(vb->gcmd.stats_reply));
> +		break;
> +	default:
> +		dev_err_ratelimited(&vb->vdev->dev, "Unknown hcmd %llu\n",
> +				    le64_to_cpu(hcmd->type));
> +		break;
> +	}
> +
> +	/* Re-queue the hcmd for next time. */
> +	sg_init_one(&sg, &vb->hcmd, sizeof(vb->hcmd));
> +	virtqueue_add_inbuf(vb->hcmd_vq, &sg, 1, vb, GFP_KERNEL);
> +
> +	mutex_unlock(&vb->lock);
> +	return true;
> +}
> +
> +static int balloon(void *_vballoon)
> +{
> +	struct virtio_balloon *vb = _vballoon;
> +
> +	set_freezable();
> +	while (!kthread_should_stop()) {
> +		try_to_freeze();
> +
> +		wait_event_interruptible(vb->wait,
> +					 kthread_should_stop()
> +					 || freezing(current)
> +					 || process_hcmd(vb)
> +					 || move_towards_target(vb));
> +	}
> +	return 0;
> +}
> +
> +static int init_vqs(struct virtio_balloon *vb)
> +{
> +	struct virtqueue *vqs[2];
> +	vq_callback_t *callbacks[] = { wake_balloon, wake_balloon };
> +	const char *names[] = { "gcmd", "hcmd" };
> +	struct scatterlist sg;
> +	int err;
> +
> +	err = vb->vdev->config->find_vqs(vb->vdev, 2, vqs, callbacks, names);
> +	if (err)
> +		return err;
> +
> +	vb->gcmd_vq = vqs[0];
> +	vb->hcmd_vq = vqs[1];
> +
> +	/*
> +	 * Prime this virtqueue with one buffer so the hypervisor can
> +	 * use it to signal us later (it can't be broken yet!).
> +	 */
> +	sg_init_one(&sg, &vb->hcmd, sizeof(vb->hcmd));
> +	if (virtqueue_add_inbuf(vb->hcmd_vq, &sg, 1, vb, GFP_KERNEL) < 0)
> +		BUG();
> +	virtqueue_kick(vb->hcmd_vq);
> +
> +	return 0;
> +}
> +
> +static const struct address_space_operations virtio_balloon_aops;
> +#ifdef CONFIG_BALLOON_COMPACTION
> +/*
> + * virtballoon_migratepage - perform the balloon page migration on behalf of
> + *			     a compaction thread.     (called under page lock)
> + * @mapping: the page->mapping which will be assigned to the new migrated page.
> + * @newpage: page that will replace the isolated page after migration finishes.
> + * @page   : the isolated (old) page that is about to be migrated to newpage.
> + * @mode   : compaction mode -- not used for balloon page migration.
> + *
> + * After a ballooned page gets isolated by compaction procedures, this is the
> + * function that performs the page migration on behalf of a compaction thread
> + * The page migration for virtio balloon is done in a simple swap fashion which
> + * follows these two macro steps:
> + *  1) insert newpage into vb->pages list and update the host about it;
> + *  2) update the host about the old page removed from vb->pages list;
> + *
> + * This function performs the balloon page migration task.
> + * Called through balloon_mapping->a_ops->migratepage
> + */
> +static int virtballoon_migratepage(struct address_space *mapping,
> +		struct page *newpage, struct page *page, enum migrate_mode mode)
> +{
> +	struct balloon_dev_info *vb_dev_info = balloon_page_device(page);
> +	struct virtio_balloon *vb;
> +	unsigned long flags;
> +	int err;
> +
> +	BUG_ON(!vb_dev_info);
> +
> +	vb = vb_dev_info->balloon_device;
> +
> +	/*
> +	 * In order to avoid lock contention while migrating pages concurrently
> +	 * to leak_balloon() or fill_balloon() we just give up the balloon_lock
> +	 * this turn, as it is easier to retry the page migration later.
> +	 * This also prevents fill_balloon() getting stuck into a mutex
> +	 * recursion in the case it ends up triggering memory compaction
> +	 * while it is attempting to inflate the balloon.
> +	 */
> +	if (!mutex_trylock(&vb->lock))
> +		return -EAGAIN;
> +
> +	/* Try to get the page out of the balloon. */
> +	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
> +	vb->gcmd.get_pages.pages[0] = page_to_pfn(page) << PAGE_SHIFT;
> +	if (!send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[1]))) {
> +		err = -EIO;
> +		goto unlock;
> +	}
> +
> +	/* Now put newpage into balloon. */
> +	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
> +	vb->gcmd.give_pages.pages[0] = page_to_pfn(newpage) << PAGE_SHIFT;
> +	if (!send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[1]))) {
> +		/* We leak a page here, but only happens if balloon broken. */
> +		err = -EIO;
> +		goto unlock;
> +	}
> +
> +	spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
> +	balloon_page_insert(newpage, mapping, &vb_dev_info->pages);
> +	vb_dev_info->isolated_pages--;
> +	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
> +
> +	/*
> +	 * It's safe to delete page->lru here because this page is at
> +	 * an isolated migration list, and this step is expected to happen here
> +	 */
> +	balloon_page_delete(page);
> +	err = MIGRATEPAGE_BALLOON_SUCCESS;
> +
> +unlock:
> +	mutex_unlock(&vb->lock);
> +	return err;
> +}
> +
> +/* define the balloon_mapping->a_ops callback to allow balloon page migration */
> +static const struct address_space_operations virtio_balloon_aops = {
> +			.migratepage = virtballoon_migratepage,
> +};
> +#endif /* CONFIG_BALLOON_COMPACTION */
> +
> +static int virtballoon_probe(struct virtio_device *vdev)
> +{
> +	struct virtio_balloon *vb;
> +	struct address_space *vb_mapping;
> +	struct balloon_dev_info *vb_devinfo;
> +	__le64 v;
> +	int err;
> +
> +	virtio_cread(vdev, struct virtio_balloon_config_space, pagesizes, &v);
> +	/* FIXME: Support large pages. */
> +	if (!(le64_to_cpu(v) & PAGE_SIZE)) {
> +		dev_warn(&vdev->dev, "Unacceptable pagesize %llu\n",
> +			 (unsigned long long)le64_to_cpu(v));
> +		err = -EINVAL;
> +		goto out;
> +	}
> +	v = cpu_to_le64(PAGE_SIZE);
> +	virtio_cwrite(vdev, struct virtio_balloon_config_space, page_size, &v);
> +
> +	vdev->priv = vb = kzalloc(sizeof(*vb), GFP_KERNEL);
> +	if (!vb) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
> +	vb->target = 0;
> +	vb->num_pages = 0;
> +	mutex_init(&vb->lock);
> +	init_waitqueue_head(&vb->wait);
> +	vb->vdev = vdev;
> +
> +	vb_devinfo = balloon_devinfo_alloc(vb);
> +	if (IS_ERR(vb_devinfo)) {
> +		err = PTR_ERR(vb_devinfo);
> +		goto out_free_vb;
> +	}
> +
> +	vb_mapping = balloon_mapping_alloc(vb_devinfo,
> +					   (balloon_compaction_check()) ?
> +					   &virtio_balloon_aops : NULL);
> +	if (IS_ERR(vb_mapping)) {
> +		/*
> +		 * IS_ERR(vb_mapping) && PTR_ERR(vb_mapping) == -EOPNOTSUPP
> +		 * This means !CONFIG_BALLOON_COMPACTION, otherwise we get off.
> +		 */
> +		err = PTR_ERR(vb_mapping);
> +		if (err != -EOPNOTSUPP)
> +			goto out_free_vb_devinfo;
> +	}
> +
> +	vb->vb_dev_info = vb_devinfo;
> +
> +	err = init_vqs(vb);
> +	if (err)
> +		goto out_free_vb_mapping;
> +
> +	vb->thread = kthread_run(balloon, vb, "vballoon");
> +	if (IS_ERR(vb->thread)) {
> +		err = PTR_ERR(vb->thread);
> +		goto out_del_vqs;
> +	}
> +
> +	return 0;
> +
> +out_del_vqs:
> +	vdev->config->del_vqs(vdev);
> +out_free_vb_mapping:
> +	balloon_mapping_free(vb_mapping);
> +out_free_vb_devinfo:
> +	balloon_devinfo_free(vb_devinfo);
> +out_free_vb:
> +	kfree(vb);
> +out:
> +	return err;
> +}
> +
> +/* FIXME: Leave pages alone during suspend, rather than taking them
> + * all back! */
> +static void remove_common(struct virtio_balloon *vb)
> +{
> +	/* There might be pages left in the balloon: free them. */
> +	while (vb->num_pages)
> +		take_from_balloon(vb, vb->num_pages);
> +
> +	/* Now we reset the device so we can clean up the queues. */
> +	vb->vdev->config->reset(vb->vdev);
> +	vb->vdev->config->del_vqs(vb->vdev);
> +}
> +
> +static void virtballoon_remove(struct virtio_device *vdev)
> +{
> +	struct virtio_balloon *vb = vdev->priv;
> +
> +	kthread_stop(vb->thread);
> +	remove_common(vb);
> +	balloon_mapping_free(vb->vb_dev_info->mapping);
> +	balloon_devinfo_free(vb->vb_dev_info);
> +	kfree(vb);
> +}
> +
> +#ifdef CONFIG_PM_SLEEP
> +static int virtballoon_freeze(struct virtio_device *vdev)
> +{
> +	struct virtio_balloon *vb = vdev->priv;
> +
> +	/*
> +	 * The kthread is already frozen by the PM core before this
> +	 * function is called.
> +	 */
> +
> +	remove_common(vb);
> +	return 0;
> +}
> +
> +static int virtballoon_restore(struct virtio_device *vdev)
> +{
> +	return init_vqs(vdev->priv);
> +}
> +#endif
> +
> +static unsigned int features[] = {
> +	/* FIXME: Support VIRTIO_BALLOON_F_EXTRA_MEM! */
> +};
> +
> +static struct virtio_driver virtio_balloon_driver = {
> +	.feature_table = features,
> +	.feature_table_size = ARRAY_SIZE(features),
> +	.driver.name =	KBUILD_MODNAME,
> +	.driver.owner =	THIS_MODULE,
> +	.id_table =	id_table,
> +	.probe =	virtballoon_probe,
> +	.remove =	virtballoon_remove,
> +#ifdef CONFIG_PM_SLEEP
> +	.freeze	=	virtballoon_freeze,
> +	.restore =	virtballoon_restore,
> +#endif
> +};
> +
> +module_virtio_driver(virtio_balloon_driver);
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +MODULE_DESCRIPTION("Virtio balloon driver");
> +MODULE_LICENSE("GPL");
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index 5e26f61b5df5..cdca2934668a 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -28,32 +28,45 @@
>  #include <linux/virtio_ids.h>
>  #include <linux/virtio_config.h>
>  
> -/* The feature bitmap for virtio balloon */
> -#define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
> -#define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
> -
> -/* Size of a PFN in the balloon interface. */
> -#define VIRTIO_BALLOON_PFN_SHIFT 12
> -
> -struct virtio_balloon_config
> -{
> -	/* Number of pages host wants Guest to give up. */
> -	__le32 num_pages;
> -	/* Number of pages we've actually got in balloon. */
> -	__le32 actual;
> +/* This means the balloon can go negative (ie. add memory to system) */
> +#define VIRTIO_BALLOON_F_EXTRA_MEM	0
> +
> +struct virtio_balloon_config_space {
> +	/* Set by device: bits indicate what page sizes supported. */
> +	__le64 pagesizes;
> +	/* Set by driver: only a single bit is set! */
> +	__le64 page_size;
> +
> +	/* These set by device if VIRTIO_BALLOON_F_EXTRA_MEM. */
> +	__le64 extra_mem_start;
> +	__le64 extra_mem_end;
> +};
> +
> +struct virtio_balloon_statistic {
> +	__le64 tag; /* VIRTIO_BALLOON_S_* */
> +	__le64 val;
>  };
>  
> -#define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> -#define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
> -#define VIRTIO_BALLOON_S_MAJFLT   2   /* Number of major faults */
> -#define VIRTIO_BALLOON_S_MINFLT   3   /* Number of minor faults */
> -#define VIRTIO_BALLOON_S_MEMFREE  4   /* Total amount of free memory */
> -#define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
> -#define VIRTIO_BALLOON_S_NR       6
> -
> -struct virtio_balloon_stat {
> -	__u16 tag;
> -	__u64 val;
> -} __attribute__((packed));
> +/* Guest->host command queue. */
> +/* Ask the host for more pages.
> +   Followed by array of 1 or more readable le64 pageaddr's. */
> +#define VIRTIO_BALLOON_GCMD_GET_PAGES	((__le64)0)
> +/* Give the host more pages.
> +   Followed by array of 1 or more readable le64 pageaddr's */
> +#define VIRTIO_BALLOON_GCMD_GIVE_PAGES	((__le64)1)
> +/* Dear host: I need more memory. */
> +#define VIRTIO_BALLOON_GCMD_NEEDMEM	((__le64)2)
> +/* Dear host: here are your stats.
> + * Followed by 0 or more struct virtio_balloon_statistic structs. */
> +#define VIRTIO_BALLOON_GCMD_STATS_REPLY	((__le64)3)
> +
> +/* Host->guest command queue. */
> +/* Followed by s64 of new balloon target size (only negative if
> + * VIRTIO_BALLOON_F_EXTRA_MEM). */
> +#define VIRTIO_BALLOON_HCMD_SET_BALLOON	((__le64)0x8000)
> +/* Ask for statistics */
> +#define VIRTIO_BALLOON_HCMD_GET_STATS	((__le64)0x8001)
> +
> +#include <linux/virtio_balloon_legacy.h>
>  
>  #endif /* _LINUX_VIRTIO_BALLOON_H */
> diff --git a/include/uapi/linux/virtio_balloon_legacy.h b/include/uapi/linux/virtio_balloon_legacy.h
> new file mode 100644
> index 000000000000..cbf77bc1aee3
> --- /dev/null
> +++ b/include/uapi/linux/virtio_balloon_legacy.h
> @@ -0,0 +1,59 @@
> +#ifndef _LINUX_VIRTIO_BALLOON_LEGACY_H
> +#define _LINUX_VIRTIO_BALLOON_LEGACY_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE. */
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +
> +/* The feature bitmap for virtio balloon */
> +#define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
> +#define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
> +
> +/* Size of a PFN in the balloon interface. */
> +#define VIRTIO_BALLOON_PFN_SHIFT 12
> +
> +struct virtio_balloon_config
> +{
> +	/* Number of pages host wants Guest to give up. */
> +	__le32 num_pages;
> +	/* Number of pages we've actually got in balloon. */
> +	__le32 actual;
> +};
> +
> +#define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> +#define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
> +#define VIRTIO_BALLOON_S_MAJFLT   2   /* Number of major faults */
> +#define VIRTIO_BALLOON_S_MINFLT   3   /* Number of minor faults */
> +#define VIRTIO_BALLOON_S_MEMFREE  4   /* Total amount of free memory */
> +#define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
> +#define VIRTIO_BALLOON_S_NR       6
> +
> +struct virtio_balloon_stat {
> +	__u16 tag;
> +	__u64 val;
> +} __attribute__((packed));
> +
> +#endif /* _LINUX_VIRTIO_BALLOON_LEGACY_H */
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 284fc3a05f7b..8b5ac0047190 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -33,11 +33,12 @@
>  #define VIRTIO_ID_BLOCK		2 /* virtio block */
>  #define VIRTIO_ID_CONSOLE	3 /* virtio console */
>  #define VIRTIO_ID_RNG		4 /* virtio rng */
> -#define VIRTIO_ID_BALLOON	5 /* virtio balloon */
> +#define VIRTIO_ID_BALLOON	5 /* virtio balloon (legacy) */
>  #define VIRTIO_ID_RPMSG		7 /* virtio remote processor messaging */
>  #define VIRTIO_ID_SCSI		8 /* virtio scsi */
>  #define VIRTIO_ID_9P		9 /* 9p virtio console */
>  #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
>  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> +#define VIRTIO_ID_MEMBALLOON   13 /* virtio balloon */
>  
>  #endif /* _LINUX_VIRTIO_IDS_H */
> 
> 

