
virtio-dev message


Subject: Re: [virtio] New virtio balloon...


On Thu, 30 Jan 2014 12:16:29 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> Also copy virtio-dev since this is clearly implementation ...
> 
> On Thu, Jan 30, 2014 at 07:34:30PM +1030, Rusty Russell wrote:
> > Hi,
> > 
> >         I tried to write a new balloon driver; it's completely untested
> > (as I need to write the device).  The protocol is basically two vqs, one
> > for the guest to send commands, one for the host to send commands.
> > 
> > Some interesting things come out:
> > 1) We do need to explicitly tell the host where the page is we want.
> >    This is required for compaction, for example.
> > 
> > 2) We need to be able to exceed the balloon target, especially for page
> >    migration.  Thus there's no mechanism for the device to refuse to
> >    give us the pages.
> > 
> > 3) The device can offer multiple page sizes, but the driver can only
> >    accept one.  I'm not sure if this is useful, as guests are either
> >    huge page backed or not, and returning sub-pages isn't useful.
> > 
> > Linux demo code follows.
> > 
> > Cheers,
> > Rusty.
> 
> More comments:
> 	- for projects like auto-ballooning that Luiz works on,
> 	  it's not nice that to swap page 1 for page 2
> 	  you have to inflate then deflate
> 	  besides overhead this confuses the host:
> 	  imagine you tell QEMU to increase target,
> 	  meanwhile guest inflates temporarily,
> 	  QEMU thinks okay done, now you suddenly deflate.

Yes. Just to give more context: one of my auto-ballooning versions broke
when virtballoon_migratepage() ran. The reason was that my host-side code
in the balloon device did not expect guest initiated operations. And the
current spec does imply that all operations are initiated by the host.

So, first suggestion: if the current spec is still valid, we have to add
a note there that balloon operations can be initiated by the guest.

My current code is different, but one thing it does that could also break
due to guest initiated inflate/deflate is that it keeps track of the
current balloon size. This is done by a counter which is incremented
on inflate and decremented on deflate. I did that because the device just
doesn't have this information ('actual' is unreliable, besides it's
only updated every 256 pages inflated/deflated).
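
A minimal sketch of the counter idea, in plain C rather than my actual
device code. The struct and function names are made up for illustration.
The only point is that the host updates the counter on every inflate and
deflate it sees, no matter which side initiated the operation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side bookkeeping; names are illustrative only. */
struct balloon_counter {
	uint64_t pages;	/* pages currently held by the balloon */
};

/* The guest gave the host n pages (inflate). */
static void balloon_counter_inflate(struct balloon_counter *c, uint64_t n)
{
	c->pages += n;
}

/*
 * The guest took n pages back (deflate).  Returns false and leaves the
 * counter untouched if the guest claims more pages than it ever gave.
 */
static bool balloon_counter_deflate(struct balloon_counter *c, uint64_t n)
{
	if (n > c->pages)
		return false;
	c->pages -= n;
	return true;
}
```

With this, a guest-initiated page migration (one deflate plus one
inflate) leaves the counter where it started, whichever of the two
operations arrives first after the balloon has been filled.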

Second suggestion: I think we need a reliable way to know the current
balloon size on the host. My counter does work, btw.

As far as the guest is concerned, my current code just informs the host
that the guest is facing pressure. This is done through a "message" virtqueue,
but I think this could just use the guest command virtqueue.
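
To make that concrete, here is a rough sketch of what a pressure
notification could look like as one more guest command. The command
number and the "level" field are assumptions of mine; neither is in
Rusty's patch. It only follows the patch's convention of an le64 type
followed by le64 payload:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command number; the patch only defines GCMDs 0..3. */
#define GCMD_PRESSURE_EXAMPLE 4

/* Store v at p in little-endian byte order, whatever the CPU endianness. */
static void put_le64(uint8_t *p, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Build a pressure command: le64 type followed by le64 level.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t build_pressure_gcmd(uint8_t *buf, size_t buflen, uint64_t level)
{
	if (buflen < 16)
		return 0;
	put_le64(buf, GCMD_PRESSURE_EXAMPLE);
	put_le64(buf + 8, level);
	return 16;
}
```

The resulting 16 bytes would go out as a single readable buffer on the
guest command virtqueue, the same way the other guest commands do.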

> A couple of other suggestions:
> 
> - how to accommodate memory pressure in guest?
>   Let's add a field telling host how hard do we
>   want our memory back

I agree we have to accommodate pressure in the guest some way, but what
you proposed is more or less related to auto-ballooning.

My suggestion would be for the host to tell the guest what to do in
case of pressure. Like, it could tell the guest to just keep trying like
it does today or it could ask the guest to stop inflation on pressure
(which would require an ack from the host, which complicates the
protocol a bit).

Also, there are two ways to know the guest is under pressure: 1. when
alloc_page() fails or 2. use in-kernel vmpressure notification like
auto-balloon does.
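
A small sketch of how the guest side of such a policy could behave. All
names here are hypothetical and this is only one reading of the idea:
the host picks a policy, and under the "stop" policy the guest pauses
inflation on pressure until the host acks:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy values the host could send to the guest. */
enum pressure_policy {
	PRESSURE_KEEP_TRYING,	/* today's behaviour: keep retrying */
	PRESSURE_STOP_INFLATE,	/* stop inflating and wait for a host ack */
};

struct inflate_state {
	enum pressure_policy policy;
	bool waiting_for_ack;	/* set once we stopped because of pressure */
};

/* Returns true if the guest may try to inflate further. */
static bool may_inflate(const struct inflate_state *s)
{
	return !s->waiting_for_ack;
}

/* Called when a page allocation fails or a vmpressure event fires. */
static void on_pressure(struct inflate_state *s)
{
	if (s->policy == PRESSURE_STOP_INFLATE)
		s->waiting_for_ack = true;
}

/* Called when the host acknowledges that inflation was stopped. */
static void on_host_ack(struct inflate_state *s)
{
	s->waiting_for_ack = false;
}
```

Either pressure source (allocation failure or vmpressure) would feed the
same on_pressure() hook, so the choice between them stays a guest
implementation detail.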

> - assume you want to over-commit host and start
>   inflating balloon.
>   If low on memory it might be better for guest to
>   wait a bit before inflating.
>   Also, if host asks for a lot of memory a ton of
>   allocations will slow guest significantly.
>   But for guest to do the right thing we need host to tell guest what
>   are its memory and time constraints.
>   Let's add a field telling guest how hard do we
>   want it to give us memory (e.g. time limit)

I think this is also related to auto-ballooning. Maybe we should start
with a simple device/driver and add all these features on top.

>   
> 
> 
> > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > index 9076635697bb..1dd45691b618 100644
> > --- a/drivers/virtio/Makefile
> > +++ b/drivers/virtio/Makefile
> > @@ -1,4 +1,4 @@
> >  obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> >  obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
> >  obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> > -obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> > +obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o virtio_balloon2.o
> > diff --git a/drivers/virtio/virtio_balloon2.c b/drivers/virtio/virtio_balloon2.c
> > new file mode 100644
> > index 000000000000..93f13e7c561d
> > --- /dev/null
> > +++ b/drivers/virtio/virtio_balloon2.c
> > @@ -0,0 +1,566 @@
> > +/*
> > + * Virtio balloon implementation, inspired by Dor Laor and Marcelo
> > + * Tosatti's implementations.
> > + *
> > + *  Copyright 2008, 2014 Rusty Russell IBM Corporation
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation; either version 2 of the License, or
> > + *  (at your option) any later version.
> > + *
> > + *  This program is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + *  You should have received a copy of the GNU General Public License
> > + *  along with this program; if not, write to the Free Software
> > + *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> > + */
> > +
> > +#include <linux/virtio.h>
> > +#include <linux/virtio_balloon.h>
> > +#include <linux/swap.h>
> > +#include <linux/kthread.h>
> > +#include <linux/freezer.h>
> > +#include <linux/delay.h>
> > +#include <linux/slab.h>
> > +#include <linux/module.h>
> > +#include <linux/balloon_compaction.h>
> > +
> > +struct gcmd_get_pages {
> > +	__le64 type; /* VIRTIO_BALLOON_GCMD_GET_PAGES */
> > +	__le64 pages[256];
> > +};
> > +
> > +struct gcmd_give_pages {
> > +	__le64 type; /* VIRTIO_BALLOON_GCMD_GIVE_PAGES */
> > +	__le64 pages[256];
> > +};
> > +
> > +struct gcmd_need_mem {
> > +	__le64 type; /* VIRTIO_BALLOON_GCMD_NEEDMEM */
> > +};
> > +
> > +struct gcmd_stats_reply {
> > +	__le64 type; /* VIRTIO_BALLOON_GCMD_STATS_REPLY */
> > +	struct virtio_balloon_statistic stats[VIRTIO_BALLOON_S_NR];
> > +};
> > +
> > +struct hcmd_set_balloon {
> > +	__le64 type; /* VIRTIO_BALLOON_HCMD_SET_BALLOON */
> > +	__le64 target;
> > +};
> > +
> > +struct hcmd_get_stats {
> > +	__le64 type; /* VIRTIO_BALLOON_HCMD_GET_STATS */
> > +};
> > +
> > +struct virtio_balloon {
> > +	/* Protects contents of entire structure. */
> > +	struct mutex lock;
> > +
> > +	struct virtio_device *vdev;
> > +	struct virtqueue *gcmd_vq, *hcmd_vq;
> > +
> > +	/* The thread servicing the balloon. */
> > +	struct task_struct *thread;
> > +
> > +	/* For interrupt/suspend to wake balloon thread. */
> > +	wait_queue_head_t wait;
> > +
> > +	/* How many pages are we supposed to have in balloon? */
> > +	s64 target;
> > +
> > +	/* How many do we have in the balloon? */
> > +	u64 num_pages;
> > +
> > +	/* This reminds me of Eeyore. */
> > +	bool broken;
> > +
> > +	/*
> > +	 * The pages we've told the Host we're not using are enqueued
> > +	 * at vb_dev_info->pages list.
> > +	 */
> > +	struct balloon_dev_info *vb_dev_info;
> > +
> > +	/* To avoid kmalloc, we use single hcmd and gcmd buffers. */
> > +	union gcmd {
> > +		__le64 type;
> > +		struct gcmd_get_pages get_pages;
> > +		struct gcmd_give_pages give_pages;
> > +		struct gcmd_need_mem need_mem;
> > +		struct gcmd_stats_reply stats_reply;
> > +	} gcmd;
> > +
> > +	union hcmd {
> > +		__le64 type;
> > +		struct hcmd_set_balloon set_balloon;
> > +		struct hcmd_get_stats get_stats;
> > +	} hcmd;
> > +};
> > +
> > +static struct virtio_device_id id_table[] = {
> > +	{ VIRTIO_ID_MEMBALLOON, VIRTIO_DEV_ANY_ID },
> > +	{ 0 },
> > +};
> > +
> > +static void wake_balloon(struct virtqueue *vq)
> > +{
> > +	struct virtio_balloon *vb = vq->vdev->priv;
> > +
> > +	wake_up(&vb->wait);
> > +}
> > +
> > +/* Command is in vb->gcmd, lock is held. */
> > +static bool send_gcmd(struct virtio_balloon *vb, size_t len)
> > +{
> > +	struct scatterlist sg;
> > +
> > +	BUG_ON(len > sizeof(vb->gcmd));
> > +	sg_init_one(&sg, &vb->gcmd, len);
> > +
> > +	/*
> > +	 * We should always be able to add one buffer to an empty queue.
> > +	 * If not, it's a broken device.
> > +	 */
> > +	if (virtqueue_add_outbuf(vb->gcmd_vq, &sg, 1, vb, GFP_KERNEL) != 0
> > +	    || !virtqueue_kick(vb->gcmd_vq)) {
> > +		vb->broken = true;
> > +		return false;
> > +	}
> > +
> > +	/* When host has read buffer, this completes via wake_balloon */
> > +	wait_event(vb->wait,
> > +		   virtqueue_get_buf(vb->gcmd_vq, &len)
> > +		   || (vb->broken = virtqueue_is_broken(vb->gcmd_vq)));
> > +	return !vb->broken;
> > +}
> > +
> > +static void give_to_balloon(struct virtio_balloon *vb, u64 num)
> > +{
> > +	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
> > +	u64 i;
> > +
> > +	/* We can only do one array worth at a time. */
> > +	num = min_t(u64, num, ARRAY_SIZE(vb->gcmd.give_pages.pages));
> > +
> > +	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
> > +
> > +	for (i = 0; i < num; i++) {
> > +		struct page *page = balloon_page_enqueue(vb_dev_info);
> > +
> > +		if (!page) {
> > +			dev_info_ratelimited(&vb->vdev->dev,
> > +					     "Out of puff! Can't get page\n");
> > +			/* Sleep for at least 1/5 of a second before retry. */
> > +			msleep(200);
> > +			break;
> > +		}
> > +
> > +		vb->gcmd.give_pages.pages[i] = cpu_to_le64((u64)page_to_pfn(page) << PAGE_SHIFT);
> > +		vb->num_pages++;
> > +		adjust_managed_page_count(page, -1);
> > +	}
> > +
> > +	/* Did we get any? */
> > +	if (i)
> > +		send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[i]));
> > +}
> > +
> > +static void take_from_balloon(struct virtio_balloon *vb, u64 num)
> > +{
> > +	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
> > +	size_t i;
> > +
> > +	/* We can only do one array worth at a time. */
> > +	num = min_t(u64, num, ARRAY_SIZE(vb->gcmd.get_pages.pages));
> > +
> > +	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
> > +
> > +	for (i = 0; i < num; i++) {
> > +		struct page *page = balloon_page_dequeue(vb_dev_info);
> > +
> > +		/* In case we ran out of pages (compaction) */
> > +		if (!page)
> > +			break;
> > +
> > +		vb->gcmd.get_pages.pages[i] = cpu_to_le64((u64)page_to_pfn(page) << PAGE_SHIFT);
> > +		vb->num_pages--;
> > +	}
> > +	num = i;
> > +	if (num)
> > +		send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[num]));
> > +
> > +	/* Now release those pages. */
> > +	for (i = 0; i < num; i++) {
> > +		struct page *page;
> > +
> > +		page = pfn_to_page(le64_to_cpu(vb->gcmd.get_pages.pages[i]) >> PAGE_SHIFT);
> > +		balloon_page_free(page);
> > +		adjust_managed_page_count(page, 1);
> > +	}
> > +	/* Do not unlock here: move_towards_target() drops vb->lock. */
> > +}
> > +
> > +static inline void set_stat(struct gcmd_stats_reply *stats, int idx,
> > +			    u64 tag, u64 val)
> > +{
> > +	BUG_ON(idx >= ARRAY_SIZE(stats->stats));
> > +	stats->stats[idx].tag = cpu_to_le64(tag);
> > +	stats->stats[idx].val = cpu_to_le64(val);
> > +}
> > +
> > +#define pages_to_bytes(x) ((u64)(x) << PAGE_SHIFT)
> > +
> > +static void get_stats(struct gcmd_stats_reply *stats)
> > +{
> > +	unsigned long events[NR_VM_EVENT_ITEMS];
> > +	struct sysinfo i;
> > +	int idx = 0;
> > +
> > +	all_vm_events(events);
> > +	si_meminfo(&i);
> > +
> > +	stats->type = cpu_to_le64(VIRTIO_BALLOON_GCMD_STATS_REPLY);
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_SWAP_IN,
> > +		 pages_to_bytes(events[PSWPIN]));
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
> > +		 pages_to_bytes(events[PSWPOUT]));
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MAJFLT,
> > +		 events[PGMAJFAULT]);
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MINFLT,
> > +		 events[PGFAULT]);
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MEMFREE,
> > +		 pages_to_bytes(i.freeram));
> > +	set_stat(stats, idx++, VIRTIO_BALLOON_S_MEMTOT,
> > +		 pages_to_bytes(i.totalram));
> > +}
> > +
> > +static bool move_towards_target(struct virtio_balloon *vb)
> > +{
> > +	bool moved = false;
> > +
> > +	if (vb->broken)
> > +		return false;
> > +
> > +	mutex_lock(&vb->lock);
> > +	if (vb->num_pages < vb->target) {
> > +		give_to_balloon(vb, vb->target - vb->num_pages);
> > +		moved = true;
> > +	} else if (vb->num_pages > vb->target) {
> > +		take_from_balloon(vb, vb->num_pages - vb->target);
> > +		moved = true;
> > +	}
> > +	mutex_unlock(&vb->lock);
> > +	return moved;
> > +}
> > +
> > +static bool process_hcmd(struct virtio_balloon *vb)
> > +{
> > +	union hcmd *hcmd = &vb->hcmd;
> > +	unsigned int cmdlen;
> > +	struct scatterlist sg;
> > +
> > +	if (vb->broken)
> > +		return false;
> > +
> > +	mutex_lock(&vb->lock);
> > +	/* get_buf returns our token (vb); the command is in vb->hcmd. */
> > +	if (!virtqueue_get_buf(vb->hcmd_vq, &cmdlen)) {
> > +		mutex_unlock(&vb->lock);
> > +		return false;
> > +	}
> > +
> > +	switch (hcmd->type) {
> > +	case cpu_to_le64(VIRTIO_BALLOON_HCMD_SET_BALLOON):
> > +		vb->target = le64_to_cpu(hcmd->set_balloon.target);
> > +		break;
> > +	case cpu_to_le64(VIRTIO_BALLOON_HCMD_GET_STATS):
> > +		get_stats(&vb->gcmd.stats_reply);
> > +		send_gcmd(vb, sizeof(vb->gcmd.stats_reply));
> > +		break;
> > +	default:
> > +		dev_err_ratelimited(&vb->vdev->dev, "Unknown hcmd %llu\n",
> > +				    le64_to_cpu(hcmd->type));
> > +		break;
> > +	}
> > +
> > +	/* Re-queue the hcmd for next time. */
> > +	sg_init_one(&sg, &vb->hcmd, sizeof(vb->hcmd));
> > +	virtqueue_add_inbuf(vb->hcmd_vq, &sg, 1, vb, GFP_KERNEL);
> > +
> > +	mutex_unlock(&vb->lock);
> > +	return true;
> > +}
> > +
> > +static int balloon(void *_vballoon)
> > +{
> > +	struct virtio_balloon *vb = _vballoon;
> > +
> > +	set_freezable();
> > +	while (!kthread_should_stop()) {
> > +		try_to_freeze();
> > +
> > +		wait_event_interruptible(vb->wait,
> > +					 kthread_should_stop()
> > +					 || freezing(current)
> > +					 || process_hcmd(vb)
> > +					 || move_towards_target(vb));
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int init_vqs(struct virtio_balloon *vb)
> > +{
> > +	struct virtqueue *vqs[2];
> > +	vq_callback_t *callbacks[] = { wake_balloon, wake_balloon };
> > +	const char *names[] = { "gcmd", "hcmd" };
> > +	struct scatterlist sg;
> > +	int err;
> > +
> > +	err = vb->vdev->config->find_vqs(vb->vdev, 2, vqs, callbacks, names);
> > +	if (err)
> > +		return err;
> > +
> > +	vb->gcmd_vq = vqs[0];
> > +	vb->hcmd_vq = vqs[1];
> > +
> > +	/*
> > +	 * Prime this virtqueue with one buffer so the hypervisor can
> > +	 * use it to signal us later (it can't be broken yet!).
> > +	 */
> > +	sg_init_one(&sg, &vb->hcmd, sizeof(vb->hcmd));
> > +	if (virtqueue_add_inbuf(vb->hcmd_vq, &sg, 1, vb, GFP_KERNEL) < 0)
> > +		BUG();
> > +	virtqueue_kick(vb->hcmd_vq);
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct address_space_operations virtio_balloon_aops;
> > +#ifdef CONFIG_BALLOON_COMPACTION
> > +/*
> > + * virtballoon_migratepage - perform the balloon page migration on behalf of
> > + *			     a compaction thread.     (called under page lock)
> > + * @mapping: the page->mapping which will be assigned to the new migrated page.
> > + * @newpage: page that will replace the isolated page after migration finishes.
> > + * @page   : the isolated (old) page that is about to be migrated to newpage.
> > + * @mode   : compaction mode -- not used for balloon page migration.
> > + *
> > + * After a ballooned page gets isolated by compaction procedures, this is the
> > + * function that performs the page migration on behalf of a compaction thread
> > + * The page migration for virtio balloon is done in a simple swap fashion which
> > + * follows these two macro steps:
> > + *  1) insert newpage into vb->pages list and update the host about it;
> > + *  2) update the host about the old page removed from vb->pages list;
> > + *
> > + * This function performs the balloon page migration task.
> > + * Called through balloon_mapping->a_ops->migratepage
> > + */
> > +static int virtballoon_migratepage(struct address_space *mapping,
> > +		struct page *newpage, struct page *page, enum migrate_mode mode)
> > +{
> > +	struct balloon_dev_info *vb_dev_info = balloon_page_device(page);
> > +	struct virtio_balloon *vb;
> > +	unsigned long flags;
> > +	int err;
> > +
> > +	BUG_ON(!vb_dev_info);
> > +
> > +	vb = vb_dev_info->balloon_device;
> > +
> > +	/*
> > +	 * In order to avoid lock contention while migrating pages concurrently
> > +	 * to leak_balloon() or fill_balloon() we just give up the balloon_lock
> > +	 * this turn, as it is easier to retry the page migration later.
> > +	 * This also prevents fill_balloon() getting stuck into a mutex
> > +	 * recursion in the case it ends up triggering memory compaction
> > +	 * while it is attempting to inflate the balloon.
> > +	 */
> > +	if (!mutex_trylock(&vb->lock))
> > +		return -EAGAIN;
> > +
> > +	/* Try to get the page out of the balloon. */
> > +	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
> > +	vb->gcmd.get_pages.pages[0] = cpu_to_le64((u64)page_to_pfn(page) << PAGE_SHIFT);
> > +	if (!send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[1]))) {
> > +		err = -EIO;
> > +		goto unlock;
> > +	}
> > +
> > +	/* Now put newpage into balloon. */
> > +	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
> > +	vb->gcmd.give_pages.pages[0] = cpu_to_le64((u64)page_to_pfn(newpage) << PAGE_SHIFT);
> > +	if (!send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[1]))) {
> > +		/* We leak a page here, but only happens if balloon broken. */
> > +		err = -EIO;
> > +		goto unlock;
> > +	}
> > +
> > +	spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
> > +	balloon_page_insert(newpage, mapping, &vb_dev_info->pages);
> > +	vb_dev_info->isolated_pages--;
> > +	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
> > +
> > +	/*
> > +	 * It's safe to delete page->lru here because this page is at
> > +	 * an isolated migration list, and this step is expected to happen here
> > +	 */
> > +	balloon_page_delete(page);
> > +	err = MIGRATEPAGE_BALLOON_SUCCESS;
> > +
> > +unlock:
> > +	mutex_unlock(&vb->lock);
> > +	return err;
> > +}
> > +
> > +/* define the balloon_mapping->a_ops callback to allow balloon page migration */
> > +static const struct address_space_operations virtio_balloon_aops = {
> > +			.migratepage = virtballoon_migratepage,
> > +};
> > +#endif /* CONFIG_BALLOON_COMPACTION */
> > +
> > +static int virtballoon_probe(struct virtio_device *vdev)
> > +{
> > +	struct virtio_balloon *vb;
> > +	struct address_space *vb_mapping;
> > +	struct balloon_dev_info *vb_devinfo;
> > +	__le64 v;
> > +	int err;
> > +
> > +	virtio_cread(vdev, struct virtio_balloon_config_space, pagesizes, &v);
> > +	/* FIXME: Support large pages. */
> > +	if (!(le64_to_cpu(v) & PAGE_SIZE)) {
> > +		dev_warn(&vdev->dev, "Unacceptable pagesize %llu\n",
> > +			 (unsigned long long)le64_to_cpu(v));
> > +		err = -EINVAL;
> > +		goto out;
> > +	}
> > +	v = cpu_to_le64(PAGE_SIZE);
> > +	virtio_cwrite(vdev, struct virtio_balloon_config_space, page_size, &v);
> > +
> > +	vdev->priv = vb = kzalloc(sizeof(*vb), GFP_KERNEL);
> > +	if (!vb) {
> > +		err = -ENOMEM;
> > +		goto out;
> > +	}
> > +
> > +	vb->target = 0;
> > +	vb->num_pages = 0;
> > +	mutex_init(&vb->lock);
> > +	init_waitqueue_head(&vb->wait);
> > +	vb->vdev = vdev;
> > +
> > +	vb_devinfo = balloon_devinfo_alloc(vb);
> > +	if (IS_ERR(vb_devinfo)) {
> > +		err = PTR_ERR(vb_devinfo);
> > +		goto out_free_vb;
> > +	}
> > +
> > +	vb_mapping = balloon_mapping_alloc(vb_devinfo,
> > +					   (balloon_compaction_check()) ?
> > +					   &virtio_balloon_aops : NULL);
> > +	if (IS_ERR(vb_mapping)) {
> > +		/*
> > +		 * IS_ERR(vb_mapping) && PTR_ERR(vb_mapping) == -EOPNOTSUPP
> > +		 * This means !CONFIG_BALLOON_COMPACTION, otherwise we get off.
> > +		 */
> > +		err = PTR_ERR(vb_mapping);
> > +		if (err != -EOPNOTSUPP)
> > +			goto out_free_vb_devinfo;
> > +	}
> > +
> > +	vb->vb_dev_info = vb_devinfo;
> > +
> > +	err = init_vqs(vb);
> > +	if (err)
> > +		goto out_free_vb_mapping;
> > +
> > +	vb->thread = kthread_run(balloon, vb, "vballoon");
> > +	if (IS_ERR(vb->thread)) {
> > +		err = PTR_ERR(vb->thread);
> > +		goto out_del_vqs;
> > +	}
> > +
> > +	return 0;
> > +
> > +out_del_vqs:
> > +	vdev->config->del_vqs(vdev);
> > +out_free_vb_mapping:
> > +	balloon_mapping_free(vb_mapping);
> > +out_free_vb_devinfo:
> > +	balloon_devinfo_free(vb_devinfo);
> > +out_free_vb:
> > +	kfree(vb);
> > +out:
> > +	return err;
> > +}
> > +
> > +/* FIXME: Leave pages alone during suspend, rather than taking them
> > + * all back! */
> > +static void remove_common(struct virtio_balloon *vb)
> > +{
> > +	/* There might be pages left in the balloon: free them. */
> > +	while (vb->num_pages)
> > +		take_from_balloon(vb, vb->num_pages);
> > +
> > +	/* Now we reset the device so we can clean up the queues. */
> > +	vb->vdev->config->reset(vb->vdev);
> > +	vb->vdev->config->del_vqs(vb->vdev);
> > +}
> > +
> > +static void virtballoon_remove(struct virtio_device *vdev)
> > +{
> > +	struct virtio_balloon *vb = vdev->priv;
> > +
> > +	kthread_stop(vb->thread);
> > +	remove_common(vb);
> > +	balloon_mapping_free(vb->vb_dev_info->mapping);
> > +	balloon_devinfo_free(vb->vb_dev_info);
> > +	kfree(vb);
> > +}
> > +
> > +#ifdef CONFIG_PM_SLEEP
> > +static int virtballoon_freeze(struct virtio_device *vdev)
> > +{
> > +	struct virtio_balloon *vb = vdev->priv;
> > +
> > +	/*
> > +	 * The kthread is already frozen by the PM core before this
> > +	 * function is called.
> > +	 */
> > +
> > +	remove_common(vb);
> > +	return 0;
> > +}
> > +
> > +static int virtballoon_restore(struct virtio_device *vdev)
> > +{
> > +	return init_vqs(vdev->priv);
> > +}
> > +#endif
> > +
> > +static unsigned int features[] = {
> > +	/* FIXME: Support VIRTIO_BALLOON_F_EXTRA_MEM! */
> > +};
> > +
> > +static struct virtio_driver virtio_balloon_driver = {
> > +	.feature_table = features,
> > +	.feature_table_size = ARRAY_SIZE(features),
> > +	.driver.name =	KBUILD_MODNAME,
> > +	.driver.owner =	THIS_MODULE,
> > +	.id_table =	id_table,
> > +	.probe =	virtballoon_probe,
> > +	.remove =	virtballoon_remove,
> > +#ifdef CONFIG_PM_SLEEP
> > +	.freeze	=	virtballoon_freeze,
> > +	.restore =	virtballoon_restore,
> > +#endif
> > +};
> > +
> > +module_virtio_driver(virtio_balloon_driver);
> > +MODULE_DEVICE_TABLE(virtio, id_table);
> > +MODULE_DESCRIPTION("Virtio balloon driver");
> > +MODULE_LICENSE("GPL");
> > diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> > index 5e26f61b5df5..cdca2934668a 100644
> > --- a/include/uapi/linux/virtio_balloon.h
> > +++ b/include/uapi/linux/virtio_balloon.h
> > @@ -28,32 +28,45 @@
> >  #include <linux/virtio_ids.h>
> >  #include <linux/virtio_config.h>
> >  
> > -/* The feature bitmap for virtio balloon */
> > -#define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
> > -#define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
> > -
> > -/* Size of a PFN in the balloon interface. */
> > -#define VIRTIO_BALLOON_PFN_SHIFT 12
> > -
> > -struct virtio_balloon_config
> > -{
> > -	/* Number of pages host wants Guest to give up. */
> > -	__le32 num_pages;
> > -	/* Number of pages we've actually got in balloon. */
> > -	__le32 actual;
> > +/* This means the balloon can go negative (ie. add memory to system) */
> > +#define VIRTIO_BALLOON_F_EXTRA_MEM	0
> > +
> > +struct virtio_balloon_config_space {
> > +	/* Set by device: bits indicate what page sizes supported. */
> > +	__le64 pagesizes;
> > +	/* Set by driver: only a single bit is set! */
> > +	__le64 page_size;
> > +
> > +	/* These set by device if VIRTIO_BALLOON_F_EXTRA_MEM. */
> > +	__le64 extra_mem_start;
> > +	__le64 extra_mem_end;
> > +};
> > +
> > +struct virtio_balloon_statistic {
> > +	__le64 tag; /* VIRTIO_BALLOON_S_* */
> > +	__le64 val;
> >  };
> >  
> > -#define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> > -#define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
> > -#define VIRTIO_BALLOON_S_MAJFLT   2   /* Number of major faults */
> > -#define VIRTIO_BALLOON_S_MINFLT   3   /* Number of minor faults */
> > -#define VIRTIO_BALLOON_S_MEMFREE  4   /* Total amount of free memory */
> > -#define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
> > -#define VIRTIO_BALLOON_S_NR       6
> > -
> > -struct virtio_balloon_stat {
> > -	__u16 tag;
> > -	__u64 val;
> > -} __attribute__((packed));
> > +/* Guest->host command queue. */
> > +/* Ask the host for more pages.
> > +   Followed by array of 1 or more readable le64 pageaddr's. */
> > +#define VIRTIO_BALLOON_GCMD_GET_PAGES	((__le64)0)
> > +/* Give the host more pages.
> > +   Followed by array of 1 or more readable le64 pageaddr's */
> > +#define VIRTIO_BALLOON_GCMD_GIVE_PAGES	((__le64)1)
> > +/* Dear host: I need more memory. */
> > +#define VIRTIO_BALLOON_GCMD_NEEDMEM	((__le64)2)
> > +/* Dear host: here are your stats.
> > + * Followed by 0 or more struct virtio_balloon_statistic structs. */
> > +#define VIRTIO_BALLOON_GCMD_STATS_REPLY	((__le64)3)
> > +
> > +/* Host->guest command queue. */
> > +/* Followed by s64 of new balloon target size (only negative if
> > + * VIRTIO_BALLOON_F_EXTRA_MEM). */
> > +#define VIRTIO_BALLOON_HCMD_SET_BALLOON	((__le64)0x8000)
> > +/* Ask for statistics */
> > +#define VIRTIO_BALLOON_HCMD_GET_STATS	((__le64)0x8001)
> > +
> > +#include <linux/virtio_balloon_legacy.h>
> >  
> >  #endif /* _LINUX_VIRTIO_BALLOON_H */
> > diff --git a/include/uapi/linux/virtio_balloon_legacy.h b/include/uapi/linux/virtio_balloon_legacy.h
> > new file mode 100644
> > index 000000000000..cbf77bc1aee3
> > --- /dev/null
> > +++ b/include/uapi/linux/virtio_balloon_legacy.h
> > @@ -0,0 +1,59 @@
> > +#ifndef _LINUX_VIRTIO_BALLOON_LEGACY_H
> > +#define _LINUX_VIRTIO_BALLOON_LEGACY_H
> > +/* This header is BSD licensed so anyone can use the definitions to implement
> > + * compatible drivers/servers.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + * 3. Neither the name of IBM nor the names of its contributors
> > + *    may be used to endorse or promote products derived from this software
> > + *    without specific prior written permission.
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE. */
> > +#include <linux/virtio_ids.h>
> > +#include <linux/virtio_config.h>
> > +
> > +/* The feature bitmap for virtio balloon */
> > +#define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
> > +#define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
> > +
> > +/* Size of a PFN in the balloon interface. */
> > +#define VIRTIO_BALLOON_PFN_SHIFT 12
> > +
> > +struct virtio_balloon_config
> > +{
> > +	/* Number of pages host wants Guest to give up. */
> > +	__le32 num_pages;
> > +	/* Number of pages we've actually got in balloon. */
> > +	__le32 actual;
> > +};
> > +
> > +#define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> > +#define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
> > +#define VIRTIO_BALLOON_S_MAJFLT   2   /* Number of major faults */
> > +#define VIRTIO_BALLOON_S_MINFLT   3   /* Number of minor faults */
> > +#define VIRTIO_BALLOON_S_MEMFREE  4   /* Total amount of free memory */
> > +#define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
> > +#define VIRTIO_BALLOON_S_NR       6
> > +
> > +struct virtio_balloon_stat {
> > +	__u16 tag;
> > +	__u64 val;
> > +} __attribute__((packed));
> > +
> > +#endif /* _LINUX_VIRTIO_BALLOON_LEGACY_H */
> > diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> > index 284fc3a05f7b..8b5ac0047190 100644
> > --- a/include/uapi/linux/virtio_ids.h
> > +++ b/include/uapi/linux/virtio_ids.h
> > @@ -33,11 +33,12 @@
> >  #define VIRTIO_ID_BLOCK		2 /* virtio block */
> >  #define VIRTIO_ID_CONSOLE	3 /* virtio console */
> >  #define VIRTIO_ID_RNG		4 /* virtio rng */
> > -#define VIRTIO_ID_BALLOON	5 /* virtio balloon */
> > +#define VIRTIO_ID_BALLOON	5 /* virtio balloon (legacy) */
> >  #define VIRTIO_ID_RPMSG		7 /* virtio remote processor messaging */
> >  #define VIRTIO_ID_SCSI		8 /* virtio scsi */
> >  #define VIRTIO_ID_9P		9 /* 9p virtio console */
> >  #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
> >  #define VIRTIO_ID_CAIF	       12 /* Virtio caif */
> > +#define VIRTIO_ID_MEMBALLOON   13 /* virtio balloon */
> >  
> >  #endif /* _LINUX_VIRTIO_IDS_H */
> > 
> > 
> 


