OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio] New virtio balloon...


"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Fri, Jan 31, 2014 at 04:01:39PM +1030, Rusty Russell wrote:
>> I originally allowed the host to deny the deflate, which was why I
>> reversed it.  Then I realized that was a bad idea.  I can switch it back.
>
> I think explicit swap that you suggested sounds better to me.

OK, I've added this.  It takes 2N PFNS, though Linux only uses 2.

>> > 	- what's the status of page returned from balloon?
>> > 	  is it zeroed or can it have old data in there?
>> > 	  I think in practice Linux will sometimes map in a zero page,
>> > 	  so guest can save cycles and avoid zeroing it out.
>> > 	  I think we should tell this to guest when returning
>> > 	  pages.
>> 
>> QEMU may not know, since the kernel may not tell it.
>
> Depends on what QEMU does.
> I think kernel always gives us zero pages when we allocate
> memory, they must be initialized otherwise it's an information leak.

MADV_DONTNEED is a bit of a mess.  Under Linux it will zero pages,
under POSIX (POSIX_MADV_DONTNEED) it will keep them intact.  Under
FreeBSD we'd really want MADV_FREE, but linux doesn't support it.

Let's not design this for today's QEMU.

>>  We should assume
>> nothing, and let the guest zero if it needs to.  Seems like a premuture
>> optimization.
>
> Possibly.
>
>> > 	- I am guessing EXTRA_MEM is for uses like the ones proposed by
>> > 	  Frank Swiderski from google that inflate/deflate balloon
>> >           whenever guest wants (look for "Add a page cache-backed balloon
>> > 	  device driver").
>> >
>> >           this is useful but - we need to distinguish pages
>> > 	  like this from regular inflate.
>> > 	  it's not just counter and host needs a way to know
>> > 	  that it's target is reached
>> 
>> The driver needs to explicitly ask for pages in that region.
>
> OK so we'll have an extra flag for that?

No, I mean that the driver explicitly requests every page by PFN.  If it
requests a page in the extramem region, it will get one.

>> > 	- do we even want to allow guest not telling host when it wants
>> > 	  to reuse the page?
>> > 	  if yes, I think this should be per-page somehow: when balloon
>> > 	  is inflated guest should tell host whether it
>> > 	  expects to use this page.
>> 
>> I decided against it.  Making that optional got us into a mess, so now
>> it's compulsory.  That also fits better with the idea of a negative
>> balloon.
>> 
>> > So I think we should accomodate these uses, and so we want the following flags:
>> >
>> > 	- WEAK_TARGET (that's the EXTRA_MEM but I think done in a better way)
>> >           flag that specifies pages do not count against target,
>> > 	  can be taken out of balloon.
>> > 	  EXTRA_MEM suggests there's an upper limit on balloon size
>> > 	  but IMHO that's just extra work for host: host does not care
>> > 	  I think, give it as much as you want.
>> > 	  set by guest, used by host
>> 
>> I think that Daniel really does want more memory than the guest starts
>> with.  And I think he still wants to use the balloon to control it.
>> Daniel?
>> 
>> > 	- TELL_HOST flag that specifies guest will tell host before using pages
>> > 	  (that's VIRTIO_BALLOON_F_MUST_TELL_HOST
>> > 	  at the moment, listed here for completeness)
>> > 	  set by guest, used by host
>> 
>> Dislike.
>> 
>> > 	- ZEROED
>> > 	  flag that specifies that page returned to guest
>> > 	  is zeroed
>> > 	  set by host, used by guest
>> 
>> I think that's silly.  Under Linux the guest doesn't need to know it's
>> zeroed or not, it just frees the page.
>
> Yes but it's possible that linux will try to zero page right
> after free. It won't be too hard to set a flag that it's
> zeroed when we free it.

We could, but I don't see that this is a bottleneck.  See above.

>> > - how to accomodate memory pressure in guest?
>> >   Let's add a field telling host how hard do we
>> >   want our memory back
>> 
>> That's very hard to define across guests.  Should we be using stats for
>> that instead?  In fact, should we allow gratuitous stats sending,
>> instead of a simple NEED_MEM flag?
>> 
>> > - assume you want to over-commit host and start
>> >   inflating balloon.
>> >   If low on memory it might be better for guest to
>> >   wait a bit before inflating.
>> >   Also, if host asks for a lot of memory a ton of
>> >   allocations will slow guest significantly.
>> >   But for guest to do the right thing we need host to tell guest what
>> >   are its memory and time contraints.
>> >   Let's add a field telling guest how hard do we
>> >   want it to give us memory (e.g. time limit)
>> 
>> We can't have intelligence at both ends, I think.  We've chosen a
>> host-led model, so we should stick to that
>
> I'm saying let's control speed of allocations from host,
> that's still host-led?

You want the guest to wait a bit, and control the rate at which it
allocates memory.  If that's what we want, let's get the host to delay
telling it to inflate, and then inflate slowly.  Otherwise we have to
debug both host and guest sides when we hit performance problems.

I changed the STATS_REPLY to STATS, and included a "want more mem"
flag.  The implication is that the host compare stats across different
guests.

Cheers,
Rusty.

diff --git a/drivers/virtio/virtio_balloon2.c b/drivers/virtio/virtio_balloon2.c
index 93f13e7c561d..6c0151f12f8b 100644
--- a/drivers/virtio/virtio_balloon2.c
+++ b/drivers/virtio/virtio_balloon2.c
@@ -39,12 +39,15 @@ struct gcmd_give_pages {
 	__le64 pages[256];
 };
 
-struct gcmd_need_mem {
-	__le64 type; /* VIRTIO_BALLOON_GCMD_NEED_MEM */
+struct gcmd_exchange_pages {
+	__le64 type; /* VIRTIO_BALLOON_GCMD_EXCHANGE_PAGES */
+	__le64 from_balloon;
+	__le64 to_balloon;
 };
 
-struct gcmd_stats_reply {
+struct gcmd_stats {
 	__le64 type; /* VIRTIO_BALLOON_GCMD_STATS_REPLY */
+	__le64 need_more;
 	struct virtio_balloon_statistic stats[VIRTIO_BALLOON_S_NR];
 };
 
@@ -90,8 +93,8 @@ struct virtio_balloon {
 		__le64 type;
 		struct gcmd_get_pages get_pages;
 		struct gcmd_give_pages give_pages;
-		struct gcmd_need_mem need_mem;
-		struct gcmd_stats_reply stats_reply;
+		struct gcmd_exchange_pages exchange_pages;
+		struct gcmd_stats stats;
 	} gcmd;
 
 	union hcmd {
@@ -382,19 +385,11 @@ static int virtballoon_migratepage(struct address_space *mapping,
 	if (!mutex_trylock(&vb->lock))
 		return -EAGAIN;
 
-	/* Try to get the page out of the balloon. */
-	vb->gcmd.get_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GET_PAGES);
-	vb->gcmd.get_pages.pages[0] = page_to_pfn(page) << PAGE_SHIFT;
-	if (!send_gcmd(vb, offsetof(struct gcmd_get_pages, pages[1]))) {
-		err = -EIO;
-		goto unlock;
-	}
-
-	/* Now put newpage into balloon. */
-	vb->gcmd.give_pages.type = cpu_to_le64(VIRTIO_BALLOON_GCMD_GIVE_PAGES);
-	vb->gcmd.give_pages.pages[0] = page_to_pfn(newpage) << PAGE_SHIFT;
-	if (!send_gcmd(vb, offsetof(struct gcmd_give_pages, pages[1]))) {
-		/* We leak a page here, but only happens if balloon broken. */
+	vb->gcmd.exchange_pages.type =
+		cpu_to_le64(VIRTIO_BALLOON_GCMD_EXCHANGE_PAGES);
+	vb->gcmd.exchange_pages.from_balloon = page_to_pfn(page) << PAGE_SHIFT;
+	vb->gcmd.exchange_pages.to_balloon = page_to_pfn(newpage) << PAGE_SHIFT;
+	if (!send_gcmd(vb, sizeof(vb->gcmd.exchange_pages))) {
 		err = -EIO;
 		goto unlock;
 	}
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index cdca2934668a..925d79ad5c90 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -48,21 +48,36 @@ struct virtio_balloon_statistic {
 };
 
 /* Guest->host command queue. */
-/* Ask the host for more pages.
-   Followed by array of 1 or more readable le64 pageaddr's. */
+
+/*
+ * Ask the host for more pages.
+ * Followed by array of 1 or more readable le64 pageaddr's.
+ */
 #define VIRTIO_BALLOON_GCMD_GET_PAGES	((__le64)0)
-/* Give the host more pages.
-   Followed by array of 1 or more readable le64 pageaddr's */
+/*
+ * Give the host more pages.
+ * Followed by array of 1 or more readable le64 pageaddr's
+ */
 #define VIRTIO_BALLOON_GCMD_GIVE_PAGES	((__le64)1)
-/* Dear host: I need more memory. */
-#define VIRTIO_BALLOON_GCMD_NEEDMEM	((__le64)2)
-/* Dear host: here are your stats.
- * Followed by 0 or more struct virtio_balloon_statistic structs. */
-#define VIRTIO_BALLOON_GCMD_STATS_REPLY	((__le64)3)
+/*
+ * Exchange pages in the ballon.
+ * Followed by array of 2N readable le64 pageaddr's.  First N: to extract from
+ * balloon, next N: to add to the balloon
+*/
+#define VIRTIO_BALLOON_GCMD_EXCHANGE_PAGES ((__le64)2)
+/*
+ * Stats, and optional request for memory.
+ * __le64: 0 if we don't want target increased, 1 if we do.
+ * Followed by 0 or more struct virtio_balloon_statistic structs.
+ */
+#define VIRTIO_BALLOON_GCMD_STATS	((__le64)3)
 
 /* Host->guest command queue. */
-/* Followed by s64 of new balloon target size (only negative if
- * VIRTIO_BALLOON_F_EXTRA_MEM). */
+
+/*
+ * Followed by s64 of new balloon target size (only negative if
+ * VIRTIO_BALLOON_F_EXTRA_MEM).
+ */
 #define VIRTIO_BALLOON_HCMD_SET_BALLOON	((__le64)0x8000)
 /* Ask for statistics */
 #define VIRTIO_BALLOON_HCMD_GET_STATS	((__le64)0x8001)




[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]