virtio-comment message


Subject: Re: [virtio-comment] [PATCH 1/3] shared memory: Define shared memory regions


* David Hildenbrand (david@redhat.com) wrote:
> On 15.02.19 15:02, Dr. David Alan Gilbert wrote:
> > * David Hildenbrand (david@redhat.com) wrote:
> >> On 15.02.19 14:50, Dr. David Alan Gilbert wrote:
> >>> * Cornelia Huck (cohuck@redhat.com) wrote:
> >>>> On Fri, 15 Feb 2019 13:33:06 +0100
> >>>> David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>>> On 15.02.19 13:28, Cornelia Huck wrote:
> >>>>>> On Fri, 15 Feb 2019 12:26:00 +0100
> >>>>>> David Hildenbrand <david@redhat.com> wrote:
> >>>>>>   
> >>>>>>> Probing is always ugly. But I think we can add something like
> >>>>>>> the x86 PCI hole between 3 and 4 GB after our initial boot memory.
> >>>>>>> So there, we would have a memory region just like e.g. x86 has.
> >>>>>>
> >>>>>> A special region is probably the best way out of this pickle. We would
> >>>>>> only need the discovery ccw for virtio, then.
> >>>>>>   
> >>>>>>>
> >>>>>>> This should even work with other mechanisms I am working on. E.g.
> >>>>>>> for memory devices, we will add yet another memory region above
> >>>>>>> the special PCI region.
> >>>>>>>
> >>>>>>> The layout of the guest would then be something like
> >>>>>>>
> >>>>>>> [0x000000000000000]
> >>>>>>> ... Memory region containing RAM
> >>>>>>> [ram_size         ]
> >>>>>>> ... Memory region for e.g. special PCI devices
> >>>>>>> [ram_size +1 GB   ]
> >>>>>>> ... Memory region for memory devices (virtio-pmem, virtio-mem ...)
> >>>>>>> [maxram_size - ram_size + 1GB]
> >>>>>>>
> >>>>>>> We would have to create proper page tables for guest backing that take
> >>>>>>> care of the new guest size (not just ram_size). Also, to the guest we
> >>>>>>> would indicate "maximum ram size == ram_size" so it does not try to
> >>>>>>> probe the "special" memory.  
> >>>>>>
> >>>>>> Hm... so that would be:
> >>>>>> - 0..ram_size: just like it is handled now
> >>>>>> - ram_size..ram_size + 1GB: guest does not treat it as ram, but does
> >>>>>>   build page tables for it
> >>>>>> - ram_size + 1GB..maxram_size: for whatever memory devices do with it
> >>>>>>
> >>>>>> How does the guest probe this? (SCLP?) Or does the guest simply know
> >>>>>> via some kind of probable feature that there's a 1GB region there?  
> >>>>>
> >>>>> As the guest only "knows" ram, there is a "maximum ram size" specified
> >>>>> via SCLP. An unmodified guest will not probe beyond that.
> >>>>
> >>>> Nod.
> >>>>
> >>>>> The parts of the 1GB used by a device should be communicated via the
> >>>>> paravirtualized device I guess. PCI bars don't really fit I assume, so
> >>>>> we might need some virtio-ccw thingy (you're the expert :)) on top. That
> >>>>> is one part to be clarified.
> >>>>>
> >>>>> I guess the guest does not need to know about the whole 1GB, only per
> >>>>> device about the used part. We can then build page tables in the guest
> >>>>> for that part when plugging.
> >>>>
> >>>> Hm. With my proposal, the guest would get a list of region addresses
> >>>> from the device via a new ccw. It could then proceed to set up page
> >>>> tables for it and start to use it. As long as it is aware that the
> >>>> addresses it will get are beyond max_ram, that should be fine, I think.
> >>>
> >>> Which is the same as my virtio-mmio proposal; the host gets to put it
> >>> wherever it sees fit (outside ram) and you've just got a way of
> >>> telling the guest where it lives.
> >>>
> >>> Davidh's 1GB window is pretty much how older PCs worked I think;
> >>> the problem is that 1GB is never enough and you still need a way
> >>> to enumerate what devices are where, so it doesn't help you.
> >>> (Our current virtio-fs dax mappings we're using are a few GB).
> >>>
> >>
> >> How does that work on x86? You cannot suddenly move stuff into the
> >> memory device memory region and potentially mess with DIMMs to be
> >> plugged later. QEMU wise, this sounds wrong.
> > 
> > Because it's PCI based, it becomes the guest's problem - the guest
> > sets the PCI BARs which set the GPA of the PCI devices; I assume
> > there's some protection that happens if it gets mapped over RAM (?!)
> > 
> > I think that varies by firmware as well, with EFI mapping
> > them differently from our BIOS.
> > I think the guest knows the total number of DIMM slots and max-ram
> > limit, so knows where not-to-map.
> 
> On s390x, we have to define the size of the host->guest page table when
> starting the guest. So we need some upper limit.

That's OK; x86 also has that because they have a limited physical
and virtual address size [which may or may not be correctly passed to
the guest!].

> Mapping anywhere, I
> really don't like. Letting the guest define the mapping, I really don't
> like.

Well it's OK to have a hole for it, but letting the guest choose where
those mappings go in the hole is the norm for PCI (there are
exceptions).
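
As an illustration only of what "the guest chooses where the mappings go
in the hole" can look like: PCI BARs are power-of-two sized and naturally
aligned, so a guest can walk the hole and hand out aligned bases. This is
a minimal sketch, not code from any firmware or kernel; `place_bars`,
`align_up` and the example hole at 3..4 GiB are assumptions made up for
this example.

```python
from __future__ import annotations


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of a power-of-two alignment."""
    return (addr + alignment - 1) & ~(alignment - 1)


def place_bars(hole_start: int, hole_end: int, bar_sizes: list[int]) -> dict[int, int]:
    """Give each BAR (keyed by its index) a naturally aligned base in the hole.

    hole_end is exclusive. Raises ValueError if a size is not a power of
    two or if the BARs do not fit.
    """
    placement: dict[int, int] = {}
    cursor = hole_start
    # Placing the largest BARs first keeps alignment padding small.
    order = sorted(range(len(bar_sizes)), key=lambda i: bar_sizes[i], reverse=True)
    for index in order:
        size = bar_sizes[index]
        if size <= 0 or size & (size - 1):
            raise ValueError("BAR sizes must be powers of two")
        base = align_up(cursor, size)
        if base + size > hole_end:
            raise ValueError("hole too small for requested BARs")
        placement[index] = base
        cursor = base + size
    return placement


# Hypothetical example: three BARs (4 KiB, 16 MiB, 1 MiB) in a 3..4 GiB hole.
bars = place_bars(0xC000_0000, 0x1_0000_0000, [0x1000, 0x100_0000, 0x10_0000])
```

The hypervisor only has to publish the hole's bounds; everything inside
it is the guest's decision.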

> We can of course switch the order of mappings
> 
> [0x000000000000000      ]
> ... Memory region containing RAM
> [ram_size               ]
> ... Memory region for memory devices (virtio-pmem, virtio-mem ...)
> [maxram_size - ram_size ]
> ... Memory region for e.g. special PCI/CCW devices
> [                    TBD]
> 
> We can size TBD in a way that we e.g. max out the current page table
> size before having to switch to more levels.

Yes, setting some upper limit is fine. You've just got to make sure
that the hypervisor knows where it can put stuff and, if the guest
does PCI, that the guest knows where it's allowed to put stuff. As long
as the two don't overlap, everyone is happy.
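
The "don't overlap" rule can be sketched as follows. This is purely
illustrative and rests on assumptions that are not settled in this
thread: the memory-device region is taken to run from ram_size to
maxram_size, the device hole ("TBD" in the layout) takes whatever is left
below a fixed address limit, and the 4 TiB limit in the example is just a
placeholder value. `Region`, `guest_layout` and `check_disjoint` are
hypothetical names, not QEMU APIs.

```python
from __future__ import annotations

from dataclasses import dataclass

GiB = 1 << 30


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # first guest physical address in the region
    end: int    # one past the last address (exclusive)

    def overlaps(self, other: "Region") -> bool:
        return self.start < other.end and other.start < self.end


def guest_layout(ram_size: int, maxram_size: int, addr_limit: int) -> list[Region]:
    """Lay out RAM, the memory-device region and the device hole back to back."""
    if not 0 < ram_size <= maxram_size < addr_limit:
        raise ValueError("need 0 < ram_size <= maxram_size < addr_limit")
    return [
        Region("ram", 0, ram_size),
        Region("memory-devices", ram_size, maxram_size),
        # The hole is sized to use up the rest of the addressable range.
        Region("device-hole", maxram_size, addr_limit),
    ]


def check_disjoint(regions: list[Region]) -> None:
    """Raise ValueError if any two regions share an address."""
    for i, a in enumerate(regions):
        for b in regions[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a.name} overlaps {b.name}")


# Hypothetical guest: 4 GiB boot RAM, 16 GiB maxram, 4 TiB address limit.
layout = guest_layout(4 * GiB, 16 * GiB, 4096 * GiB)
check_disjoint(layout)
```

Because the hole's bounds come out of the same calculation that sizes the
page tables, host-side placement and guest-side placement can both be
checked against one list.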

[We should probably take this level of detail off this list - it's
parsecs away from the detail of virtio]

Dave

> > 
> > Dave
> 
> 
> -- 
> 
> Thanks,
> 
> David / dhildenb
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
