Subject: Re: [virtio-comment] [PATCH 1/3] shared memory: Define shared memory regions
On Fri, 15 Feb 2019 15:14:25 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * David Hildenbrand (david@redhat.com) wrote:
> > On 15.02.19 15:02, Dr. David Alan Gilbert wrote:
> > > * David Hildenbrand (david@redhat.com) wrote:
> > >> On 15.02.19 14:50, Dr. David Alan Gilbert wrote:
> > >>> * Cornelia Huck (cohuck@redhat.com) wrote:
> > >>>> On Fri, 15 Feb 2019 13:33:06 +0100
> > >>>> David Hildenbrand <david@redhat.com> wrote:
> > >>>>
> > >>>>> On 15.02.19 13:28, Cornelia Huck wrote:
> > >>>>>> On Fri, 15 Feb 2019 12:26:00 +0100
> > >>>>>> David Hildenbrand <david@redhat.com> wrote:
> > >>>>>>
> > >>>>>>> Probing is always ugly. But I think we can add something like
> > >>>>>>> the x86 PCI hole between 3 and 4 GB after our initial boot memory.
> > >>>>>>> So there, we would have a memory region just like e.g. x86 has.
> > >>>>>>
> > >>>>>> A special region is probably the best way out of this pickle. We would
> > >>>>>> only need the discovery ccw for virtio, then.
> > >>>>>>
> > >>>>>>>
> > >>>>>>> This should even work with other mechanisms I am working on. E.g.
> > >>>>>>> for memory devices, we will add yet another memory region above
> > >>>>>>> the special PCI region.
> > >>>>>>>
> > >>>>>>> The layout of the guest would then be something like
> > >>>>>>>
> > >>>>>>> [0x000000000000000]
> > >>>>>>> ... Memory region containing RAM
> > >>>>>>> [ram_size         ]
> > >>>>>>> ... Memory region for e.g. special PCI devices
> > >>>>>>> [ram_size +1 GB   ]
> > >>>>>>> ... Memory region for memory devices (virtio-pmem, virtio-mem ...)
> > >>>>>>> [maxram_size - ram_size + 1GB]
> > >>>>>>>
> > >>>>>>> We would have to create proper page tables for guest backing that take
> > >>>>>>> care of the new guest size (not just ram_size). Also, to the guest we
> > >>>>>>> would indicate "maximum ram size == ram_size" so it does not try to
> > >>>>>>> probe the "special" memory.
> > >>>>>>
> > >>>>>> Hm... so that would be:
> > >>>>>> - 0..ram_size: just like it is handled now
> > >>>>>> - ram_size..ram_size + 1GB: guest does not treat it as ram, but does
> > >>>>>>   build page tables for it
> > >>>>>> - ram_size + 1GB..maxram_size: for whatever memory devices do with it
> > >>>>>>
> > >>>>>> How does the guest probe this? (SCLP?) Or does the guest simply know
> > >>>>>> via some kind of probable feature that there's a 1GB region there?
> > >>>>>
> > >>>>> As the guest only "knows" ram, there is a "maximum ram size" specified
> > >>>>> via SCLP. An unmodified guest will not probe beyond that.
> > >>>>
> > >>>> Nod.
> > >>>>
> > >>>>> The parts of the 1GB used by a device should be communicated via the
> > >>>>> paravirtualized device I guess. PCI bars don't really fit I assume, so
> > >>>>> we might need some virtio-ccw thingy (you're the expert :)) on top. That
> > >>>>> is one part to be clarified.
> > >>>>>
> > >>>>> I guess the guest does not need to know about the whole 1GB, only per
> > >>>>> device about the used part. We can then build page tables in the guest
> > >>>>> for that part when plugging.
> > >>>>
> > >>>> Hm. With my proposal, the guest would get a list of region addresses
> > >>>> from the device via a new ccw. It could then proceed to set up page
> > >>>> tables for it and start to use it. As long as it is aware that the
> > >>>> addresses it will get are beyond max_ram, that should be fine, I think.
> > >>>
> > >>> Which is the same as my virtio-mmio proposal; the host gets to put it
> > >>> wherever it sees fit (outside ram) and you've just got a way of
> > >>> telling the guest where it lives.
> > >>>
> > >>> Davidh's 1GB window is pretty much how older PCs worked I think;
> > >>> the problem is that 1GB is never enough and you still need a way
> > >>> to enumerate what devices are where, so it doesn't help you.
> > >>> (Our current virtio-fs dax mappings we're using are a few GB).
> > >>>
> > >>
> > >> How does that work on x86? You cannot suddenly move stuff into the
> > >> memory device memory region and potentially mess with DIMMs to be
> > >> plugged later. QEMU wise, this sounds wrong.
> > >
> > > Because it's PCI based, it becomes the guest's problem - the guest
> > > sets the PCI BARs which set the GPA of the PCI devices; I assume
> > > there's some protection that happens if it gets mapped over RAM (?!)
> > >
> > > I think that varies by firmware as well, with EFI mapping
> > > them differently from our bios.
> > > I think the guest knows the total number of DIMM slots and max-ram
> > > limit, so knows where not-to-map.
> >
> > On s390x, we have to define the size of the host->guest page table when
> > starting the guest. So we need some upper limit.
>
> That's OK; x86 also has that because they have a limited physical
> and virtual address size [which may or may not be correctly passed to
> the guest!].
>
> > Mapping anywhere, I
> > really don't like. Letting the guest define the mapping, I really don't
> > like.
>
> Well it's OK to have a hole for it, but letting the guest choose where
> those mappings go in the hole is the norm for PCI (there are
> exceptions).
>
> > We can of course switch the order of mappings
> >
> > [0x000000000000000      ]
> > ... Memory region containing RAM
> > [ram_size               ]
> > ... Memory region for memory devices (virtio-pmem, virtio-mem ...)
> > [maxram_size - ram_size ]
> > ... Memory region for e.g. special PCI/CCW devices
> > [                    TBD]
> >
> > We can size TBD in a way that we e.g. max out the current page table
> > size before having to switch to more levels.
>
> Yes, that's fine to set some upper limit; you've just got to make sure
> that the hypervisor knows where it can put stuff and if the guest
> does PCI that it knows where it's allowed to put stuff and as long
> as the two don't overlap everyone is happy.
>
> [We should probably take this level of detail off this list - it's
> parsecs away from the detail of virtio]

If you do take the detailed discussion off this list, please keep me in
the loop.

Regards,
Halil
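
For readers following the quoted thread, the reordered layout (RAM, then
memory devices, then a window for special PCI/CCW devices) and the "as long
as the two don't overlap" condition can be sketched roughly as below. This
is only an illustration, not QEMU code: the names guest_layout, overlaps
and fits are hypothetical, the region boundaries are one reading of the
layout quoted above, and addr_limit stands in for the "TBD" upper bound
that the thread leaves open.

```python
from typing import List, NamedTuple

GiB = 1 << 30


class Region(NamedTuple):
    name: str
    start: int  # inclusive guest-physical address
    end: int    # exclusive guest-physical address


def guest_layout(ram_size: int, maxram_size: int, addr_limit: int) -> List[Region]:
    """Build the three consecutive regions of the reordered layout.

    Assumption: addr_limit is whatever upper bound the hypervisor picks
    (the "TBD" in the thread), e.g. the span one page-table size covers.
    """
    if not 0 < ram_size <= maxram_size <= addr_limit:
        raise ValueError("expected 0 < ram_size <= maxram_size <= addr_limit")
    return [
        Region("ram", 0, ram_size),
        Region("memory-devices", ram_size, maxram_size),
        Region("special-devices", maxram_size, addr_limit),
    ]


def overlaps(a: Region, b: Region) -> bool:
    """True if the two half-open address ranges share at least one byte."""
    return a.start < b.end and b.start < a.end


def fits(mapping: Region, window: Region, others: List[Region]) -> bool:
    """True if mapping lies inside window and touches none of the others."""
    inside = window.start <= mapping.start and mapping.end <= window.end
    return inside and not any(overlaps(mapping, o) for o in others)
```

With a layout built this way, a host-chosen mapping such as a few-GB
virtio-fs DAX window would be accepted by fits() only if it lies entirely
in the special-devices window; whether the host or the guest picks the
address inside that window is exactly the point still under discussion
above.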