virtio-comment message

Subject: Re: [virtio-comment] [PATCH 1/3] shared memory: Define shared memory regions


On 15.02.19 15:02, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (david@redhat.com) wrote:
>> On 15.02.19 14:50, Dr. David Alan Gilbert wrote:
>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>> On Fri, 15 Feb 2019 13:33:06 +0100
>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>>> On 15.02.19 13:28, Cornelia Huck wrote:
>>>>>> On Fri, 15 Feb 2019 12:26:00 +0100
>>>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>>>   
>>>>>>> Probing is always ugly. But I think we can add something like
>>>>>>>  the x86 PCI hole between 3 and 4 GB after our initial boot memory.
>>>>>>> So there, we would have a memory region just like e.g. x86 has.  
>>>>>>
>>>>>> A special region is probably the best way out of this pickle. We would
>>>>>> only need the discovery ccw for virtio, then.
>>>>>>   
>>>>>>>
>>>>>>> This should even work with other mechanisms I am working on. E.g.
>>>>>>> for memory devices, we will add yet another memory region above
>>>>>>> the special PCI region.
>>>>>>>
>>>>>>> The layout of the guest would then be something like
>>>>>>>
>>>>>>> [0x000000000000000]
>>>>>>> ... Memory region containing RAM
>>>>>>> [ram_size         ]
>>>>>>> ... Memory region for e.g. special PCI devices
>>>>>>> [ram_size +1 GB   ]
>>>>>>> ... Memory region for memory devices (virtio-pmem, virtio-mem ...)
>>>>>>> [maxram_size - ram_size + 1GB]
>>>>>>>
>>>>>>> We would have to create proper page tables for guest backing that take
>>>>>>> care of the new guest size (not just ram_size). Also, to the guest we
>>>>>>> would indicate "maximum ram size == ram_size" so it does not try to
>>>>>>> probe the "special" memory.  
>>>>>>
>>>>>> Hm... so that would be:
>>>>>> - 0..ram_size: just like it is handled now
>>>>>> - ram_size..ram_size + 1GB: guest does not treat it as ram, but does
>>>>>>   build page tables for it
>>>>>> - ram_size + 1GB..maxram_size: for whatever memory devices do with it
>>>>>>
>>>>>> How does the guest probe this? (SCLP?) Or does the guest simply know
>>>>>> via some kind of probable feature that there's a 1GB region there?  
>>>>>
>>>>> As the guest only "knows" ram, there is a "maximum ram size" specified
>>>>> via SCLP. An unmodified guest will not probe beyond that.
>>>>
>>>> Nod.
>>>>
>>>>> The parts of the 1GB used by a device should be communicated via the
>>>>> paravirtualized device I guess. PCI bars don't really fit I assume, so
>>>>> we might need some virtio-ccw thingy (you're the expert :)) on top. That
>>>>> is one part to be clarified.
>>>>>
>>>>> I guess the guest does not need to know about the whole 1GB, only per
>>>>> device about the used part. We can then build page tables in the guest
>>>>> for that part when plugging.
>>>>
>>>> Hm. With my proposal, the guest would get a list of region addresses
>>>> from the device via a new ccw. It could then proceed to set up page
>>>> tables for it and start to use it. As long as it is aware that the
>>>> addresses it will get are beyond max_ram, that should be fine, I think.
>>>
>>> Which is the same as my virtio-mmio proposal; the host gets to put it
>>> wherever it sees fit (outside ram) and you've just got a way of
>>> telling the guest where it lives.
>>>
>>> Davidh's 1GB window is pretty much how older PCs worked I think;
>>> the problem is that 1GB is never enough and you still need a way
>>> to enumerate what devices are where, so it doesn't help you.
>>> (Our current virtio-fs dax mappings we're using are a few GB).
>>>
>>
>> How does that work on x86? You cannot suddenly move stuff into the
>> memory device memory region and potentially mess with DIMMs to be
>> plugged later. QEMU wise, this sounds wrong.
> 
> Because it's PCI based, it becomes the guest's problem - the guest
> sets the PCI BARs which set the GPA of the PCI devices;  I assume
> there's some protection that happens if it gets mapped over RAM (?!)
> 
> I think that varies by firmware as well, with EFI mapping
> them differently from our bios.
> I think the guest knows the total number of DIMM slots and max-ram
> limit, so knows where not-to-map.

On s390x, we have to define the size of the host->guest page table when
starting the guest. So we need some upper limit. Mapping anywhere, I
really don't like. Letting the guest define the mapping, I really don't
like.

We can of course switch the order of mappings

[0x000000000000000      ]
... Memory region containing RAM
[ram_size               ]
... Memory region for memory devices (virtio-pmem, virtio-mem ...)
[maxram_size - ram_size ]
... Memory region for e.g. special PCI/CCW devices
[                    TBD]

We can size TBD in a way that we e.g. max out the current page table
size before having to switch to more levels.
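
As a rough sketch of that sizing idea (not QEMU code): the helper below
lays out the three regions in the order shown above and extends the last
one up to the limit of the smallest top-level table type that still
covers maxram_size. The function name guest_layout, the dictionary keys
and the reading of the region boundaries (memory devices ending at
maxram_size) are assumptions made for illustration. The limits are the
s390x DAT address-space sizes per top-level table type as I understand
them (2 GB, 4 TB, 8 PB, 16 EB).

```python
GiB = 1 << 30

# Address space covered by each top-level table type (assumed s390x DAT values).
TABLE_LIMITS = [
    ("segment", 1 << 31),        # 2 GB
    ("region-third", 1 << 42),   # 4 TB
    ("region-second", 1 << 53),  # 8 PB
    ("region-first", 1 << 64),   # 16 EB
]


def guest_layout(ram_size, maxram_size):
    """Return hypothetical (start, end) ranges for the three regions.

    The special device region ("TBD" above) is sized so that it fills the
    address space of the smallest table type that covers maxram_size,
    i.e. it uses what is left before more table levels would be needed.
    """
    if ram_size <= 0 or maxram_size < ram_size:
        raise ValueError("need 0 < ram_size <= maxram_size")

    for table_type, limit in TABLE_LIMITS:
        # Require limit > maxram_size so the special region is never empty.
        if limit > maxram_size:
            break
    else:
        raise ValueError("maxram_size does not fit into any table type")

    return {
        "ram": (0, ram_size),
        "memory_devices": (ram_size, maxram_size),
        "special_devices": (maxram_size, limit),
        "table_type": table_type,
    }


if __name__ == "__main__":
    layout = guest_layout(4 * GiB, 16 * GiB)
    for name in ("ram", "memory_devices", "special_devices"):
        start, end = layout[name]
        print(f"{name:16} [{start:#018x}, {end:#018x})")
    print("top-level table:", layout["table_type"])
```

With ram_size = 4 GiB and maxram_size = 16 GiB this picks a region-third
table, so the special region would run from 16 GiB up to 4 TiB.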

> 
> Dave


-- 

Thanks,

David / dhildenb

