OASIS Mailing List Archives

virtio-dev message



Subject: Re: [virtio-dev] [PATCH v3 2/2] virtio-fs: add DAX window


On Thu, Jun 27, 2019 at 10:09:16AM -0400, Michael S. Tsirkin wrote:
> On Tue, Jun 25, 2019 at 10:55:15AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > On Mon, Jun 24, 2019 at 02:58:08PM +0100, Stefan Hajnoczi wrote:
> > > > On Tue, Jun 18, 2019 at 09:41:25PM -0400, Michael S. Tsirkin wrote:
> > > > > On Wed, Feb 20, 2019 at 12:46:13PM +0000, Stefan Hajnoczi wrote:
> > > > > > +
> > > > > > +\devicenormative{\paragraph}{Device Operation: DAX Window}{Device Types / File System Device / Device Operation / Device Operation: DAX Window}
> > > > > > +
> > > > > > +The device MUST allow mappings that completely or partially overlap existing mappings within the DAX window.
> > > > > 
> > > > > 
> > > > > Any alignment requirements?
> > > > 
> > > > Good point.  There are alignment requirements and the driver has no way
> > > > of knowing what they are.  I'll find a way to communicate them into the
> > > > guest, either via virtio or via FUSE.
> > > > 
> > > > > Also, with no limit on mappings, it looks like guest can use up lots of
> > > > > host VMAs quickly. Shouldn't there be a limit on # of mappings?
> > > > 
> > > > The VM can only deteriorate its own performance, right?
> > > 
> > > Only if QEMU is put in a container where virtual memory is
> > > limited.
> > > It's generally not a good idea where the only way for
> > > host to make progress is to allocate more memory
> > > without any limit.
> > > 
> > > If we are in a situation where we need to either kill
> > > the guest or hit swap, none of the choices is good.
> > 
> > There is a bound; it's cache region size / page size - so
> > that's ~1M mappings worst case (e.g. 4GB cache, 4kB page size)
> > That limit can be brought down if we impose a larger granularity
> > somewhere (and the reality is our kernel uses 2MB mapping chunks I
> > think).
> > 
> > > > We haven't seen catastrophic problems that bring the system to its
> > > > knees.
> > > 
> > > Because you are not running malicious guests?
> > 
> > Hmm, I didn't realise a process having an excessive number of mappings
> > could harm any other process.
> > 
> > Dave
> 
> Well it allocates resources on the host. If you don't
> contain qemu then even just allocating virtual memory
> can make host swap, right? If you contain it then
> qemu will get killed instead but then you need to tell
> guest what not to do so as not to get qemu killed.

I investigated a little.  Linux enforces a per-process maximum VMA
count, controlled by a sysctl.  mmap and any other operation that adds
or splits VMAs counts against it:

  vm.max_map_count = 65530

The value is tunable; the default is kept below 65536 for legacy
reasons: ELF coredumps used to support only ~65536 sections.
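
For reference, a minimal Python sketch of how one might compare a
process's current VMA count with this limit on a Linux host.  It assumes
the standard procfs files /proc/sys/vm/max_map_count and /proc/<pid>/maps
(one line per VMA); the helper names are illustrative only and are not
taken from QEMU or the kernel.

```python
from typing import Optional, Tuple

MAX_MAP_COUNT_PATH = "/proc/sys/vm/max_map_count"


def read_max_map_count(path: str = MAX_MAP_COUNT_PATH) -> Optional[int]:
    """Return the sysctl value, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def count_vmas(maps_text: str) -> int:
    """Each non-empty line of /proc/<pid>/maps describes one VMA."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def vma_usage(pid: str = "self") -> Optional[Tuple[int, int]]:
    """Return (VMAs in use, limit) for a process, or None without procfs."""
    limit = read_max_map_count()
    if limit is None:
        return None
    try:
        with open(f"/proc/{pid}/maps") as f:
            used = count_vmas(f.read())
    except OSError:
        return None
    return used, limit


if __name__ == "__main__":
    usage = vma_usage()
    if usage is None:
        print("procfs not available; nothing to report")
    else:
        print("VMAs in use: %d of %d" % usage)
```

Passing the PID of a running QEMU process instead of "self" would show
how much headroom is left before mmap starts failing.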

The QEMU process needs its own VMAs for shared libraries and other
purposes, so each virtio-fs device should expose a significantly lower
DAX Window mapping limit to the driver.  Let's add a configuration space
field as Michael has suggested.
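
To put rough numbers on this, the sketch below compares the worst-case
mapping count Dave gave earlier in the thread (cache region size / page
size) with the default vm.max_map_count.  The headroom reserved for
QEMU's own VMAs and the even split across devices are assumptions made
up for illustration, not values from any implementation.

```python
DEFAULT_MAX_MAP_COUNT = 65530  # Linux default quoted above

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def worst_case_mappings(window_size: int, granularity: int) -> int:
    """Upper bound on distinct mappings: one per granule of the DAX window."""
    return window_size // granularity


def per_device_limit(max_map_count: int, reserved_for_qemu: int,
                     num_devices: int) -> int:
    """Hypothetical policy: split what is left after QEMU's own VMAs evenly."""
    if num_devices < 1:
        raise ValueError("need at least one device")
    available = max(max_map_count - reserved_for_qemu, 0)
    return available // num_devices


if __name__ == "__main__":
    # 4 GiB window with 4 KiB pages: far more mappings than the host allows.
    print(worst_case_mappings(4 * GIB, 4 * KIB))   # 1048576
    # Same window with 2 MiB chunks fits comfortably.
    print(worst_case_mappings(4 * GIB, 2 * MIB))   # 2048
    # Assumed headroom of 16384 VMAs for QEMU, two virtio-fs devices.
    print(per_device_limit(DEFAULT_MAX_MAP_COUNT, 16384, 2))  # 24573
```

With 4 kB granularity the worst case exceeds the default sysctl value
by a factor of about 16, which is why the limit has to be communicated
to the driver rather than left implicit.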

Regarding denial of service, the DAX Window size determines the overall
amount of host page cache that is accessible by the driver.  Together
with an enforced maximum map count we can allow the administrator to
configure devices so they only provide access to a fraction of the host
page cache, mitigating denial of service issues.
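
As a sketch of what "enforced" could mean on the device side: the device
tracks live mappings and refuses a new one once the advertised limit is
reached.  The class and method names are hypothetical, and the model is
deliberately simplified: it counts mappings by window offset and does
not model how overlapping mappings split host VMAs.

```python
class MappingLimitExceeded(Exception):
    """Raised when a new mapping would exceed the advertised limit."""


class DaxWindowTracker:
    """Hypothetical device-side bookkeeping for DAX window mappings."""

    def __init__(self, window_size: int, max_mappings: int) -> None:
        self.window_size = window_size
        self.max_mappings = max_mappings
        self._mappings = {}  # window offset -> length

    def map(self, offset: int, length: int) -> None:
        if length <= 0 or offset < 0 or offset + length > self.window_size:
            raise ValueError("mapping falls outside the DAX window")
        # Replacing a mapping at the same offset does not use a new slot.
        is_new = offset not in self._mappings
        if is_new and len(self._mappings) >= self.max_mappings:
            raise MappingLimitExceeded(
                "limit of %d mappings reached" % self.max_mappings)
        self._mappings[offset] = length

    def unmap(self, offset: int) -> None:
        # Unmapping an unknown offset is treated as a no-op in this sketch.
        self._mappings.pop(offset, None)

    @property
    def count(self) -> int:
        return len(self._mappings)


if __name__ == "__main__":
    tracker = DaxWindowTracker(window_size=8 * 4096, max_mappings=2)
    tracker.map(0, 4096)
    tracker.map(4096, 4096)
    try:
        tracker.map(8192, 4096)
    except MappingLimitExceeded as exc:
        print("rejected:", exc)
```

A real device would also need to decide how the failure is reported to
the driver; that is left open here.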

Stefan



