Subject: Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device


On Wed, 19 Oct 2022 16:03:42 +0800, Gerry <gerry@linux.alibaba.com> wrote:
>
>
> > > On Oct 19, 2022, at 16:01, Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Oct 19, 2022 at 3:00 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >>
> >> On Tue, 18 Oct 2022 14:54:22 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >>> On Mon, Oct 17, 2022 at 8:31 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >>>>
> >>>> On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >>>>> Adding Stefan.
> >>>>>
> >>>>>
> >>>>> On Mon, Oct 17, 2022 at 3:47 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >>>>>>
> >>>>>> Hello everyone,
> >>>>>>
> >>>>>> # Background
> >>>>>>
> >>>>>> There is a common need to accelerate communication between different
> >>>>>> VMs and containers, including lightweight virtual-machine-based
> >>>>>> containers, and one way to achieve this is to colocate them on the same
> >>>>>> host. However, inter-VM communication through the network stack is not
> >>>>>> optimal in performance and also wastes extra CPU cycles. This scenario
> >>>>>> has been discussed many times, but no generic solution is available yet [1] [2] [3].
> >>>>>>
> >>>>>> With a PoC [5] based on pci-ivshmem + SMC (Shared Memory Communications [4]),
> >>>>>> we found that by changing the communication channel between VMs from TCP
> >>>>>> to SMC over shared memory, we can achieve superior performance for a
> >>>>>> common socket-based application [5]:
> >>>>>>  - latency reduced by about 50%
> >>>>>>  - throughput increased by about 300%
> >>>>>>  - CPU consumption reduced by about 50%
> >>>>>>
> >>>>>> Since no existing shared memory management solution matches the needs of
> >>>>>> SMC (see ## Comparison with existing technology below), and virtio is the
> >>>>>> standard for communication in the virtualization world, we want to
> >>>>>> implement a virtio-ism device based on virtio, which can support on-demand
> >>>>>> memory sharing across VMs, containers, or between a VM and a container.
> >>>>>> To match the needs of SMC, the virtio-ism device needs to support:
> >>>>>>
> >>>>>> 1. Dynamic provision: shared memory regions are dynamically allocated and
> >>>>>>   provisioned.
> >>>>>> 2. Multi-region management: the shared memory is divided into regions,
> >>>>>>   and a peer may allocate one or more regions from the same shared memory
> >>>>>>   device.
> >>>>>> 3. Permission control: the permissions of each region can be set
> >>>>>>   separately (see the illustrative API sketch below).
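> >>>>>>
> >>>>>> To make these requirements concrete, here is an illustrative sketch of a
> >>>>>> driver-facing API (ism_alloc_region()/ism_attach_region() appear in the
> >>>>>> usage below; ism_set_perm() and the exact signatures are assumptions for
> >>>>>> illustration, not the actual POC interface):
> >>>>>>
> >>>>>>   /* 1. Dynamic provision: allocate a region at runtime; *token
> >>>>>>    *    identifies it so that peers can attach later. */
> >>>>>>   void *ism_alloc_region(struct ism_dev *dev, size_t size, u64 *token);
> >>>>>>
> >>>>>>   /* 2. Multi-region management: a peer may attach any number of
> >>>>>>    *    regions from the same device, each by its own token. */
> >>>>>>   void *ism_attach_region(struct ism_dev *dev, u64 token);
> >>>>>>
> >>>>>>   /* 3. Permission control: set access rights per region,
> >>>>>>    *    e.g. read-only vs. read-write. */
> >>>>>>   int ism_set_perm(struct ism_dev *dev, u64 token, u32 perm);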
> >>>>>
> >>>>> Looks like virtio-ROCE
> >>>>>
> >>>>> https://lore.kernel.org/all/20220511095900.343-1-xieyongji@bytedance.com/T/
> >>>>>
> >>>>> and virtio-vhost-user can satisfy the requirement?
> >>>>>
> >>>>>>
> >>>>>> # Virtio ism device
> >>>>>>
> >>>>>> ISM devices provide the ability to share memory between different guests
> >>>>>> on a host. Memory that a guest obtains from an ism device can be shared
> >>>>>> with multiple peers at the same time, and this sharing relationship can
> >>>>>> be dynamically created and released.
> >>>>>>
> >>>>>> The shared memory obtained from the device is divided into multiple ism
> >>>>>> regions for sharing. The ISM device provides a mechanism to notify other
> >>>>>> referrers of an ism region of content update events.
> >>>>>>
> >>>>>> # Usage (SMC as example)
> >>>>>>
> >>>>>> Here is one possible use case:
> >>>>>>
> >>>>>> 1. SMC calls the ism driver interface ism_alloc_region() to get the
> >>>>>>   location of a memory region in the PCI space and a token.
> >>>>>> 2. The ism driver mmaps the memory region and returns it to SMC together
> >>>>>>   with the token.
> >>>>>> 3. SMC passes the token to the connected peer.
> >>>>>> 4. The peer calls the ism driver interface ism_attach_region(token) to
> >>>>>>   get the location of the shared memory in the PCI space (a minimal code
> >>>>>>   sketch of this flow follows).
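> >>>>>>
> >>>>>> A minimal sketch of this flow in driver-client code (error handling
> >>>>>> omitted; apart from ism_alloc_region()/ism_attach_region(), all names
> >>>>>> here are assumed for illustration, not the actual POC API):
> >>>>>>
> >>>>>>   /* Side A: allocate a region and obtain its token. */
> >>>>>>   u64 token;
> >>>>>>   void *buf = ism_alloc_region(ism_dev, REGION_SIZE, &token);
> >>>>>>
> >>>>>>   /* The token travels over the already-established connection. */
> >>>>>>   send_token_to_peer(conn, token);
> >>>>>>
> >>>>>>   /* Side B: attach the same region by its token. Both sides now
> >>>>>>    * address the same host memory; the device's notification
> >>>>>>    * mechanism signals content updates to the other referrer. */
> >>>>>>   void *peer_buf = ism_attach_region(ism_dev, token);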
> >>>>>>
> >>>>>>
> >>>>>> # About hot plugging of the ism device
> >>>>>>
> >>>>>>   Hot plugging of devices is a heavyweight, failure-prone, time-consuming,
> >>>>>>   and poorly scalable operation, so we don't plan to support it for now.
> >>>>>>
> >>>>>> # Comparison with existing technology
> >>>>>>
> >>>>>> ## ivshmem or ivshmem 2.0 of Qemu
> >>>>>>
> >>>>>>   1. ivshmem 1.0 exposes one large piece of memory that can be seen by
> >>>>>>   all VMs that use this device, so its security is insufficient.
> >>>>>>
> >>>>>>   2. ivshmem 2.0 provides shared memory belonging to one VM that is
> >>>>>>   read-only for all other VMs using the same ivshmem 2.0 device, which
> >>>>>>   also does not meet our needs in terms of security.
> >>>>>>
> >>>>>> ## vhost-pci and virtiovhostuser
> >>>>>>
> >>>>>>   These do not support dynamic allocation and are therefore not suitable for SMC.
> >>>>>
> >>>>> I think this is an implementation issue; if we support the VHOST IOTLB
> >>>>> message, then regions could be added/removed on demand.
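> >>>>>
> >>>>> For reference, a minimal sketch of the IOTLB message involved (abridged
> >>>>> from the kernel's include/uapi/linux/vhost_types.h); each message
> >>>>> describes a single mapping, so per-region add/remove would be expressed
> >>>>> as UPDATE/INVALIDATE messages rather than a static memory table:
> >>>>>
> >>>>>   struct vhost_iotlb_msg {
> >>>>>           __u64 iova;   /* guest address of the region */
> >>>>>           __u64 size;
> >>>>>           __u64 uaddr;  /* backing address in the backend process */
> >>>>>           __u8  perm;   /* VHOST_ACCESS_RO / _WO / _RW */
> >>>>>           __u8  type;   /* VHOST_IOTLB_UPDATE, VHOST_IOTLB_INVALIDATE, ... */
> >>>>>   };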
> >>>>
> >>>>
> >>>> 1. After an attacker connects to a victim under virtiovhostuser, the
> >>>>   memory stays occupied as long as the attacker does not drop its
> >>>>   reference. With ism devices, the victim can directly release its own
> >>>>   reference, and the maliciously referenced region then only occupies
> >>>>   the attacker's resources.
> >>>
> >>> Let's define the security boundary here, e.g. do we trust the device or
> >>> not? If yes, in the case of virtiovhostuser, can we simply do
> >>> VHOST_IOTLB_UNMAP so that we can safely release the memory taken by the
> >>> attacker?
> >>>
> >>>>
> >>>> 2. The ism device of a VM can be shared with multiple (1000+) VMs at the same
> >>>>   time, which is a challenge for virtiovhostuser
> >>>
> >>> Please elaborate more on the challenges; does anything make
> >>> virtiovhostuser different?
> >>
> >> As I understand it (please point out any mistakes), one vvu device
> >> corresponds to one vm. If we share memory with 1000 vms, do we need 1000
> >> vvu devices?
> >
> > There could be some misunderstanding here. With 1000 VMs, you still
> > need 1000 virtio-ism devices, I think.
> We are trying to achieve one virtio-ism device per vm instead of one virtio-ism device per SMC connection.


Already done.


>
> >
> >>
> >>
> >>>
> >>>>
> >>>> 3. Sharing relationships in ism are added dynamically, whereas
> >>>>   virtiovhostuser determines the sharing relationship at startup.
> >>>
> >>> Not necessarily with IOTLB API?
> >>
> >> Unlike virtio-vhost-user, which shares the memory of a vm with another vm, we
> >> provide the same memory on the host to two vms. So the implementation of this
> >> part will be much simpler. This is why we gave up virtio-vhost-user at the
> >> beginning.
> >
> > Ok, just to make sure we're on the same page: at the spec level,
> > virtio-vhost-user doesn't (and can't) limit the backend to being
> > implemented in another VM. So it should be ok to use it for sharing
> > memory between a guest and the host.
> >
> > Thanks
> >
> >>
> >> Thanks.
> >>
> >>
> >>>
> >>>>
> >>>> 4. Regarding security, the device under virtiovhostuser may mmap more
> >>>>   memory than needed, while ism maps only a single region to other devices.
> >>>
> >>> With VHOST_IOTLB_MAP, the map could be done per region.
> >>>
> >>> Thanks
> >>>
> >>>>
> >>>> Thanks.
> >>>>
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>>>
> >>>>>> # Design
> >>>>>>
> >>>>>>   This is a structural diagram of ism sharing between two VMs.
> >>>>>>
> >>>>>>    |-------------------------------------------------------------------------------------------------------------|
> >>>>>>    | |------------------------------------------------|       |------------------------------------------------| |
> >>>>>>    | | Guest                                          |       | Guest                                          | |
> >>>>>>    | |                                                |       |                                                | |
> >>>>>>    | |   ----------------                             |       |   ----------------                             | |
> >>>>>>    | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
> >>>>>>    | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
> >>>>>>    | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
> >>>>>>    | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
> >>>>>>    | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >>>>>>    | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
> >>>>>>    | |    |  |                -------------------     |       |    |  |                --------------------    | |
> >>>>>>    | |                                |               |       |                               |                | |
> >>>>>>    | |                                |               |       |                               |                | |
> >>>>>>    | | Qemu                           |               |       | Qemu                          |                | |
> >>>>>>    | |--------------------------------+---------------|       |-------------------------------+----------------| |
> >>>>>>    |                                  |                                                       |                  |
> >>>>>>    |                                  |                                                       |                  |
> >>>>>>    |                                  |------------------------------+------------------------|                  |
> >>>>>>    |                                                                 |                                           |
> >>>>>>    |                                                                 |                                           |
> >>>>>>    |                                                   --------------------------                                |
> >>>>>>    |                                                    | M1 |   | M2 |   | M3 |                                 |
> >>>>>>    |                                                   --------------------------                                |
> >>>>>>    |                                                                                                             |
> >>>>>>    | HOST                                                                                                        |
> >>>>>>    ---------------------------------------------------------------------------------------------------------------
> >>>>>>
> >>>>>> # POC code
> >>>>>>
> >>>>>>   Kernel: https://github.com/fengidri/linux-kernel-virtio-ism/commits/ism
> >>>>>>   Qemu:   https://github.com/fengidri/qemu/commits/ism
> >>>>>>
> >>>>>> If there are any problems, please point them out.
> >>>>>>
> >>>>>> Hope to hear from you, thank you.
> >>>>>>
> >>>>>> [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
> >>>>>> [2] https://dl.acm.org/doi/10.1145/2847562
> >>>>>> [3] https://hal.archives-ouvertes.fr/hal-00368622/document
> >>>>>> [4] https://lwn.net/Articles/711071/
> >>>>>> [5] https://lore.kernel.org/netdev/20220720170048.20806-1-tonylu@linux.alibaba.com/T/
> >>>>>>
> >>>>>>
> >>>>>> Xuan Zhuo (2):
> >>>>>>  Reserve device id for ISM device
> >>>>>>  virtio-ism: introduce new device virtio-ism
> >>>>>>
> >>>>>> content.tex    |   3 +
> >>>>>> virtio-ism.tex | 340 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> 2 files changed, 343 insertions(+)
> >>>>>> create mode 100644 virtio-ism.tex
> >>>>>>
> >>>>>> --
> >>>>>> 2.32.0.3.g01195cf9f
> >>>>>>
> >>>>>>

