virtio-comment message


Subject: [PATCH v4 0/1] introduce virtio-ism: internal shared memory device


Hello everyone,

# Background

    Accelerating communication between VMs and containers, including
    lightweight virtual machine based containers, is now a common requirement.
    One way to achieve this is to colocate them on the same host. However,
    inter-VM communication through the network stack is not optimal and
    wastes CPU cycles. This scenario has been discussed many times, but there
    is still no generic solution available [1] [2] [3].

    We also have many such scenarios internally. Besides general network
    communication, there are many application scenarios for shared memory.
    For various reasons, it is difficult to carry this business data over
    network communication. For example, in some scenarios the application
    needs to exchange a large amount of data with a physical device on the
    host, so shared memory is the most suitable solution.

    Shared memory is an efficient communication method, so we hope to
    implement a cross-VM shared memory method. Inspired by the IBM ism
    device [4], we use virtio-ism to share memory between VMs on the same
    host.

# virtio-ism

    An ISM (Internal Shared Memory) device provides the ability to access
    memory shared between multiple devices. This allows low-overhead
    communication in the presence of such memory. For example, memory can be
    shared between the guests of multiple virtual machines running on the
    same host, with each virtual machine including an ism device and each
    guest obtaining the shared memory through its ism device.

    An ism device can communicate with multiple peers simultaneously. This
    communication can be dynamically started and ended.

    The following diagram shows ism-based memory sharing between two VMs.

    |-------------------------------------------------------------------------------------------------------------|
    | |------------------------------------------------|       |------------------------------------------------| |
    | | Guest                                          |       | Guest                                          | |
    | |                                                |       |                                                | |
    | |   ----------------                             |       |   ----------------                             | |
    | |   |    driver    |     [M1]   [M2]   [M3]      |       |   |    driver    |             [M2]   [M3]     | |
    | |   ----------------       |      |      |       |       |   ----------------               |      |      | |
    | |    |cq|                  |map   |map   |map    |       |    |cq|                          |map   |map   | |
    | |    |  |                  |      |      |       |       |    |  |                          |      |      | |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |----|--|----------------|  device memory  |-----|       |----|--|----------------|  device memory   |----| |
    | |    |  |                -------------------     |       |    |  |                --------------------    | |
    | |                                |               |       |                               |                | |
    | |                                |               |       |                               |                | |
    | | Qemu                           |               |       | Qemu                          |                | |
    | |--------------------------------+---------------|       |-------------------------------+----------------| |
    |                                  |                                                       |                  |
    |                                  |                                                       |                  |
    |                                  |------------------------------+------------------------|                  |
    |                                                                 |                                           |
    |                                                                 |                                           |
    |                                                   --------------------------                                |
    |                                                    | M1 |   | M2 |   | M3 |                                 |
    |                                                   --------------------------                                |
    |                                                                                                             |
    | HOST                                                                                                        |
    ---------------------------------------------------------------------------------------------------------------

    On top of this, we found that replacing an existing TCP network
    communication path with SMC + shared memory also gives a large
    performance improvement, and user processes need only minor modification
    to use SMC:
      - latency reduced by about 50%
      - throughput increased by about 300%
      - CPU consumption reduced by about 50%

    Since no existing shared memory management solution matches the needs of
    SMC (see "Comparison with existing technology"), and virtio is the
    standard for communication in the virtualization world, we want to
    implement a virtio-ism device based on virtio, which can support
    on-demand memory sharing across VMs, containers or VM-container. To match
    the needs of SMC, the virtio-ism device needs to support:

    1. Dynamic provision: shared memory regions are dynamically allocated and
       provisioned.
    2. Multi-region management: the shared memory is divided into regions,
       and a peer may allocate one or more regions from the same shared memory
       device.
    3. Permission control: the permission of each region can be set separately.
    4. Dynamic connection: each ism region of a device can be shared with
       different devices, so that a single device may eventually share
       regions with thousands of other devices.
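
    The four requirements above can be illustrated with a small in-process
    model. This is only a sketch under stated assumptions, not the device or
    the spec: the class name IsmHost, the permission bits, the token format
    and the "attach grants read-only by default" policy are all hypothetical
    and chosen for illustration.

```python
import secrets
from dataclasses import dataclass, field

# Hypothetical permission bits; the real encoding is defined by the spec.
PERM_READ = 0x1
PERM_WRITE = 0x2


@dataclass
class Region:
    size: int
    data: bytearray
    # device name -> permission bits granted to that device
    perms: dict = field(default_factory=dict)


class IsmHost:
    """Toy stand-in for whatever backs the ism devices on one host."""

    def __init__(self):
        self.regions = {}  # token -> Region (multi-region management)

    def alloc(self, device, size):
        # Dynamic provision: a region exists only once a device asks for it.
        token = secrets.token_hex(8)
        region = Region(size, bytearray(size))
        region.perms[device] = PERM_READ | PERM_WRITE
        self.regions[token] = region
        return token

    def grant(self, token, device, perm):
        # Permission control: set separately per region and per peer.
        self.regions[token].perms[device] = perm

    def attach(self, device, token):
        # Dynamic connection: any device holding the token can attach later.
        # Assumption: a new peer starts read-only until granted more.
        region = self.regions[token]
        region.perms.setdefault(device, PERM_READ)
        return region

    def write(self, device, token, offset, payload):
        region = self.regions[token]
        if not (region.perms.get(device, 0) & PERM_WRITE):
            raise PermissionError(f"{device} may not write this region")
        region.data[offset:offset + len(payload)] = payload

    def detach(self, device, token):
        region = self.regions[token]
        region.perms.pop(device, None)
        if not region.perms:  # last reference dropped: release the region
            del self.regions[token]
```

    In this model one device can hold several tokens at once, and each token
    can be attached by a different set of peers, which is the property SMC
    relies on.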

## Live Migration

    If two VMs are migrated from the same host to two different physical
    hosts, they can no longer share memory, so we do not consider supporting
    migration for the time being.

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

   1. ivshmem 1.0 is one large piece of memory that can be seen by all VMs
      that use the device, so it is not secure enough.

   2. ivshmem 2.0 is a shared memory belonging to one VM that is readable
      (read-only) by all other VMs that use the ivshmem 2.0 shared memory
      device, which also does not meet our security needs.

## vhost-pci and virtiovhostuser

    1. Does not support dynamic allocation.
    2. One device supports connecting to only one VM.

# Usage
    These are the steps a user process follows.

    |                                                | user process syscall                     | driver to device
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  1 | got memory and token                           | ioctl(fd, VIRTIO_ISM_IOCTL_ALLOC, &ctl)  | VIRTIO_ISM_CTRL_ALLOC_REGION
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  2 | send token to peer process                     |                                          |
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  3 | got shared memory(two process share the memory)| ioctl(fd, VIRTIO_ISM_IOCTL_ATTACH, &ctl) | VIRTIO_ISM_CTRL_ATTACH_REGION
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  4 | notify peer process                            | ioctl(fd, VIRTIO_ISM_IOCTL_KICK)         | write notify area
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  5 | receive notify from other process              | wakeup by select/epoll/....              | driver recv interrupt
 ---|------------------------------------------------|------------------------------------------|-------------------------------
  6 | release the reference to the shared memory     | ioctl(fd, VIRTIO_ISM_IOCTL_DETACH, &ctl) | VIRTIO_ISM_CTRL_DETACH_REGION
 ---|------------------------------------------------|------------------------------------------|-------------------------------
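
    The six steps above can be walked through without an ism device by using
    ordinary POSIX shared memory as a stand-in. This is a sketch under stated
    assumptions: the segment name plays the role of the token, a pipe plays
    the role of the notify area and interrupt, and run_flow is a hypothetical
    helper. None of the VIRTIO_ISM_IOCTL_* calls are issued here, since their
    argument layout is defined by the driver, not by this letter.

```python
import os
import select
from multiprocessing import shared_memory


def run_flow(message=b"AAAA", size=4096):
    # Step 1: "alloc" a region; the segment name stands in for the token.
    owner = shared_memory.SharedMemory(create=True, size=size)
    token = owner.name
    # Stand-in for the notify area (write end) and interrupt (read end).
    notify_r, notify_w = os.pipe()
    peer = None
    try:
        # Step 2: the token would be sent to the peer out of band, e.g. TCP.
        # Step 3: the peer "attaches" with the token and maps the same bytes.
        peer = shared_memory.SharedMemory(name=token)
        peer.buf[:len(message)] = message
        # Step 4: the peer kicks the owner.
        os.write(notify_w, b"\x01")
        # Step 5: the owner is woken up through select() and reads the data.
        ready, _, _ = select.select([notify_r], [], [], 1.0)
        if not ready:
            raise TimeoutError("no notification received")
        os.read(notify_r, 1)
        return bytes(owner.buf[:len(message)])
    finally:
        # Step 6: both sides release their reference to the region.
        if peer is not None:
            peer.close()
        os.close(notify_r)
        os.close(notify_w)
        owner.close()
        owner.unlink()
```

    Calling run_flow(b"AAAA") returns the bytes written by the peer side,
    which mirrors case1 of the POC tool where vm2 writes a message and vm1
    reads it after being notified.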

# POC CODE

    The POC does not yet implement the eventq and perm functions.
    It implements the v2 version of the spec, so some details do not match
    this version.

    Qemu   (virtio ism device): https://github.com/fengidri/qemu/compare/7d66b74c4dd0d74d12c1d3d6de366242b13ed76d...ism-upstream-1216?expand=1
    Kernel (virtio ism driver): https://github.com/fengidri/linux-kernel-virtio-ism/compare/6f8101eb21bab480537027e62c4b17021fb7ea5d...ism-upstream-1223

    Start qemu with option "--device virtio-ism-pci,disable-legacy=on,disable-modern=off".

## User Space APP

    The ism driver provides a /dev/vismX interface that allows users to use
    the virtio-ism device directly from user space.

    Try tools/virtio/virtio-ism/virtio-ism-mmap

    Usage:
         cd tools/virtio/virtio-ism/; make
         insmod virtio-ism.ko

    case1: communicate

       vm1: ./virtio-ism-mmap alloc -> token
       vm2: ./virtio-ism-mmap attach -t <token> --write-msg AAAA --commit

       vm2 writes the message to shared memory, then notifies vm1. After vm1
       receives the notification, it reads from shared memory.

    case2: ping-pong test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 pp

        1. server allocates one ism region
        2. client gets the token over tcp

        3. client commits (kicks) to server; server receives the notification
           and commits (kicks) back to client
        4. loop #3

    case3: throughput test.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp

        1. server allocates one ism region
        2. client gets the token over tcp

        3. client writes 1M of data to the ism region
        4. client commits (kicks) to server
        5. server receives the notification, copies the data, then commits
           (kicks) back to client
        6. loop #3-#5

    case4: throughput test with protocol defined by user.

        vm1: ./virtio-ism-mmap server
        vm2: ./virtio-ism-mmap -i 192.168.122.101 tp --polling --tp-chunks 15 --msg-size 64k -n 50000

        The ism region is used as a ring.

        In this case, both client and server run in polling mode. Tested on
        my machine, the throughput reaches a maximum of 12GBps.
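
        To make "the ism region is used as a ring" concrete, here is a
        minimal single-producer/single-consumer ring laid out inside one flat
        byte region. It is a sketch under stated assumptions: the header
        layout (two 64-bit indices), the per-chunk length prefix and the Ring
        class are hypothetical and are not the layout used by
        virtio-ism-mmap. A real cross-VM ring also needs memory barriers
        between writing the payload and publishing the index, which plain
        Python does not express.

```python
import struct

HDR = struct.Struct("<QQ")  # head (next slot to fill), tail (next slot to drain)
LEN = struct.Struct("<I")   # payload length stored in front of each chunk


class Ring:
    def __init__(self, region, chunks, chunk_size):
        self.region = region          # bytearray or writable memoryview
        self.chunks = chunks
        self.chunk_size = chunk_size
        self.slot = LEN.size + chunk_size
        if HDR.size + chunks * self.slot > len(region):
            raise ValueError("region too small for this ring geometry")

    def _offset(self, index):
        # Indices grow monotonically; the slot is the index modulo chunks.
        return HDR.size + (index % self.chunks) * self.slot

    def push(self, payload):
        """Producer side: returns False when the ring is full."""
        if len(payload) > self.chunk_size:
            raise ValueError("payload larger than one chunk")
        head, tail = HDR.unpack_from(self.region, 0)
        if head - tail == self.chunks:
            return False
        off = self._offset(head)
        LEN.pack_into(self.region, off, len(payload))
        start = off + LEN.size
        self.region[start:start + len(payload)] = payload
        # Publish only after the payload is in place.
        struct.pack_into("<Q", self.region, 0, head + 1)
        return True

    def pop(self):
        """Consumer side: returns None when the ring is empty."""
        head, tail = HDR.unpack_from(self.region, 0)
        if head == tail:
            return None
        off = self._offset(tail)
        (length,) = LEN.unpack_from(self.region, off)
        start = off + LEN.size
        payload = bytes(self.region[start:start + length])
        # Hand the slot back to the producer.
        struct.pack_into("<Q", self.region, 8, tail + 1)
        return payload
```

        In polling mode each side simply calls push() or pop() in a loop and
        retries on False or None, so no kick is needed per message. With a
        geometry like the one in case4, the region would need room for 15
        chunks of 64k plus the header and length prefixes.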

## About smc with virtio-ism

    My colleagues are currently working on this and have contacted IBM's
    developers. SMC may need some modification, which may involve some
    complicated issues, so please give them more time.

# References

    [1] https://projectacrn.github.io/latest/tutorials/enable_ivshmem.html
    [2] https://dl.acm.org/doi/10.1145/2847562
    [3] https://hal.archives-ouvertes.fr/hal-00368622/document
    [4] Information about IBM ism device and SMC:
            1. SMC reference: https://www.ibm.com/docs/en/zos/2.5.0?topic=system-shared-memory-communications
            2. SMC-Dv2 and ISMv2 introduction: https://www.newera.com/INFO/SMCv2_Introduction_10-15-2020.pdf
            3. ISM device: https://www.ibm.com/docs/en/linux-on-systems?topic=n-ism-device-driver-1
            4. SMC protocol (including SMC-D): https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202_2.pdf
            5. SMC-D FAQ: https://www.ibm.com/support/pages/system/files/inline-files/2021-02-09-SMC-D-FAQ.pdf


If there are any problems, please point them out.
Hope to hear from you, thank you.

v4:
   1. reorganize the structure of the spec
   2. fix some problems

v3:
   1. support allocating memory from the VM
   2. add query operation
   3. optimize the description of spec and enrich some details
   4. use the communication domain as a term
   5. replace gid with cdid

v2:
   1. add Attach/Detach event
   2. add Events Filter
   3. allow Alloc/Attach huge region
   4. remove host/guest terms

v1:
   1. cover letter adding explanation of ism vlan
   2. spec add gid
   3. explain the source of ideas about ism
   4. POC support virtio-ism-smc.ko virtio-ism-dev.ko and support virtio-ism-mmap



Xuan Zhuo (1):
  virtio-ism: introduce new device virtio-ism

 conformance.tex                         |   2 +
 content.tex                             |   1 +
 device-types/ism/description.tex        | 591 ++++++++++++++++++++++++
 device-types/ism/device-conformance.tex |  17 +
 device-types/ism/driver-conformance.tex |  13 +
 device-types/ism/layout-pic.tex         | 112 +++++
 virtio-html.tex                         |   9 +
 virtio.tex                              |   9 +
 8 files changed, 754 insertions(+)
 create mode 100644 device-types/ism/description.tex
 create mode 100644 device-types/ism/device-conformance.tex
 create mode 100644 device-types/ism/driver-conformance.tex
 create mode 100644 device-types/ism/layout-pic.tex

-- 
2.32.0.3.g01195cf9f


