Subject: Re: [virtio-dev] Re: Vhost-pci RFC2.0


On 2017-04-19 10:42, Wei Wang wrote:
> On 04/19/2017 03:35 PM, Jan Kiszka wrote:
>> On 2017-04-19 08:38, Wang, Wei W wrote:
>>> Hi,
>>>
>>> We made some design changes to the original vhost-pci design, and want
>>> to open a discussion about the latest design (labelled 2.0) and its
>>> extension (2.1).
>>>
>>> 2.0 design: One VM shares the entire memory of another VM
>>> 2.1 design: One VM uses an intermediate memory shared with another VM
>>>             for packet transmission.
>>> For the convenience of discussion, I have some pictures presented at
>>> this link:
>>> https://github.com/wei-w-wang/vhost-pci-discussion/blob/master/vhost-pci-rfc2.0.pdf
>>>
>>> Fig. 1 shows the common driver frame that we want to use to build the
>>> 2.0 and 2.1 designs. A TX/RX engine consists of a local ring and an
>>> exotic ring.
>>> Local ring:
>>> 1) allocated by the driver itself;
>>> 2) registered with the device (i.e. virtio_add_queue())
>>> Exotic ring:
>>> 1) ring memory comes from the outside (of the driver), and is exposed
>>>    to the driver via a BAR MMIO;
>> Small additional requirement: In order to make this usable with
>> Jailhouse as well, we also need a side-channel configuration for the
>> regions, i.e. likely via a PCI capability. There are too few BARs, and
>> they suggest relocatability, which is not available under Jailhouse for
>> simplicity reasons (IOW, the shared regions are statically mapped by the
>> hypervisor into the affected guest address spaces).
> What kind of configuration would you need for the regions?
> I think adding a PCI capability should be easy.

Basically address and size, see
https://github.com/siemens/jailhouse/blob/wip/ivshmem2/Documentation/ivshmem-v2-specification.md#vendor-specific-capability-id-09h
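
To make that concrete, something along the following lines would already
be enough for us. The field names are only illustrative; they are not
copied from the ivshmem-v2 spec or from your draft code:

    #include <stdint.h>

    /* Illustrative sketch of a vendor-specific PCI capability that
     * describes a statically mapped shared region out-of-band; the
     * layout and names are invented here for discussion only. */
    struct vhost_pci_region_cap {
        uint8_t  cap_vndr;     /* PCI_CAP_ID_VNDR (0x09) */
        uint8_t  cap_next;     /* pointer to the next capability */
        uint8_t  cap_len;      /* length of this capability */
        uint8_t  region_id;    /* which shared region is described */
        uint64_t region_addr;  /* guest-physical address, fixed by the
                                * hypervisor */
        uint64_t region_size;  /* region size in bytes */
    } __attribute__((packed));

The point is just that the driver can read address and size from config
space instead of relying on a relocatable BAR.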

> 
>>> 2) does not have a registration in the device, so no ioeventfd/irqfd or
>>> configuration registers are allocated in the device
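
(Side note, just to check that I read Fig. 1 correctly: I picture the
per-direction engine roughly as below. The names are invented for
illustration and are not taken from your driver tree.)

    #include <stddef.h>

    /* Rough sketch of my reading of Fig. 1; names are made up. */
    struct vpnet_engine {
        /* Local ring: allocated by the driver itself and registered
         * with the device (virtio_add_queue() on the device side). */
        void *local_ring;

        /* Exotic ring: memory provided from outside the driver, only
         * visible through a BAR mapping; no ioeventfd/irqfd or config
         * registers are allocated for it in the device. */
        volatile void *exotic_ring;
        size_t exotic_ring_size;   /* from the BAR (or a capability) */
    };

    /* One such engine per direction. */
    struct vpnet_frame {
        struct vpnet_engine tx;
        struct vpnet_engine rx;
    };
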
>>> Fig. 2 shows how the driver frame is used to build the 2.0 design.
>>> 1) Asymmetric: vhost-pci-net <-> virtio-net
>>> 2) VM1 shares the entire memory of VM2, and the exotic rings are the
>>>    rings from VM2.
>>> 3) Performance (in terms of copies between VMs):
>>>    TX: 0-copy (packets are put into VM2’s RX ring directly)
>>>    RX: 1-copy (the green arrow line in VM1’s RX engine)
>>>
>>> Fig. 3 shows how the driver frame is used to build the 2.1 design.
>>> 1) Symmetric: vhost-pci-net <-> vhost-pci-net
>> This is interesting!
>>
>>> 2) Share an intermediate memory, allocated by VM1’s vhost-pci device,
>>>    for data exchange, and the exotic rings are built on the shared memory
>>> 3) Performance:
>>>    TX: 1-copy
>>>    RX: 1-copy
>> I'm not yet sure I got this right: there are two different MMIO regions
>> involved, right? One is used for VM1's RX / VM2's TX, and the other for
>> the reverse path? That would allow meeting our requirement of having
>> those regions mapped with asymmetric permissions (RX read-only, TX
>> read/write).
> The design presented here intends to use only one BAR to expose
> both TX and RX. The two VMs share an intermediate memory here,
> so why couldn't we give the same permissions to TX and RX?
> 

For security and/or safety reasons: the TX side can then safely prepare
and sign a message in-place, because the RX side cannot mess around with
it while it is not yet signed (or check-summed). That saves one copy from
a secure place into the shared memory.
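
Very roughly, the pattern we have in mind on the TX side looks like the
sketch below; a plain checksum stands in for a real signature, and all
names are illustrative:

    #include <stdint.h>
    #include <string.h>

    /* Illustrative layout of a message slot inside the TX region, which
     * the sender maps read/write and the receiver maps read-only. */
    struct tx_slot {
        uint32_t len;
        uint32_t csum;          /* stands in for a real signature/MAC */
        uint8_t  payload[2048];
    };

    static uint32_t simple_csum(const uint8_t *buf, uint32_t len)
    {
        uint32_t sum = 0;

        while (len--)
            sum += *buf++;
        return sum;
    }

    /* Prepare and "sign" the message directly in shared memory: since
     * the peer only has a read-only mapping, it cannot modify the
     * payload between the memcpy and the checksum, so no bounce buffer
     * is needed. Caller guarantees len <= sizeof(slot->payload). */
    static void tx_publish(struct tx_slot *slot, const void *data,
                           uint32_t len)
    {
        memcpy(slot->payload, data, len);
        slot->len  = len;
        slot->csum = simple_csum(slot->payload, len);
        /* ...then put the slot into the exotic ring and kick. */
    }

With a single read/write region for both directions, we would have to
prepare the message in private memory first and copy it over afterwards.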

> 
>>> Fig. 4 shows the inter-VM notification path for 2.0 (2.1 is similar).
>>> The four eventfds are allocated by virtio-net and shared with
>>> vhost-pci-net:
>>> Uses virtio-net’s TX/RX kickfd as vhost-pci-net’s RX/TX callfd
>>> Uses virtio-net’s TX/RX callfd as vhost-pci-net’s RX/TX kickfd
>>> Example of how it works:
>>> After packets are put into vhost-pci-net’s TX ring, the driver kicks
>>> TX, which causes an interrupt associated with fd3 to be injected into
>>> virtio-net.
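
In other words, if I read Fig. 4 correctly, the wiring is just a
crosswise hand-over of the four fds. A purely illustrative sketch, not
taken from your QEMU draft:

    /* Illustrative only: the four eventfds allocated on the virtio-net
     * side and how they are handed over crosswise to vhost-pci-net. */
    struct vq_fds {
        int kickfd;    /* guest -> device notification */
        int callfd;    /* device -> guest interrupt */
    };

    struct netdev_fds {
        struct vq_fds tx;
        struct vq_fds rx;
    };

    static void wire_notification(const struct netdev_fds *virtio_net,
                                  struct netdev_fds *vhost_pci_net)
    {
        /* virtio-net's TX/RX kickfd becomes vhost-pci-net's RX/TX callfd */
        vhost_pci_net->rx.callfd = virtio_net->tx.kickfd;
        vhost_pci_net->tx.callfd = virtio_net->rx.kickfd;

        /* virtio-net's TX/RX callfd becomes vhost-pci-net's RX/TX kickfd */
        vhost_pci_net->rx.kickfd = virtio_net->tx.callfd;
        vhost_pci_net->tx.kickfd = virtio_net->rx.callfd;
    }
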
>>> The draft code of the 2.0 design is ready, and can be found here:
>>> QEMU: https://github.com/wei-w-wang/vhost-pci-device
>>> Guest driver: https://github.com/wei-w-wang/vhost-pci-driver
>>>
>>> We tested the 2.0 implementation using the Spirent packet generator
>>> to transmit 64B packets; the results show that the throughput of
>>> vhost-pci reaches around 1.8 Mpps, which is around twice that of the
>>> legacy OVS+DPDK setup. Also, vhost-pci shows better scalability than
>>> OVS+DPDK.
>> Do you have numbers for the symmetric 2.1 case as well? Or is the driver
>> not ready for that yet? Otherwise, I could try to make it work over
>> a simplistic vhost-pci 2.1 version in Jailhouse as well. That would give
>> a better picture of how much additional complexity this would mean
>> compared to our ivshmem 2.0.
>>
> 
> Implementation of 2.1 is not ready yet. We can extend it to 2.1 after
> the common driver frame is reviewed.

Can you assess the needed effort?

For us, this is a critical feature, because we need to decide if
vhost-pci can be an option at all. In fact, the "exotic ring" will be
the only way to provide secure inter-partition communication on Jailhouse.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

