

Subject: Re: [virtio-comment] [PROPOSAL] Virtio Over Fabrics(TCP/RDMA)



On 2023/4/25 13:03, Parav Pandit wrote:


On 4/24/2023 9:38 AM, zhenwei pi wrote:


From my point of view, there are 3 cases:
1, Host/container scenario. For example, the host kernel connects to a virtio target block service and maps it as a vdx (virtio-blk) device (used by a Map-Reduce service which needs a fast/large disk). The host kernel also connects to a virtio target crypto service and maps it as a virtio crypto device (used by nginx to accelerate HTTPS). And so on.

         +----------+    +----------+       +----------+
         |Map-Reduce|    |  nginx   |  ...  | processes|
         +----------+    +----------+       +----------+
------------------------------------------------------------
Host          |               |                  |
Kernel    +-------+       +-------+          +-------+
          | ext4  |       | LKCF  |          | HWRNG |
          +-------+       +-------+          +-------+
              |               |                  |
          +-------+       +-------+          +-------+
          |  vdx  |       |vCrypto|          | vRNG  |
          +-------+       +-------+          +-------+
              |               |                  |
              |           +--------+             |
              +---------->|TCP/RDMA|<------------+
                          +--------+
                              |
                          +------+
                          |NIC/IB|
                          +------+
                              |          +-------------+
                              +--------->|virtio target|
                                         +-------------+

2, Typical virtualization environment. The workloads run in a guest, and QEMU handles the virtio-pci (or MMIO) device and forwards requests to the target.
         +----------+    +----------+       +----------+
         |Map-Reduce|    |  nginx   |  ...  | processes|
         +----------+    +----------+       +----------+
------------------------------------------------------------
Guest         |               |                  |
Kernel    +-------+       +-------+          +-------+
          | ext4  |       | LKCF  |          | HWRNG |
          +-------+       +-------+          +-------+
              |               |                  |
          +-------+       +-------+          +-------+
          |  vdx  |       |vCrypto|          | vRNG  |
          +-------+       +-------+          +-------+
              |               |                  |
PCI --------------------------------------------------------
                              |
QEMU                   +--------------+
                       |virtio backend|
                       +--------------+
                              |
                          +------+
                          |NIC/IB|
                          +------+
                              |          +-------------+
                              +--------->|virtio target|
                                         +-------------+

Example #3 enables implementing a virtio backend over a fabrics initiator in user space, which is also a good use case.
It can also be done with a non-native virtio backend.
More below.

3, SmartNIC/DPU/vDPA environment. It's possible to convert a virtio-pci request to a virtio-of request in hardware, and forward the request to the virtio target directly.
         +----------+    +----------+       +----------+
         |Map-Reduce|    |  nginx   |  ...  | processes|
         +----------+    +----------+       +----------+
------------------------------------------------------------
Host          |               |                  |
Kernel    +-------+       +-------+          +-------+
          | ext4  |       | LKCF  |          | HWRNG |
          +-------+       +-------+          +-------+
              |               |                  |
          +-------+       +-------+          +-------+
          |  vdx  |       |vCrypto|          | vRNG  |
          +-------+       +-------+          +-------+
              |               |                  |
PCI --------------------------------------------------------
                              |
SmartNIC              +---------------+
                      |virtio HW queue|
                      +---------------+
                              |
                          +------+
                          |NIC/IB|
                          +------+
                              |          +-------------+
                              +--------->|virtio target|
                                         +-------------+

All 3 seem to be valid use cases.

Use cases 1 and 2 can be achieved directly without involving any mediation layer or any other translation layer (for example, virtio to NFS).


Not for use case 2, at least? Its description says it has a virtio backend in QEMU. Or is the only possible way to have virtio-of in the guest?

Thanks


Many block and file protocols outside of virtio already exist which achieve this. I don't see virtio being any different in supporting this in a native manner, mainly for the blk, fs and crypto devices.

Use case #3 brings additional benefits and, at the same time, different complexity, but sure, #3 is also a valid and common use case in our experience.

In my experience working with FC, iSCSI, FCoE, NVMe RDMA fabrics and iSER,
a virtio fabrics needs a lot of work to reach the required scale, resiliency and, lastly, security. (step by step...)

My humble suggestion is: pick one transport instead of all at once; RDMA, being the most performant, is probably the first candidate to show the perf gain for use cases #1 and #2 from a remote system.

I briefly looked at your RDMA command descriptor example, which is not aligned to 16B. Perf-wise it will be poorer than NVMe RDMA fabrics.
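
For illustration only, here is a rough sketch of what a 16B-multiple
command capsule could look like. The structure and field names are
invented for this example and are not taken from the proposal:

#include <stdint.h>

/*
 * Hypothetical virtio-of command capsule, padded so that its size is
 * a multiple of 16 bytes. Keeping the capsule on a 16B boundary keeps
 * the RDMA SEND payload and the target-side parsing on friendly
 * boundaries, similar in spirit to NVMe's fixed 64B SQE.
 */
struct virtio_of_cmd_example {
	uint16_t opcode;      /* request type */
	uint16_t command_id;  /* tag echoed back in the completion */
	uint32_t length;      /* payload length in bytes */
	uint64_t addr;        /* remote buffer address */
	uint32_t rkey;        /* RDMA remote key for the buffer */
	uint8_t  rsvd[12];    /* pad the capsule to 32 bytes */
};

_Static_assert(sizeof(struct virtio_of_cmd_example) % 16 == 0,
	       "command capsule must be a 16-byte multiple");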

For the PCI transport for net, we intend to start work to improve the descriptors and the transport binding for the net device. From our research I see that the abstract virtio descriptors are great today, but if you want to get the best out of the system (sw, hw, cpu), such an abstraction is not the best. Carrying the "id" all the way to the target and bringing it back is an example of such inefficiency in your example.
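
To make the "id" round trip concrete: if the initiator ships the
virtqueue descriptor id inside the command and the target echoes it
back in the completion, both sides pay for extra bytes on the wire
plus a lookup. A sketch with hypothetical structures and names, only
to show the shape of the overhead:

#include <stddef.h>
#include <stdint.h>

#define VOF_MAX_INFLIGHT 256

/* Hypothetical completion returned by the target over the fabric. */
struct virtio_of_comp_example {
	uint16_t command_id;  /* tag copied verbatim from the command */
	uint16_t status;      /* success or error code */
	uint32_t value;       /* e.g. bytes transferred for a block op */
};

/*
 * Initiator-side bookkeeping: the tag is just an index into a local
 * table, so only the small command_id crosses the fabric and comes
 * back, not the full virtqueue descriptor id.
 */
struct vof_inflight {
	void *cookie;         /* driver-private request pointer */
	int   in_use;
};

static struct vof_inflight inflight[VOF_MAX_INFLIGHT];

/* Resolve a completion back to the original request. */
static void *vof_complete(const struct virtio_of_comp_example *comp)
{
	uint16_t tag = comp->command_id;

	if (tag >= VOF_MAX_INFLIGHT || !inflight[tag].in_use)
		return NULL;                 /* unknown or stale tag */

	inflight[tag].in_use = 0;
	return inflight[tag].cookie;         /* caller finishes the request */
}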



