
Subject: Re: Re: Re: [virtio-comment] [PROPOSAL] Virtio Over Fabrics(TCP/RDMA)




On 4/25/23 14:36, Jason Wang wrote:
On Mon, Apr 24, 2023 at 9:38 PM zhenwei pi <pizhenwei@bytedance.com> wrote:



On 4/24/23 11:40, Jason Wang wrote:
On Sun, Apr 23, 2023 at 7:31 PM zhenwei pi <pizhenwei@bytedance.com> wrote:

Hi,

In the past years, virtio has supported lots of device specifications over
PCI/MMIO/CCW. These devices work fine in the virtualization environment,
and now we have a chance to support the virtio device family in the
container/host scenario.

PCI can work for containers for sure (or does it hit any issue like
scalability?). It's better to describe what problems you met and why
you chose this way to solve them.

It's better to compare this with

1) hiding the fabrics details via DPU
2) vDPA

Hi,

Sorry, I missed this part. "Network-defined peripheral devices of the
virtio family" is the main purpose of this proposal,

This can be achieved by either DPU or vDPA. I think the advantage is
that if we standardize this in the spec, it avoids vendor-specific
protocols.


Agree with "standardize this in the spec to avoid vendor specific protocol"; this also avoids device-specific protocols. For example, we don't need "block over fabrics", "crypto over fabrics", "camera over fabrics" ... to support the virtio device family; instead we only need a single virtio-of.

"This can be achieved by either DPU or vDPA", do you mean case 3?

this allows us to use many
types of remote resources which are provided by a virtio target.

From my point of view, there are 3 cases:
1, Host/container scenario. For example, the host kernel connects to a
virtio target block service and maps it as a vdx (virtio-blk) device (used
by a Map-Reduce service which needs a fast/large disk). The host kernel
also connects to a virtio target crypto service and maps it as a virtio
crypto device (used by nginx to accelerate HTTPS). And so on.

          +----------+    +----------+       +----------+
          |Map-Reduce|    |   nginx  |  ...  | processes|
          +----------+    +----------+       +----------+
------------------------------------------------------------
Host           |               |                  |
Kernel     +-------+       +-------+          +-------+
           | ext4  |       | LKCF  |          | HWRNG |
           +-------+       +-------+          +-------+
               |               |                  |
           +-------+       +-------+          +-------+
           |  vdx  |       |vCrypto|          | vRNG  |
           +-------+       +-------+          +-------+
               |               |                  |
               |           +--------+             |
               +---------->|TCP/RDMA|<------------+
                           +--------+
                               |
                           +------+
                           |NIC/IB|
                           +------+
                               |                      +-------------+
                               +--------------------->|virtio target|
                                                      +-------------+

2, Typical virtualization environment. The workloads run in a guest;
QEMU handles virtio-pci (or MMIO) and forwards requests to the target.
          +----------+    +----------+       +----------+
          |Map-Reduce|    |   nginx  |  ...  | processes|
          +----------+    +----------+       +----------+
------------------------------------------------------------
Guest          |               |                  |
Kernel     +-------+       +-------+          +-------+
           | ext4  |       | LKCF  |          | HWRNG |
           +-------+       +-------+          +-------+
               |               |                  |
           +-------+       +-------+          +-------+
           |  vdx  |       |vCrypto|          | vRNG  |
           +-------+       +-------+          +-------+
               |               |                  |
PCI --------------------------------------------------------
                               |
QEMU                   +--------------+
                       |virtio backend|
                       +--------------+
                               |
                           +------+
                           |NIC/IB|
                           +------+
                               |                      +-------------+
                               +--------------------->|virtio target|
                                                      +-------------+


So it's the job of QEMU to do the translation from virtqueue to packet here?


Yes. QEMU pops a request from the virtqueue backend, translates it into virtio-of, and forwards it to the target side. It then handles the response from the target side and interrupts the guest.
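
For illustration, a minimal sketch of that round trip in C. Every name here (vof_send, vq_push_used, guest_irq, the struct layouts) is a hypothetical stand-in rather than an actual QEMU or virtio-of API; the sketch only shows the pop/translate/forward/complete/notify flow under those assumptions.

/* Hypothetical sketch of the QEMU-side forwarding path: pop a virtqueue
 * element, translate it into a wire command, forward it to the target,
 * consume the completion, then notify the guest.  All types and helpers
 * here are illustrative stand-ins, not real QEMU or virtio-of APIs. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct vq_elem {              /* one popped virtqueue element (simplified) */
    uint64_t id;
    void    *buf;
    size_t   len;
};

struct wire_cmd {             /* simplified fabrics wire command */
    uint16_t opcode;
    uint16_t queue_id;
    uint32_t len;
    uint64_t command_id;
};

struct wire_cpl {             /* simplified completion from the target */
    uint16_t status;
    uint16_t pad;
    uint32_t value;
    uint64_t command_id;
};

/* Stand-in transport and guest hooks so the sketch stays self-contained. */
static void vof_send(const struct wire_cmd *c, const void *data, size_t len)
{
    printf("send command %llu with a %zu byte payload\n",
           (unsigned long long)c->command_id, len);
    (void)data;
}

static void vof_recv(struct wire_cpl *c)
{
    memset(c, 0, sizeof(*c));                /* pretend the target succeeded */
}

static void vq_push_used(uint64_t id, uint32_t len) { (void)id; (void)len; }
static void guest_irq(void) { printf("inject guest interrupt\n"); }

/* The round trip described above: pop, translate, forward, complete, notify. */
static void forward_one(uint16_t queue_id, const struct vq_elem *elem)
{
    struct wire_cmd cmd = {
        .opcode     = 1,                     /* hypothetical "vring transfer" */
        .queue_id   = queue_id,
        .len        = (uint32_t)elem->len,
        .command_id = elem->id,
    };
    struct wire_cpl cpl;

    vof_send(&cmd, elem->buf, elem->len);    /* request -> target (TCP/RDMA) */
    vof_recv(&cpl);                          /* completion <- target */
    vq_push_used(elem->id, cpl.value);       /* return the buffer to the guest */
    guest_irq();                             /* tell the guest it is done */
}

int main(void)
{
    char payload[512] = { 0 };
    struct vq_elem elem = { .id = 7, .buf = payload, .len = sizeof(payload) };

    forward_one(0, &elem);
    return 0;
}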

3, SmartNIC/DPU/vDPA environment. It's possible to convert a virtio-pci
request to a virtio-of request in hardware and forward the request to the
virtio target directly.
          +----------+    +----------+       +----------+
          |Map-Reduce|    |   nginx  |  ...  | processes|
          +----------+    +----------+       +----------+
------------------------------------------------------------
Host           |               |                  |
Kernel     +-------+       +-------+          +-------+
           | ext4  |       | LKCF  |          | HWRNG |
           +-------+       +-------+          +-------+
               |               |                  |
           +-------+       +-------+          +-------+
           |  vdx  |       |vCrypto|          | vRNG  |
           +-------+       +-------+          +-------+
               |               |                  |
PCI --------------------------------------------------------
                               |
SmartNIC               +---------------+
                       |virtio HW queue|
                       +---------------+
                               |
                           +------+
                           |NIC/IB|
                           +------+
                               |                      +-------------+
                               +--------------------->|virtio target|
                                                      +-------------+


- Theory
"Virtio Over Fabrics" aims to reuse the virtio device specifications and
provide network-defined peripheral devices.
This protocol could also be used in a virtualization environment:
typically the hypervisor (or a vhost-user process) handles requests from
virtio PCI/MMIO/CCW, remaps them, and forwards them to the target over
fabrics.

This requires mediation in the datapath, doesn't it?


- Protocol
For the detailed protocol definition, see:
https://github.com/pizhenwei/linux/blob/virtio-of-github/include/uapi/linux/virtio_of.h
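
As a quick orientation, the sketch below only illustrates the kind of request/completion pair such a fabrics protocol carries over TCP or RDMA; the structure names and field layouts are hypothetical and do not reproduce the actual contents of virtio_of.h.

/* Illustrative only -- the authoritative definitions live in the linked
 * virtio_of.h.  These hypothetical structures just show the general shape
 * of a command and its completion as carried over TCP or RDMA. */
#include <stdint.h>

struct vof_example_command {
    uint16_t opcode;          /* e.g. connect, feature exchange, vring transfer */
    uint16_t target_id;       /* which remote virtio device */
    uint16_t queue_id;        /* which virtqueue the request belongs to */
    uint16_t ndesc;           /* number of data segments that follow */
    uint64_t command_id;      /* echoed back in the completion */
} __attribute__((packed));

struct vof_example_completion {
    uint16_t status;          /* 0 on success, error code otherwise */
    uint16_t pad;
    uint32_t value;           /* e.g. negotiated features or written length */
    uint64_t command_id;      /* matches the originating command */
} __attribute__((packed));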

I'd say an RFC patch for the virtio spec is more suitable than the code.


OK. I'll send an RFC patch for the virtio spec later if this proposal is
acceptable.

Well, I think we need to have an RFC first to know if it is acceptable or not.


Sure, I'll send a draft.


[...]


A quick glance at the code told me it's a mediation layer that converts
descriptors in the vring to fabric-specific packets. This is the
vDPA way.

If we agree virtio over fabrics is useful, we need to invent facilities to
allow building packets directly without bothering with the virtqueue (the
API is layout independent anyhow).

Thanks


This code covers case 1 [Host/container scenario] and also proves that
this case works.
It creates a virtqueue in the virtio fabrics module and also emulates a
"virtqueue backend" there: when the upper layer kicks the vring, the
"backend" gets notified and builds a packet for TCP/RDMA.
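
As a rough sketch of that notify path (hypothetical types and names, not the actual module code): on a kick, the emulated backend walks the descriptor chain and collects the segments that become a single fabrics request.

/* Hypothetical sketch of the in-kernel "virtqueue backend" notify path:
 * on a kick, walk the descriptor chain and collect the segments that
 * would be sent as one virtio-of request over TCP/RDMA.  The types and
 * helpers are illustrative stand-ins, not the actual module code. */
#include <stdint.h>
#include <stdio.h>

struct desc {                 /* simplified vring descriptor */
    uint64_t addr;
    uint32_t len;
    uint16_t flags;           /* bit 0: another descriptor follows */
    uint16_t next;
};

struct vof_seg {              /* one data segment to place on the wire */
    uint64_t addr;
    uint32_t len;
};

/* Walk one descriptor chain starting at 'head' and fill 'segs'. */
static unsigned chain_to_packet(const struct desc *table, uint16_t head,
                                struct vof_seg *segs, unsigned max)
{
    unsigned n = 0;
    uint16_t i = head;

    while (n < max) {
        segs[n].addr = table[i].addr;
        segs[n].len  = table[i].len;
        n++;
        if (!(table[i].flags & 1))    /* no "next descriptor" flag: chain ends */
            break;
        i = table[i].next;
    }
    return n;                          /* segments for one fabrics request */
}

int main(void)
{
    /* A two-descriptor chain: a request header segment and a data segment. */
    struct desc table[2] = {
        { .addr = 0x1000, .len = 16,  .flags = 1, .next = 1 },
        { .addr = 0x2000, .len = 512, .flags = 0, .next = 0 },
    };
    struct vof_seg segs[4];
    unsigned n = chain_to_packet(table, 0, segs, 4);

    for (unsigned i = 0; i < n; i++)
        printf("segment %u: addr 0x%llx, len %u\n",
               i, (unsigned long long)segs[i].addr, (unsigned)segs[i].len);
    return 0;
}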

In this case, it won't perform well, since it still uses a virtqueue,
which is unnecessary in the datapath for fabrics.

Thanks


[...]

--
zhenwei pi

