I'm writing a Virtio-net device in Lua. I'd love to have a little feedback to help me understand whether I'm on the right track with my implementation and what details of the spec I have misunderstood.
If we're lucky, maybe we can even uncover some interesting new use cases for Virtio-net. Either way, I hope this will be interesting to other people too.
A few of us are developing a fully zero-copy software switch for QEMU that runs in userspace. It's called Snabb Switch and it's aimed at ISPs who have to shuffle millions of packets per second per box.
The zero-copy design works by creating a 1:1 mapping between Virtio-net queues and Intel VMDq hardware queues in the NIC. This means that I/O is performed by simply translating descriptors between Virtio-net and hardware-native formats without touching the packet data.
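To make "translating descriptors" concrete, here is a rough sketch (in Python for brevity, not my actual Lua code). The Virtio side follows the split-ring `vring_desc` layout. The hardware side is an assumption: `HW_DESC`, `HW_CMD_EOP` and `translate_chain` are hypothetical names, and the hardware layout is a made-up stand-in, not the real Intel format.

```python
import struct

# Virtio split-ring descriptor (struct vring_desc): 64-bit address,
# 32-bit length, 16-bit flags, 16-bit next index. Little-endian assumed.
VRING_DESC = struct.Struct("<QIHH")
VRING_DESC_F_NEXT = 1  # the chain continues at the 'next' index

# HYPOTHETICAL hardware descriptor: 64-bit address, 32-bit length,
# 32-bit command word. This is NOT the real Intel descriptor layout.
HW_DESC = struct.Struct("<QII")
HW_CMD_EOP = 1  # hypothetical "end of packet" bit


def translate_chain(desc_table, head):
    """Turn one Virtio descriptor chain into packed hardware descriptors.

    Only addresses and lengths are copied; packet data is never touched.
    """
    out = []
    index = head
    # Bounding the walk by the table size guards against looping chains.
    for _ in range(len(desc_table) // VRING_DESC.size):
        addr, length, flags, nxt = VRING_DESC.unpack_from(
            desc_table, index * VRING_DESC.size)
        last = not (flags & VRING_DESC_F_NEXT)
        out.append(HW_DESC.pack(addr, length, HW_CMD_EOP if last else 0))
        if last:
            return out
        index = nxt
    raise ValueError("descriptor chain does not terminate")
```

A two-descriptor chain (say a 64-byte header buffer followed by a 1400-byte data buffer) comes out as two hardware descriptors, with the end-of-packet bit set only on the second.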
The QEMU interface is based on vhost_user (link below), a userspace clone of the Linux vhost-net kernel interface. That means the switch handles the vrings but not the PCI level.
I want to be able to hack packets "in flight" before making them available with translated descriptors. For example, to be able to perform encap/decap of tunnel headers. I'm aiming to minimize the number of bytes that have to be touched in these cases, e.g. I would like to be able to isolate the packet headers into a small iovec that's cheap to rewrite and keep the rest of the packet data in larger iovecs afterwards.
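Roughly what I have in mind for the header/payload split, sketched in Python with `memoryview` standing in for iovecs. The function names `split_header`, `encapsulate` and `decapsulate` are hypothetical, and the sketch assumes the tunnel header fits entirely inside the first iovec.

```python
def split_header(packet, header_len):
    """Return [header, payload] as two views over the same buffer (no copy)."""
    view = memoryview(packet)
    if header_len > len(view):
        raise ValueError("header longer than packet")
    return [view[:header_len], view[header_len:]]


def encapsulate(iovecs, tunnel_header):
    """Prepend a tunnel header as its own iovec; existing iovecs are untouched."""
    return [memoryview(tunnel_header)] + list(iovecs)


def decapsulate(iovecs, tunnel_len):
    """Drop tunnel_len bytes from the front by trimming only the first iovec."""
    first = iovecs[0]
    if tunnel_len > len(first):
        raise ValueError("tunnel header spans more than the first iovec")
    return [first[tunnel_len:]] + list(iovecs[1:])
```

The point is that encap and decap only add, drop or trim the small leading iovec, while the large payload iovec keeps pointing at the original packet memory.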
Here are the questions on my mind:
Are MERGEABLE descriptors a good choice for me? Negotiating this feature triggers the 'mergeable_rx_bufs' behavior in the Linux driver and provides me with 4KB buffers. I believe that I can send a packet to a VM using multiple buffers, and that I can say how many bytes of each buffer are used, but that I cannot use offsets (data must start at the beginning of each buffer).
Are INDIRECT descriptors a good choice for me? I don't use them today and I don't know whether they would bring any advantages. If in practice they gave me more buffers from the VM, that would be a plus. (I seem to get only 256 buffers from QEMU.)
What glaring mistakes have I made in my code based on misunderstandings of Virtio-net? I'm sure there are several!
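To make the MERGEABLE question above concrete, here is how I understand the receive path, sketched in Python. The 12-byte header layout is `struct virtio_net_hdr_mrg_rxbuf`. The name `plan_mergeable_rx`, the greedy fill of each buffer, and the little-endian byte order are my assumptions (legacy devices use guest-native byte order).

```python
import struct

# struct virtio_net_hdr_mrg_rxbuf: flags, gso_type (u8 each), then hdr_len,
# gso_size, csum_start, csum_offset, num_buffers (u16 each) = 12 bytes.
MRG_HDR = struct.Struct("<BBHHHHH")


def plan_mergeable_rx(packet_len, buffer_size=4096):
    """Return (header_bytes, used_lengths) for one received packet.

    used_lengths[i] is the byte count reported in the used ring for the
    i-th buffer. The first buffer holds the header followed by packet data,
    and every buffer is filled starting at offset 0.
    """
    if buffer_size <= MRG_HDR.size:
        raise ValueError("buffer too small for the header")
    remaining = MRG_HDR.size + packet_len
    used_lengths = []
    while remaining > 0:
        chunk = min(remaining, buffer_size)  # greedy fill: one possible policy
        used_lengths.append(chunk)
        remaining -= chunk
    # No offload fields are set in this sketch; only num_buffers is filled in.
    header = MRG_HDR.pack(0, 0, 0, 0, 0, 0, len(used_lengths))
    return header, used_lengths
```

With 4KB buffers, a 1500-byte packet needs one buffer and a 9000-byte packet needs three, with `num_buffers` in the header telling the driver how many used-ring entries belong to the packet.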
Any advice appreciated, and thank you for reading this far. :-)