Subject: Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] Re: [PATCH v2 00/16] Vhost-pci for inter-VM communication
On 2017-05-20 00:49, Michael S. Tsirkin wrote:
> On Fri, May 19, 2017 at 11:10:33AM +0800, Jason Wang wrote:
>> On 2017-05-18 11:03, Wei Wang wrote:
>>> On 05/17/2017 02:22 PM, Jason Wang wrote:
>>>> On 2017-05-17 14:16, Jason Wang wrote:
>>>>> On 2017-05-16 15:12, Wei Wang wrote:
>>>>>>> Hi:
>>>>>>>
>>>>>>> Care to post the driver codes too?
>>>>>>
>>>>>> OK. It may take some time to clean up the driver code before posting it. You can first check the draft in the repo here:
>>>>>> https://github.com/wei-w-wang/vhost-pci-driver
>>>>>>
>>>>>> Best,
>>>>>> Wei
>>>>>
>>>>> Interesting, looks like there's one copy on tx side. We used to have zerocopy support for tun for VM2VM traffic. Could you please try to compare it with your vhost-pci-net by:
>>>
>>> We can analyze from the whole data path - from VM1's network stack to send packets -> VM2's network stack to receive packets. The number of copies is actually the same for both.
>>
>> That's why I'm asking you to compare the performance. The only reason for vhost-pci is performance. You should prove it.
>>
>>> vhost-pci: 1-copy happens in VM1's driver xmit(), which copies packets from its network stack to VM2's RX ring buffer. (we call it "zerocopy" because there is no intermediate copy between VMs)
>>>
>>> zerocopy enabled vhost-net: 1-copy happens in tun's recvmsg, which copies packets from VM1's TX ring buffer to VM2's RX ring buffer.
>>
>> Actually, there's a major difference here. You do the copy in the guest, which consumes the time slice of the vcpu thread on the host. Vhost_net does this in its own thread. So I feel vhost_net is even faster here, maybe I was wrong.
>
> Yes but only if you have enough CPUs. The point of vhost-pci is to put the switch in a VM and scale better with # of VMs.

Does the overall performance really increase? I suspect the only thing vhost-pci saves here is scheduling cost, and copying in the guest should be slower than doing it in the host.

>>> That being said, we compared to vhost-user, instead of vhost_net, because vhost-user is the one that is used in NFV, which we think is a major use case for vhost-pci.
>>
>> If this is true, why not draft a pmd driver instead of a kernel one? And do you use virtio-net kernel driver to compare the performance? If yes, has OVS dpdk optimized for kernel driver (I think not)?
>>
>> What's more important, if vhost-pci is faster, I think its kernel driver should be also faster than virtio-net, no?
>
> If you have a vhost CPU per VCPU and can give a host CPU to each using that will be faster. But not everyone has so many host CPUs.

If the major use case is NFV, we should have sufficient CPU resources I believe?

>>>>> - make sure zerocopy is enabled for vhost_net
>>>>> - comment out skb_orphan_frags() in tun_net_xmit()
>>>>>
>>>>> Thanks
>>>>
>>>> You can even enable tx batching for tun by ethtool -C tap0 rx-frames N. This will greatly improve the performance according to my test.
>>>
>>> Thanks, but would this hurt latency?
>>>
>>> Best,
>>> Wei
>>
>> I don't see this in my test.
>>
>> Thanks
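
For anyone reproducing the comparison suggested in the quoted text, here is a minimal sketch of the host-side checks. It assumes that vhost_net exposes its TX zerocopy switch as the `experimental_zcopytx` module parameter under sysfs; the helper name `zcopy_status` and the `PARAM` override are hypothetical and exist only for illustration. The `tap0` device and the `N` placeholder are taken from the thread as-is. Commenting out skb_orphan_frags() in tun_net_xmit() requires a kernel rebuild and is not covered here.

```shell
#!/bin/sh
# Sketch only: report whether vhost_net TX zerocopy appears to be enabled.
# PARAM is assumed to be the sysfs path of the vhost_net module parameter;
# it can be overridden from the environment (hypothetical convenience).
PARAM="${PARAM:-/sys/module/vhost_net/parameters/experimental_zcopytx}"

# zcopy_status FILE: print "enabled" if FILE is readable and contains 1,
# otherwise print "disabled" (also when the module is not loaded).
zcopy_status() {
    if [ -r "$1" ] && [ "$(cat "$1")" = "1" ]; then
        echo enabled
    else
        echo disabled
    fi
}

echo "vhost_net zerocopy tx: $(zcopy_status "$PARAM")"

# Commands to apply the settings (shown as comments, they need root):
#   modprobe -r vhost_net && modprobe vhost_net experimental_zcopytx=1
#   ethtool -C tap0 rx-frames N    # N is the batch size, as in the thread
```

The script only reads state, so it should be safe to run before and after reloading the module to confirm that the setting took effect.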