Subject: Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v9] virtio-net: support inner header hash
On 2023/3/8 10:39, Michael S. Tsirkin wrote:
> On Wed, Mar 01, 2023 at 10:56:31AM +0800, Heng Qi wrote:
>> On 2023/2/28 7:16, Michael S. Tsirkin wrote:
>>> On Sat, Feb 18, 2023 at 10:37:15PM +0800, Heng Qi wrote:
>>>> If the tunnel is used to encapsulate the packets, the hash calculated using the outer header of the receive packets is always fixed for the same flow packets, i.e. they will be steered to the same receive queue.
>>> Wait a second. How is this true? Does not everyone stick the inner header hash in the outer source port to solve this?
>> Yes, you are right. That's what we did before the inner header hash, but it has a performance penalty, which I'll explain below.
>>> For example geneve spec says: it is necessary for entropy from encapsulated packets to be exposed in the tunnel header. The most common technique for this is to use the UDP source port
>> The end point of the tunnel called the gateway (with DPDK on top of it).
>> 1. When there is no inner header hash, entropy can be inserted into the udp src port of the outer header of the tunnel, and then the tunnel packet is handed over to the host. The host needs to take out a part of the CPUs to parse the outer headers (but not drop them) to calculate the inner hash for the inner payloads, and then use the inner hash to forward them to another part of the CPUs that are responsible for processing.
> I don't get this part. Leave inner hashes to the guest inside the tunnel, why is your host doing this?
Assume that the same flow is either a unidirectional flow a->b, or a bidirectional flow a->b and b->a. Such a flow may be processed out of order by the gateway (DPDK):

1. In unidirectional mode, if the flow is switched to another gateway for some reason, its outer IP address changes, so without an inner hash the flow may be processed by different CPUs after reaching the host. So after the host receives the flow, it first uses the forwarding CPUs to calculate the inner hash, and then uses that hash to ensure that the flow is processed by the same CPU.
2. In bidirectional mode, the a->b flow may go to gateway 1 and the b->a flow may go to gateway 2. To ensure that both directions of the flow are processed by the same CPU, we still need the forwarding CPUs to calculate the real inner hash (here, the hash key needs to be replaced with a symmetric hash key).
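The symmetric-key idea in point 2 can be sketched in a few lines of Python. This is only an illustration, not code from the patch or from any gateway: the helper names `toeplitz_hash` and `flow_input` are hypothetical, and the repeated `0x6D5A` key is one commonly cited symmetric choice, assumed here rather than taken from this thread. Because that key repeats every 16 bits, swapping the 32-bit addresses and the 16-bit ports leaves the Toeplitz result unchanged.

```python
import socket
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: XOR the 32-bit key window for every set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            # 32-bit window of the key starting at bit i (MSB first).
            shift = key_bits - 32 - i
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_input(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Hash input for an IPv4 flow: src addr, dst addr, src port, dst port."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


# Assumption: a 40-byte key with a 16-bit period, which makes the hash symmetric.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20

a_to_b = toeplitz_hash(SYMMETRIC_KEY, flow_input("10.0.0.1", "10.0.0.2", 12345, 80))
b_to_a = toeplitz_hash(SYMMETRIC_KEY, flow_input("10.0.0.2", "10.0.0.1", 80, 12345))
print(a_to_b == b_to_a)
```

With such a key, both directions of a flow map to the same hash, so they can be steered to the same CPU regardless of which gateway they passed through.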
>> 1). During this process, the CPUs on the host is divided into two parts, one part is used as a forwarding node to parse the outer header, and the CPU utilization is low. Another part handles packets.
> Some overhead is clearly involved in *sending* packets - to calculate the hash and stick it in the port number. This is, however, a separate problem and if you want to solve it then my suggestion would be to teach the *transmit* side about GRE offloads, so it can fill the source port in the card.
>> 2). The entropy of the source udp src port is not enough, that is, the queue is not widely distributed.
> how isn't it enough? 16 bit is enough to cover all vqs ...
A 5-tuple brings more entropy than a single port, doesn't it? In fact, the inner hash of the physical network card used by the business team is indeed better than the udp port number of the outer header we modify now, but they did not give me the data.
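For reference, the existing approach of exposing inner-flow entropy through the outer UDP source port can be sketched as follows. This is a minimal illustration under stated assumptions: `inner_flow_hash` and `outer_source_port` are hypothetical helpers, CRC32 stands in for whatever hash the encapsulating side really uses, and the port range 49152-65535 is the dynamic/private range that RFC 7348 recommends for the VXLAN source port.

```python
import socket
import struct
import zlib

# Assumption: dynamic/private port range, as recommended for VXLAN in RFC 7348.
PORT_LOW, PORT_HIGH = 49152, 65535


def inner_flow_hash(src_ip: str, dst_ip: str, proto: int,
                    src_port: int, dst_port: int) -> int:
    """Hash the inner 5-tuple. CRC32 is only a stand-in for the real hash."""
    packed = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BHH", proto, src_port, dst_port))
    return zlib.crc32(packed)


def outer_source_port(flow_hash: int) -> int:
    """Fold the inner hash into the outer UDP source port."""
    span = PORT_HIGH - PORT_LOW + 1  # 16384 ports, i.e. 14 bits
    return PORT_LOW + flow_hash % span


port = outer_source_port(
    inner_flow_hash("192.168.1.10", "192.168.1.20", 6, 40000, 443))
print(port)
```

In this sketch the port carries 14 bits derived from the inner 5-tuple. Whether that spreads flows across receive queues as well as a device-computed inner hash is exactly the point under discussion here.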
>> 2. When there is an inner header hash, the gateway will directly help parse the outer header, and use the inner 5 tuples to calculate the inner hash. The tunneled packet is then handed over to the host.
>> 1) All the CPUs of the host are used to process data packets, and there is no need to use some CPUs to forward and parse the outer header.
> You really have to parse the outer header anyway, otherwise there's no tunneling. Unless you want to teach virtio to implement tunneling in hardware, which is something I'd find it easier to get behind.
There is no need to parse the outer header twice, because we use shared memory.
>> 2) The entropy of the original quintuple is sufficient, and the queue is widely distributed.
> It's exactly the same entropy, why would it be better? In fact you are taking out the outer hash entropy making things worse.
I don't see why the entropy of the inner 5-tuple and of the outer tunnel header would be the same: multiple streams have the same outer header. Thanks.
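The claim that multiple streams share one outer header can be illustrated with a small Python sketch. It assumes the outer header carries no per-flow entropy (for example, a tunnel without a varying source port); the addresses, the `hash_fields` helper, and the use of CRC32 as the hash are all hypothetical choices for the example.

```python
import socket
import struct
import zlib


def hash_fields(*fields: bytes) -> int:
    """Stand-in hash over header fields (CRC32, not the real RSS hash)."""
    return zlib.crc32(b"".join(fields))


# Assumption: both inner flows travel through the same tunnel endpoints.
outer = (socket.inet_aton("203.0.113.1"), socket.inet_aton("203.0.113.2"))

# Two inner TCP flows that differ only in their source port.
inner_1 = (socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
           struct.pack("!BHH", 6, 40000, 443))
inner_2 = (socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
           struct.pack("!BHH", 6, 40001, 443))

outer_hash_1 = hash_fields(*outer)   # outer header of flow 1
outer_hash_2 = hash_fields(*outer)   # outer header of flow 2 (identical fields)
inner_hash_1 = hash_fields(*inner_1)
inner_hash_2 = hash_fields(*inner_2)

print(outer_hash_1 == outer_hash_2, inner_hash_1 != inner_hash_2)
```

Under that assumption the outer hash cannot tell the two flows apart, while the inner hash can. If the sender does place inner-flow entropy in the outer source port, the outer hashes would differ as well.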
>> Thanks.
>>> same goes for vxlan did not check further. so what is the problem? and which tunnel types actually suffer from the problem?