Subject: RE: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout
Hi

I've embedded my comments below.

Rgds
Kully

On Thu, Aug 24, 2017 at 07:53:15PM +0800, Tiwei Bie wrote:
> Rx Fixed Buffer Sizes
> =====================
>
> ## Current proposal
>
> * Driver is free to choose whatever buffer sizes it wishes for Tx and
>   Rx buffers
> * Theoretically within a ring, a driver could have different buffer
>   sizes
>
> ## New proposal
>
> * Driver negotiates with device the size of a Rx buffer for a ring
>   - Each descriptor in that ring will have same size buffer
>   - Different rings can have different sized buffers

What's the motivation for this? In our testing dynamically sized entries
perform better in constrained environments such as the Linux kernel, where
packets are queued at a huge number of independent application sockets.
It seems that device can easily cache the last size to speed up operation.

[Kully]: Device incurs around 1us delay fetching each descriptor. In
situations where memory is limited on the device and many queues are
being supported, the device would probably fetch descriptors (for Rx) after
packets have been received from the network. Knowing the buffer size
associated with a ring upfront would allow the device to determine
accurately how many descriptors are required.

Yes, agreed that overall system performance is important. Are sockets
intended to be used with virtio drivers? If so, would the driver not
allocate a different queue per socket? The proposal was to have fixed
buffer sizes per ring, but different rings can have different buffer sizes.
Would this compromise of different sizes per ring, as opposed to per entry,
work?

> Data Alignment Boundaries
> =========================
>
> ## Current proposal
>
> * Driver is free to choose data buffer alignment to any byte boundary
>
> ## New proposal
>
> * Stipulate a fixed alignment for the data buffer

Again motivation seems to be missing. Saving PCI bandwidth isn't going to
help if it means driver will then incur more cache misses on access.

[Kully]: Would software not benefit from buffers which start aligned to a
cache line boundary (i.e. 64B)? This would also benefit hardware.

--
MST
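
For readers following the thread, the two hardware arguments above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper names (descs_needed, is_cache_aligned, align_up) and the CACHE_LINE_SIZE constant are hypothetical and are not part of the virtio specification or of either proposal. The first helper shows why a per-ring fixed buffer size lets a device work out the descriptor count from the packet length alone; the other two show what a 64B alignment rule would amount to for a buffer address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-byte cache line, the value suggested in the thread. */
#define CACHE_LINE_SIZE 64u

/*
 * Number of fixed-size Rx buffers (one per descriptor) needed to hold a
 * packet of pkt_len bytes when every buffer in the ring is buf_size bytes.
 * With a per-ring fixed size, this can be computed before any descriptor
 * has been fetched.
 */
static uint32_t descs_needed(uint32_t pkt_len, uint32_t buf_size)
{
    assert(buf_size != 0);
    /* Ceiling division written so that it cannot overflow. */
    return pkt_len / buf_size + (pkt_len % buf_size != 0);
}

/* Returns 1 if addr starts on a cache line boundary, 0 otherwise. */
static int is_cache_aligned(uint64_t addr)
{
    return (addr & (CACHE_LINE_SIZE - 1)) == 0;
}

/* Rounds addr up to the next cache line boundary (no-op if aligned). */
static uint64_t align_up(uint64_t addr)
{
    return (addr + CACHE_LINE_SIZE - 1) & ~(uint64_t)(CACHE_LINE_SIZE - 1);
}
```

For example, with 2048-byte buffers a 9000-byte packet needs descs_needed(9000, 2048) descriptors, which is 5, whereas with per-entry sizes the device would have to read each descriptor to learn how much of the packet it can hold.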