Subject: packed ring layout proposal
Let's start a discussion around an alternative ring layout. This has been in my KVM Forum 2016 presentation.

The idea is to have a r/w descriptor in a ring structure, replacing the used and available ring, index and descriptor buffer.

* Descriptor ring:

Guest adds descriptors with unique index values and DESC_HW set in flags. Host overwrites used descriptors with correct len, index, and DESC_HW clear. Flags are always set/cleared last.

#define DESC_HW 0x0080

struct desc {
        __le64 addr;
        __le32 len;
        __le16 index;
        __le16 flags;
};

When DESC_HW is set, the descriptor belongs to the device. When it is clear, it belongs to the driver.

* Scatter/gather support

We can use 3 bits to set direction and to chain s/g entries in a request, same as virtio 1.0:

/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT       1
/* This marks a buffer as write-only (otherwise read-only). */
#define VRING_DESC_F_WRITE      2

Unlike virtio 1.0, all descriptors must have distinct ID (index) values.

* Indirect buffers

Can be done like in virtio 1.0:

/* This means the buffer contains a list of buffer descriptors. */
#define VRING_DESC_F_INDIRECT   4

In the descriptors in the indirect buffers, I think we should drop the index field altogether and just put s/g entries one after the other. Also, the length in the indirect descriptor should give the length of the chain.

virtio 1.0 seems to allow a s/g entry followed by an indirect descriptor. This does not seem useful, so let's not allow that anymore.

* Batching descriptors:

virtio 1.0 allows passing a batch of descriptors in both directions, by incrementing the used/avail index by values > 1. We can support this by tagging first/middle/last descriptors in the flags field. For compatibility, polarity is reversed, so a bit is set for non-first and non-last descriptors in a batch:

#define BATCH_NOT_FIRST 0x0010
#define BATCH_NOT_LAST  0x0020

In other words, a descriptor that is the only one in its batch has both bits clear (0000).
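
To make the ownership rule and the batch tagging above concrete, here is a minimal sketch in C. It is not part of the proposal: the helper names (batch_flags, driver_publish, device_mark_used, desc_owned_by_device) are hypothetical, the struct uses host-endian fields in place of the __le types, and C11 release/acquire atomics stand in for whatever barriers a real driver or device would use to make sure flags are written last.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DESC_HW         0x0080 /* set: owned by device; clear: owned by driver */
#define BATCH_NOT_FIRST 0x0010
#define BATCH_NOT_LAST  0x0020

/* Assumption: host-endian stand-in for the proposal's struct desc. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t index;
    _Atomic uint16_t flags;
};

/* Batch tag for position pos in a batch of n descriptors.
 * Reversed polarity: a batch of one yields 0. */
static uint16_t batch_flags(unsigned pos, unsigned n)
{
    uint16_t f = 0;

    assert(n > 0 && pos < n);
    if (pos != 0)
        f |= BATCH_NOT_FIRST;
    if (pos != n - 1)
        f |= BATCH_NOT_LAST;
    return f;
}

/* Driver side: fill in the descriptor, then set DESC_HW last. */
static void driver_publish(struct desc *d, uint64_t addr, uint32_t len,
                           uint16_t index, uint16_t flags)
{
    d->addr = addr;
    d->len = len;
    d->index = index;
    /* Release store: the fields above become visible before DESC_HW. */
    atomic_store_explicit(&d->flags, (uint16_t)(flags | DESC_HW),
                          memory_order_release);
}

/* Device side: write back used len and index, then clear DESC_HW last. */
static void device_mark_used(struct desc *d, uint32_t used_len,
                             uint16_t index, uint16_t flags)
{
    d->len = used_len;
    d->index = index;
    atomic_store_explicit(&d->flags, (uint16_t)(flags & ~DESC_HW),
                          memory_order_release);
}

/* Either side: check who owns the descriptor before touching it. */
static int desc_owned_by_device(struct desc *d)
{
    uint16_t f = atomic_load_explicit(&d->flags, memory_order_acquire);

    return (f & DESC_HW) != 0;
}
```

With these helpers, a driver submitting a batch of three descriptors would tag them with batch_flags(0, 3), batch_flags(1, 3) and batch_flags(2, 3), and a single descriptor would carry batch_flags(0, 1), which is 0.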

A batch does not have to be processed together, so the !first/!last flags can be changed when a descriptor is used.

* Processing descriptors in and out of order

A device processing all descriptors in order can simply flip the DESC_HW bit as it is done with descriptors.

A device can process descriptors out of order, and write them out in order as they are used, overwriting descriptors that are there.

The device must not use a descriptor until DESC_HW is set. It is only required to look at the first descriptor submitted, but it is allowed to look anywhere in the ring. This might allow parallel processing.

The driver must not overwrite a descriptor until DESC_HW is clear. It is only required to look at the first descriptor submitted, but it is allowed to look anywhere in the ring. This might allow parallel processing.

* Interrupt/event suppression

virtio 1.0 has two mechanisms for suppression but only one can be used at a time. Let's pack them together in a structure - one for interrupts, one for notifications:

struct event {
        __le16 idx;
        __le16 flags;
};

Flags can be used like in virtio 1.0, by storing a special value there:

#define VRING_F_EVENT_ENABLE  0x0
#define VRING_F_EVENT_DISABLE 0x1

or it can be used to switch to event idx:

#define VRING_F_EVENT_IDX     0x2

In that case, the interrupt triggers when the descriptor with the given index value is used.

---

Note: should this proposal be accepted and approved, one or more claims disclosed to the TC admin and listed on the Virtio TC IPR page https://www.oasis-open.org/committees/virtio/ipr.php might become Essential Claims.

--
MST