OASIS Mailing List Archives

virtio-dev message



Subject: Re: [PATCHv2] pci: new configuration layout


On Wed, Sep 11, 2013 at 06:45:11PM +0300, Michael S. Tsirkin wrote:
> - split data path, common config and device specific config
> - support for new VQ layout
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

FYI this resolves issue VIRTIO-21

> ---
> 
> changes from v1:
> 	minimal patchset,
> 	stripped all controversial changes away:
> 	endian-ness, framing, revision id, config based access.
> 
> 	made some minor clarifications
> 
>  virtio-v1.0-wd01-part1-specification.txt | 320 +++++++++++++++++++++++++++++--
>  1 file changed, 301 insertions(+), 19 deletions(-)
> 
> diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt
> index b0fa415..11be8bc 100644
> --- a/virtio-v1.0-wd01-part1-specification.txt
> +++ b/virtio-v1.0-wd01-part1-specification.txt
> @@ -747,9 +747,144 @@ Revision ID of 0 or 1.
>  2.3.1.2. PCI Device Layout
>  -------------------------
>  
> -To configure the device, we use the first I/O region of the PCI
> -device. This contains a virtio header followed by a
> -device-specific region.
> +To configure the device,
> +use I/O and/or memory regions and/or PCI configuration space of the PCI device.
> +These contain the virtio header registers, the notification register, the
> +ISR status register and device specific registers, as specified by Virtio
> +Structure PCI Capabilities.
> +
> +There may be different widths of accesses to the I/O region; the
> +“natural” access method for each field must be
> +used (i.e. 32-bit accesses for 32-bit fields, etc).
> +
> +PCI Device Configuration Layout includes the common configuration,
> +ISR, notification and device specific configuration
> +structures.
> +
> +Unless explicitly specified otherwise, all multi-byte fields are little-endian.
> +
> +100.100.1.2.1. Common configuration structure layout
> +-------------------------
> +Common configuration structure layout is documented below:
> +
> +struct virtio_pci_common_cfg {
> +	/* About the whole device. */
> +	__le32 device_feature_select;	/* read-write */
> +	__le32 device_feature;		/* read-only */
> +	__le32 guest_feature_select;	/* read-write */
> +	__le32 guest_feature;		/* read-write */
> +	__le16 msix_config;		/* read-write */
> +	__le16 num_queues;		/* read-only */
> +	__u8 device_status;		/* read-write */
> +	__u8 unused1;
> +
> +	/* About a specific virtqueue. */
> +	__le16 queue_select;		/* read-write */
> +	__le16 queue_size;		/* read-write, power of 2, or 0. */
> +	__le16 queue_msix_vector;	/* read-write */
> +	__le16 queue_enable;		/* read-write */
> +	__le16 queue_notify_off;	/* read-only */
> +	__le64 queue_desc;		/* read-write */
> +	__le64 queue_avail;		/* read-write */
> +	__le64 queue_used;		/* read-write */
> +};
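
As a sanity check on the layout above, here is a small sketch that mirrors the structure with fixed-width types and asserts the resulting field offsets. It is not part of the patch: the stdint types stand in for __le32/__le16/__le64/__u8, and the offsets are simply what falls out of the field order above with natural alignment and no padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirror of the common configuration structure using fixed-width types. */
struct virtio_pci_common_cfg {
	/* About the whole device. */
	uint32_t device_feature_select;
	uint32_t device_feature;
	uint32_t guest_feature_select;
	uint32_t guest_feature;
	uint16_t msix_config;
	uint16_t num_queues;
	uint8_t device_status;
	uint8_t unused1;

	/* About a specific virtqueue. */
	uint16_t queue_select;
	uint16_t queue_size;
	uint16_t queue_msix_vector;
	uint16_t queue_enable;
	uint16_t queue_notify_off;
	uint64_t queue_desc;
	uint64_t queue_avail;
	uint64_t queue_used;
};

/* Every field sits at a multiple of its own size, so no padding is needed. */
_Static_assert(offsetof(struct virtio_pci_common_cfg, msix_config) == 16,
	       "msix_config offset");
_Static_assert(offsetof(struct virtio_pci_common_cfg, device_status) == 20,
	       "device_status offset");
_Static_assert(offsetof(struct virtio_pci_common_cfg, queue_select) == 22,
	       "queue_select offset");
_Static_assert(offsetof(struct virtio_pci_common_cfg, queue_desc) == 32,
	       "queue_desc offset");
_Static_assert(sizeof(struct virtio_pci_common_cfg) == 56,
	       "structure size");
```

The 64-bit ring addresses land on an 8-byte boundary, so each field can be read or written with its natural access width.
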
> +
> +device_feature_select
> +
> +	Selects which Feature Bits the device_feature field refers to.
> +	Value 0x0 selects Feature Bits 0 to 31
> +	Value 0x1 selects Feature Bits 32 to 63
> +	All other values cause reads from device_feature to return 0.
> +
> +device_feature
> +
> +	Used by Device to report Feature Bits to Driver.
> +	Device Feature Bits selected by device_feature_select.
> +
> +guest_feature_select
> +
> +	Selects which Feature Bits the guest_feature field refers to.
> +	Value 0x0 selects Feature Bits 0 to 31
> +	Value 0x1 selects Feature Bits 32 to 63
> +	All other values cause writes to guest_feature to be ignored,
> +	and reads to return 0.
> +
> +guest_feature
> +
> +	Used by Driver to acknowledge Feature Bits to Device.
> +	Guest Feature Bits selected by guest_feature_select.
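
To make the selector mechanism concrete, here is a rough sketch of how a driver might read and acknowledge all 64 Feature Bits. It is not part of the patch: struct fake_device and every function name are made up for illustration, and the device is modelled in plain memory instead of real register accesses. The subset check in write_guest_features() is my assumption about driver behaviour, not something the text above states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory model of the four feature registers. */
struct fake_device {
	uint32_t device_feature_select;
	uint32_t guest_feature_select;
	uint64_t offered;	/* Feature Bits the device reports */
	uint64_t acked;		/* Feature Bits the driver acknowledged */
};

/* Device side: a read of device_feature honours the selector. */
static uint32_t read_device_feature(const struct fake_device *d)
{
	switch (d->device_feature_select) {
	case 0:
		return (uint32_t)d->offered;		/* bits 0 to 31 */
	case 1:
		return (uint32_t)(d->offered >> 32);	/* bits 32 to 63 */
	default:
		return 0;				/* all other values */
	}
}

/* Device side: a write to guest_feature is ignored for other selectors. */
static void write_guest_feature(struct fake_device *d, uint32_t v)
{
	switch (d->guest_feature_select) {
	case 0:
		d->acked = (d->acked & ~0xffffffffULL) | v;
		break;
	case 1:
		d->acked = (d->acked & 0xffffffffULL) | ((uint64_t)v << 32);
		break;
	default:
		break;
	}
}

/* Driver side: two 32-bit reads give Feature Bits 0 to 63. */
static uint64_t read_device_features(struct fake_device *d)
{
	uint64_t lo, hi;

	d->device_feature_select = 0;
	lo = read_device_feature(d);
	d->device_feature_select = 1;
	hi = read_device_feature(d);
	return lo | (hi << 32);
}

/* Driver side: acknowledge a set of Feature Bits in two 32-bit writes. */
static void write_guest_features(struct fake_device *d, uint64_t features)
{
	/* Assumption: only offered bits are ever acknowledged. */
	assert((features & ~d->offered) == 0);

	d->guest_feature_select = 0;
	write_guest_feature(d, (uint32_t)features);
	d->guest_feature_select = 1;
	write_guest_feature(d, (uint32_t)(features >> 32));
}
```
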
> +
> +msix_config
> +
> +	Configuration Vector for MSI-X.
> +
> +num_queues
> +
> +	Specifies the maximum number of virtqueues supported by device.
> +
> +device_status
> +
> +	Device Status field.
> +
> +queue_select
> +
> +	Queue Select. Selects which virtqueue the other fields refer to.
> +
> +queue_size
> +
> +	Queue Size.  On reset, specifies the maximum queue size supported by
> +	the hypervisor. This can be modified by driver to reduce memory requirements.
> +	Set to 0 if this virtqueue is unused.
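
One possible way for a driver to pick a reduced queue size, as a sketch only. choose_queue_size() and its "wanted" parameter are hypothetical. The sketch assumes the value read after reset is a power of 2 or 0, as the structure comment says, and treats 0 as "virtqueue unused".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the largest power of 2 that exceeds neither the device maximum
 * nor the size the driver would like, or 0 if the virtqueue is unused.
 */
static uint16_t choose_queue_size(uint16_t device_max, uint16_t wanted)
{
	uint16_t size = device_max;

	if (device_max == 0 || wanted == 0)
		return 0;

	/* The device maximum is expected to be a power of 2. */
	assert((device_max & (device_max - 1)) == 0);

	/* Halving keeps the value a power of 2. */
	while (size > wanted)
		size >>= 1;
	return size;
}
```

The driver would then write the result back to queue_size before enabling the queue.
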
> +
> +queue_msix_vector
> +
> +	Queue Vector for MSI-X.
> +
> +queue_enable
> +
> +	Used to selectively prevent host from executing requests from this virtqueue.
> +	1 - enabled; 0 - disabled
> +
> +queue_notify_off
> +
> +	Used to calculate the offset from start of Notification structure at
> +	which this virtqueue is located.
> +	Note: this is *not* an offset in bytes. See notify_off_multiplier below.
> +
> +queue_desc
> +
> +	Physical address of Descriptor Table.
> +
> +queue_avail
> +
> +	Physical address of Available Ring.
> +
> +queue_used
> +
> +	Physical address of Used Ring.
> +
> +100.100.1.2.2. ISR status structure layout
> +-------------------------
> +ISR status structure includes a single 8-bit ISR status field.
> +
> +100.100.1.2.3. Notification structure layout
> +-------------------------
> +Notification structure is always a multiple of 2 bytes in size.
> +It includes 2-byte Queue Notify fields for each virtqueue of
> +the device. Note that multiple virtqueues can use the same
> +Queue Notify field, if necessary.
> +
> +100.100.1.2.4. Device specific structure
> +-------------------------
> +
> +Device specific structure is optional.
> +
> +100.100.1.2.5. Legacy Interfaces: A Note on PCI Device Layout
> +-------------------------
> +
> +Transitional devices should present part of configuration
> +registers in a legacy configuration structure in BAR0 in the first I/O
> +region of the PCI device, as documented below.
>  
>  There may be different widths of accesses to the I/O region; the
>  “natural” access method for each field in the virtio header must be
> @@ -763,10 +898,7 @@ Note that this is possible because while the virtio header is PCI
>  region is encoded in the native endian of the guest (where such distinction is
>  applicable).
>  
> -2.3.1.2.1. PCI Device Virtio Header
> -----------------------------------
> -
> -The virtio header looks as follows:
> +When used through the legacy interface, the virtio header looks as follows:
>  
>  +------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+
>  | Bits       || 32                  | 32                  | 32       | 16     | 16      | 16      | 8       | 8      |
> @@ -805,25 +937,167 @@ device-specific headers:
>  |            ||                    |
>  +------------++--------------------+
>  
> +Note that only Feature Bits 0 to 31 are accessible through the
> +Legacy Interface. When used through the Legacy Interface,
> +Transitional Devices must assume that Feature Bits 32 to 63
> +are not acknowledged by Driver.
> +
>  2.3.1.3. PCI-specific Initialization And Device Operation
>  --------------------------------------------------------
>  
> -The page size for a virtqueue on a PCI virtio device is defined as
> -4096 bytes.
> -
>  2.3.1.3.1. Device Initialization
>  -------------------------------
>  
> +This documents PCI-specific steps executed during Device Initialization.
> +As the first step, driver must detect device configuration layout
> +to locate configuration fields in memory, I/O or configuration space of the
> +device.
> +
> +100.100.1.3.1.1. Virtio Device Configuration Layout Detection
> +-------------------------------
> +
> +As a prerequisite to device initialization, driver executes a
> +PCI capability list scan, detecting virtio configuration layout using Virtio
> +Structure PCI capabilities.
> +
> +Virtio Device Configuration Layout includes the virtio configuration header,
> +Notification, ISR Status and device configuration structures.
> +Each structure can be mapped by a Base Address register (BAR) belonging to
> +the function, located beginning at 10h in Configuration Space,
> +or accessed through PCI configuration space.
> +
> +Actual location of each structure is specified using vendor-specific PCI capability located
> +on capability list in PCI configuration space of the device.
> +This virtio structure capability uses little-endian format; all bits are
> +read-only:
> +
> +struct virtio_pci_cap {
> +	__u8 cap_vndr;	/* Generic PCI field: PCI_CAP_ID_VNDR */
> +	__u8 cap_next;	/* Generic PCI field: next ptr. */
> +	__u8 cap_len;	/* Generic PCI field: capability length */
> +	__u8 cfg_type;	/* Identifies the structure. */
> +	__u8 bar;	/* Where to find it. */
> +	__u8 padding[3];/* Pad to full dword. */
> +	__le32 offset;	/* Offset within bar. */
> +	__le32 length;	/* Length of the structure, in bytes. */
> +};
> +
> +This structure can optionally be followed by extra data, depending on
> +other fields, as documented below.
> +
> +The fields are interpreted as follows:
> +
> +cap_vndr
> +	0x09; Identifies a vendor-specific capability.
> +
> +cap_next
> +	Link to next capability in the capability list in the configuration space.
> +
> +cap_len
> +	Length of the capability structure, including the whole of
> +	struct virtio_pci_cap, and extra data if any.
> +	This length might include padding, or fields unused by the driver.
> +
> +cfg_type
> +	identifies the structure, according to the following table.
> +
> +	/* Common configuration */
> +	#define VIRTIO_PCI_CAP_COMMON_CFG	1
> +	/* Notifications */
> +	#define VIRTIO_PCI_CAP_NOTIFY_CFG	2
> +	/* ISR Status */
> +	#define VIRTIO_PCI_CAP_ISR_CFG		3
> +	/* Device specific configuration */
> +	#define VIRTIO_PCI_CAP_DEVICE_CFG	4
> +
> +	Any other value - reserved for future use. Drivers must
> +	ignore any vendor-specific capability structure which has
> +	a reserved cfg_type value.
> +
> +	More than one capability can identify the same structure - this makes it
> +	possible for the device to expose multiple interfaces to drivers.  The order of
> +	the capabilities in the capability list specifies the order of preference
> +	suggested by the device; drivers should use the first interface that they can
> +	support.  For example, on some hypervisors, notifications using IO accesses are
> +	faster than memory accesses. In this case, hypervisor can expose two
> +	capabilities with cfg_type set to VIRTIO_PCI_CAP_NOTIFY_CFG:
> +	the first one addressing an I/O BAR, the second one addressing a memory BAR.
> +	Driver will use the I/O BAR if I/O resources are available, and fall back on
> +	memory BAR when I/O resources are unavailable.
> +
> +bar
> +	values 0x0 to 0x5 specify a Base Address register (BAR) belonging to
> +	the function located beginning at 10h in Configuration Space
> +	and used to map the structure into Memory or I/O Space.
> +	The BAR is permitted to be either 32-bit or 64-bit; it can map Memory Space
> +	or I/O Space.
> +
> +	Any other value - reserved for future use. Drivers must
> +	ignore any vendor-specific capability structure which has
> +	a reserved bar value.
> +
> +offset
> +	indicates where the structure begins relative to the base address associated
> +	with the BAR.
> +
> +length
> +	indicates the length of the structure.
> +	This size might include padding, or fields unused by the driver.
> +	Drivers are also recommended to only map part of configuration structure
> +	large enough for device operation.
> +	For example, a future device might present a large structure size of several
> +	MBytes.
> +	As current devices never utilize structures larger than 4KBytes in size,
> +	driver can limit the mapped structure size to e.g.
> +	4KBytes to allow forward compatibility with such devices without loss of
> +	functionality and without wasting resources.
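
For illustration, the capability scan described above could look roughly like the sketch below. It is not part of the patch. Configuration space is modelled as a 256-byte array, virtio_find_cap(), virtio_cap_map_len(), struct virtio_cap_info and VIRTIO_MAP_LIMIT are made-up names, and the status and capability pointer offsets are the standard PCI ones. The function returns the first matching capability, which is the one the device prefers. A driver that cannot use that BAR type would have to keep walking the list.

```c
#include <stdint.h>

/* Standard PCI configuration space locations. */
#define PCI_STATUS		0x06
#define PCI_STATUS_CAP_LIST	0x10
#define PCI_CAPABILITY_LIST	0x34
#define PCI_CAP_ID_VNDR		0x09

#define VIRTIO_PCI_CAP_COMMON_CFG	1
#define VIRTIO_PCI_CAP_NOTIFY_CFG	2
#define VIRTIO_PCI_CAP_ISR_CFG		3
#define VIRTIO_PCI_CAP_DEVICE_CFG	4

#define VIRTIO_MAP_LIMIT	4096u	/* hypothetical driver-side limit */

struct virtio_cap_info {
	uint8_t bar;
	uint32_t offset;
	uint32_t length;
};

/* Decode a little-endian 32-bit field. */
static uint32_t le32_at(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Fill *out from the first capability of the given cfg_type; 1 if found. */
static int virtio_find_cap(const uint8_t cfg[256], uint8_t cfg_type,
			   struct virtio_cap_info *out)
{
	unsigned int pos, ttl = 48;	/* ttl bounds a looping list */

	if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
		return 0;

	for (pos = cfg[PCI_CAPABILITY_LIST] & ~3u; pos >= 0x40 && ttl--;
	     pos = cfg[pos + 1] & ~3u) {
		if (cfg[pos] != PCI_CAP_ID_VNDR)
			continue;
		/* cap_len must cover struct virtio_pci_cap (16 bytes). */
		if (cfg[pos + 2] < 16 || pos + 16 > 256)
			continue;
		if (cfg[pos + 3] != cfg_type)
			continue;
		if (cfg[pos + 4] > 5)	/* reserved bar value: ignore */
			continue;
		out->bar = cfg[pos + 4];
		out->offset = le32_at(&cfg[pos + 8]);
		out->length = le32_at(&cfg[pos + 12]);
		return 1;
	}
	return 0;
}

/* Map no more than the driver needs, however large the structure is. */
static uint32_t virtio_cap_map_len(const struct virtio_cap_info *info)
{
	return info->length < VIRTIO_MAP_LIMIT ? info->length : VIRTIO_MAP_LIMIT;
}
```

Reserved cfg_type values are skipped naturally, because the caller only ever asks for one of the four defined types.
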
> +
> +
> +If cfg_type is VIRTIO_PCI_CAP_NOTIFY_CFG this structure is immediately followed
> +by additional fields:
> +
> +struct virtio_pci_notify_cap {
> +	struct virtio_pci_cap cap;
> +	__le32 notify_off_multiplier;	/* Multiplier for queue_notify_off. */
> +};
> +
> +notify_off_multiplier
> +
> +	Virtqueue offset multiplier, in bytes. Must be even and either a power of two, or 0.
> +	Value 0x1 is reserved.
> +	For a given virtqueue, the address to use for notifications is calculated as follows:
> +
> +	queue_notify_off * notify_off_multiplier + offset
> +
> +	If notify_off_multiplier is 0, all virtqueues use the same address in
> +	the Notifications structure!
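
A worked version of the formula, as a sketch rather than anything normative. Both function names are hypothetical. cap_offset is the offset field of the notification capability, and the result is an offset within the BAR that the capability points at.

```c
#include <assert.h>
#include <stdint.h>

/* Even and either a power of 2 or 0; 0x1 is reserved. */
static int notify_multiplier_valid(uint32_t multiplier)
{
	return (multiplier & (multiplier - 1)) == 0 && multiplier != 1;
}

/* queue_notify_off * notify_off_multiplier + offset */
static uint64_t queue_notify_offset(uint32_t cap_offset, uint32_t multiplier,
				    uint16_t queue_notify_off)
{
	assert(notify_multiplier_valid(multiplier));
	return (uint64_t)queue_notify_off * multiplier + cap_offset;
}
```

For example, with offset 0x3000 and a multiplier of 4, a virtqueue whose queue_notify_off is 2 is notified at 0x3008 within the BAR. With a multiplier of 0 every virtqueue ends up at 0x3000.
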
> +
> +
> +100.100.1.3.1.1. Legacy Interface: A Note on Device Layout Detection
> +-------------------------------
> +
> +Legacy drivers skipped the Device Layout Detection step, assuming legacy
> +configuration space in BAR0 in I/O space unconditionally.
> +
> +Legacy devices did not have the Virtio PCI Capability in their
> +capability list.
> +
> +Therefore:
> +
> +Transitional devices should expose the Legacy Interface in I/O
> +space in BAR0.
> +
> +Transitional drivers should look for the Virtio PCI
> +Capabilities on the capability list.
> +If they are not present, the driver should assume a legacy device.
> +
>  2.3.1.3.1.1. Queue Vector Configuration
>  --------------------------------------
>  
>  When MSI-X capability is present and enabled in the device
> -(through standard PCI configuration space) 4 bytes at byte offset
> -20 are used to map configuration change and queue interrupts to
> -MSI-X vectors. In this case, the ISR Status field is unused, and
> -device specific configuration starts at byte offset 24 in virtio
> -header structure. When MSI-X capability is not enabled, device
> -specific configuration starts at byte offset 20 in virtio header.
> +(through standard PCI configuration space) Configuration/Queue
> +MSI-X Vector registers are used to map configuration change and queue
> +interrupts to MSI-X vectors. In this case, the ISR Status is unused.
>  
>  Writing a valid MSI-X Table entry number, 0 to 0x7FF, to one of
>  Configuration/Queue Vector registers, maps interrupts triggered
> @@ -878,12 +1152,17 @@ This is done as follows, for each virtqueue a device has:
>    Queue Vector field: on success, previously written value is
>    returned; on failure, NO_VECTOR value is returned.
>  
> +100.100.1.3.1.4.1. Legacy Interface: A Note on Virtqueue Configuration
> +-----------------------------------
> +When using the legacy interface, the page size for a virtqueue on a PCI virtio
> +device is defined as 4096 bytes.  Driver writes the physical address, divided
> +by 4096, to the Queue Address field [6].
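
The legacy computation, sketched with a hypothetical helper. The alignment and range checks are my additions: they assume the Queue Address field is the 32-bit register of the legacy header and that the ring starts on a 4096-byte boundary.

```c
#include <stdint.h>

#define LEGACY_QUEUE_PAGE_SIZE	4096u

/*
 * Convert a ring physical address into the value written to the legacy
 * Queue Address field. Returns 0 on success, -1 if the address cannot
 * be expressed.
 */
static int legacy_queue_pfn(uint64_t phys, uint32_t *pfn)
{
	if (phys % LEGACY_QUEUE_PAGE_SIZE)
		return -1;	/* not page aligned */
	if (phys / LEGACY_QUEUE_PAGE_SIZE > UINT32_MAX)
		return -1;	/* does not fit in 32 bits */
	*pfn = (uint32_t)(phys / LEGACY_QUEUE_PAGE_SIZE);
	return 0;
}
```
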
> +
>  2.3.1.3.2. Notifying The Device
>  ------------------------------
>  
>  Device notification occurs by writing the 16-bit virtqueue index
> -of this virtqueue to the Queue Notify field of the virtio header
> -in the first I/O region of the PCI device.
> +of this virtqueue to the Queue Notify field.
>  
>  2.3.1.3.3. Virtqueue Interrupts From The Device
>  ----------------------------------------------
> @@ -2933,7 +3212,10 @@ the non-PCI implementations (currently lguest and S/390).
>  This is only allowed if the driver does not use any features
>  which would alter this early use of the device.
>  
> -[5] ie. once you enable MSI-X on the device, the other fields move.
> +[5] When MSI-X capability is enabled, device specific configuration starts at
> +byte offset 24 in virtio header structure. When MSI-X capability is not
> +enabled, device specific configuration starts at byte offset 20 in virtio
> +header.  I.e., once you enable MSI-X on the device, the other fields move.
>  If you turn it off again, they move back!
>  
>  [6] The 4096 is based on the x86 page size, but it's also large
> -- 
> MST

