[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio] [OASIS Issue Tracker] Created: (VIRTIO-35) race condition with multi-dword config accesses
"Michael S. Tsirkin" <mst@redhat.com> writes: > On Tue, Sep 24, 2013 at 08:18:20PM +0930, Rusty Russell wrote: >> I don't think qemu supports changing a disk underneath a device anyway? > > QEMU supports resizing I think. > >> And in practice, I'm not sure it would ever change fast enough to >> trigger this race... > > Well it's easy to make guest very slow :) > >> So, should we try to fix it? >> Rusty. > > Not really sure but I thought I'd put it on the table. OK, I got ambitious! How's this look? Cheers, Rusty. commit 9aebe9c69be1461dd0110dfc61459a4ac7bda6f9 Author: Rusty Russell <rusty@au1.ibm.com> Date: Fri Oct 4 13:01:35 2013 +0930 Configuration (read) atomicity. Aka issue VIRTIO-35. This is solved per transport: 1) PCI: use the 8 bit reserved field. Assume that if you really change that fast, you'll do it lazily on config space read. 2) MMIO: use offset 0x18. Qemu returns 0 when they read here anyway, so it's almost 100% backwards compatible (legacy drivers should still multi-read for any transport). 3) CCW: no transport changes. They always read/write the entire thing. This just shows that Cornelia is smarter than I am. Signed-off-by: Rusty Russell <rusty@au1.ibm.com> diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt index b5a5835..2945db4 100644 --- a/virtio-v1.0-wd01-part1-specification.txt +++ b/virtio-v1.0-wd01-part1-specification.txt @@ -201,11 +201,36 @@ Interface' in the section title. ------------------------- Configuration space is generally used for rarely-changing or -initialization-time parameters. +initialization-time parameters. Drivers must not assume reads from +fields greater than 32 bits wide are atomic, nor or reads from +multiple fields. + +Each transport provides a generation count for the configuration +space, which must change whenever there is a possibility that two +accesses to the configuration space can see different versions of that +space. + +Thus drivers should read configuration space fields like so: + + u32 before, after; + do { + before = get_config_generation(device); + // read config entry/entries. + after = get_config_generation(device); + } while (after != before); Note that this space is generally the guest's native endian, rather than PCI's little-endian. +2.1.3.1. Legacy Interface: Configuration Space +------------------------- + +Legacy devices did not have a configuration generation field, thus are +susceptible to race conditions if configuration is updated. This +effects the block capacity and network mac fields; best practice is to +read these fields multiple times until two reads generate a consistent +result. + 2.1.4. Virtqueues ---------------- @@ -768,7 +793,7 @@ struct virtio_pci_common_cfg { __le16 msix_config; /* read-write */ __le16 num_queues; /* read-only */ __u8 device_status; /* read-write */ - __u8 unused1; + __u8 config_generation; /* read-only */ /* About a specific virtqueue. */ __le16 queue_select; /* read-write */ @@ -820,6 +845,14 @@ device_status Device Status field. Writing 0 into this field resets the device. +config_generation + + Configuration atomicity value. Changes every time the + configuration noticeably changes. This means the device may + only change the value after a configuration read operation, + but it must change if there is any risk of a device seeing an + inconsistent configuration state. + queue_select Queue Select. Selects which virtqueue do other fields refer to. @@ -941,6 +974,9 @@ Legacy Interface. When used through the Legacy Interface, Transitional Devices must assume that Feature Bits 32 to 63 are not acknowledged by Driver. +As legacy devices had no configuration generation field, see "2.1.3.1. +Legacy Interface: Configuration Space" for workarounds. + 2.3.1.3. PCI-specific Initialization And Device Operation -------------------------------------------------------- @@ -1281,6 +1317,13 @@ configuration space. The following list presents their layout: must write a value to the HostFeaturesSel register before reading from the HostFeatures register. +• 0x018 | R | ConfigGeneration + Configuration atomicity value. + Changes every time the configuration noticeably changes. This means the + device may only change the value after a configuration read operation, + but it must change if there is any risk of a device seeing an inconsistent + configuration state. + • 0x020 | W | GuestFeatures Flags representing device features understood and activated by the driver. @@ -1396,6 +1439,12 @@ Writing to registers described as “R” and reading from registers described as “W” is not permitted and can cause undefined behavior. +2.3.2.2.1. Legacy Interface: MMIO Device Layout +-------------------------- +The ConfigGeneration field does not exist in legacy devices; fortunately +it would return 0 if accessed. Nonetheless, the workarounds in +"2.1.3.1. Legacy Interface: Configuration Space" should still be used. + 100.3.2.2.1. Virtqueue Layout ------------------------------ @@ -1631,7 +1680,9 @@ For changing configuration information, the guest may use CCW_CMD_WRITE_CONF, specifying the guest memory for the host to read from. -In both cases, the complete configuration space is transmitted. +In both cases, the complete configuration space is transmitted. This +allows the guest to compare the new configuration space with the old +version, and keep a generation count internally whenever it changes. 2.3.3.2.5. Setting Up Indicators -------------------------------- @@ -3415,12 +3466,15 @@ transmit output. --------------------------------------- Configuration space should only be used for initialization-time -parameters. It is a limited resource with no synchronization, so for -most uses it is better to use a virtqueue to update configuration -information (the network device does this for filtering, +parameters. It is a limited resource with no synchronization between +writable fields, so for most uses it is better to use a virtqueue to update +configuration information (the network device does this for filtering, otherwise the table in the config space could potentially be very large). +Devices must not assume that configuration fields over 32 bits wide +are atomically writable. + 2.7.3. What Device Number? --------------------------
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]