OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio] [OASIS Issue Tracker] Created: (VIRTIO-35) race condition with multi-dword config accesses


"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Tue, Sep 24, 2013 at 08:18:20PM +0930, Rusty Russell wrote:
>> I don't think qemu supports changing a disk underneath a device anyway?
>
> QEMU supports resizing I think.
>
>> And in practice, I'm not sure it would ever change fast enough to
>> trigger this race...
>
> Well it's easy to make guest very slow :)
>
>> So, should we try to fix it?
>> Rusty.
>
> Not really sure but I thought I'd put it on the table.

OK, I got ambitious!  How's this look?

Cheers,
Rusty.

commit 9aebe9c69be1461dd0110dfc61459a4ac7bda6f9
Author: Rusty Russell <rusty@au1.ibm.com>
Date:   Fri Oct 4 13:01:35 2013 +0930

    Configuration (read) atomicity.
    
    Aka issue VIRTIO-35.
    
    This is solved per transport:
    1) PCI: use the 8 bit reserved field.
    	Assume that if you really change that fast, you'll do it lazily on
    	config space read.
    2) MMIO: use offset 0x18.
    	Qemu returns 0 when they read here anyway, so it's almost 100%
    	backwards compatible (legacy drivers should still multi-read
    	for any transport).
    3) CCW: no transport changes.
    	They always read/write the entire thing.  This just shows that
    	Cornelia is smarter than I am.
    
    Signed-off-by: Rusty Russell <rusty@au1.ibm.com>

diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt
index b5a5835..2945db4 100644
--- a/virtio-v1.0-wd01-part1-specification.txt
+++ b/virtio-v1.0-wd01-part1-specification.txt
@@ -201,11 +201,36 @@ Interface' in the section title.
 -------------------------
 
 Configuration space is generally used for rarely-changing or
-initialization-time parameters.
+initialization-time parameters.  Drivers must not assume reads from
+fields greater than 32 bits wide are atomic, nor or reads from
+multiple fields.
+
+Each transport provides a generation count for the configuration
+space, which must change whenever there is a possibility that two
+accesses to the configuration space can see different versions of that
+space.
+
+Thus drivers should read configuration space fields like so:
+
+	u32 before, after;
+	do {
+		before = get_config_generation(device);
+		// read config entry/entries.
+		after = get_config_generation(device);
+	} while (after != before);
 
 Note that this space is generally the guest's native endian,
 rather than PCI's little-endian.
 
+2.1.3.1.  Legacy Interface: Configuration Space
+-------------------------
+
+Legacy devices did not have a configuration generation field, thus are
+susceptible to race conditions if configuration is updated.  This
+effects the block capacity and network mac fields; best practice is to
+read these fields multiple times until two reads generate a consistent
+result.
+
 2.1.4. Virtqueues
 ----------------
 
@@ -768,7 +793,7 @@ struct virtio_pci_common_cfg {
 	__le16 msix_config;		/* read-write */
 	__le16 num_queues;		/* read-only */
 	__u8 device_status;		/* read-write */
-	__u8 unused1;
+	__u8 config_generation;		/* read-only */
 
 	/* About a specific virtqueue. */
 	__le16 queue_select;		/* read-write */
@@ -820,6 +845,14 @@ device_status
 	Device Status field. Writing 0 into this field resets the
 	device.
 
+config_generation
+
+	Configuration atomicity value.  Changes every time the
+	configuration noticeably changes.  This means the device may
+	only change the value after a configuration read operation,
+	but it must change if there is any risk of a device seeing an
+	inconsistent configuration state.
+
 queue_select
 
 	Queue Select. Selects which virtqueue do other fields refer to.
@@ -941,6 +974,9 @@ Legacy Interface. When used through the Legacy Interface,
 Transitional Devices must assume that Feature Bits 32 to 63
 are not acknowledged by Driver.
 
+As legacy devices had no configuration generation field, see "2.1.3.1.
+Legacy Interface: Configuration Space" for workarounds.
+
 2.3.1.3. PCI-specific Initialization And Device Operation
 --------------------------------------------------------
 
@@ -1281,6 +1317,13 @@ configuration space. The following list presents their layout:
   must write a value to the HostFeaturesSel register before
   reading from the HostFeatures register.
 
+• 0x018 | R | ConfigGeneration
+ Configuration atomicity value.
+ Changes every time the configuration noticeably changes.  This means the
+ device may only change the value after a configuration read operation,
+ but it must change if there is any risk of a device seeing an inconsistent
+ configuration state.
+
 • 0x020 | W | GuestFeatures
  Flags representing device features understood and activated by
   the driver.
@@ -1396,6 +1439,12 @@ Writing to registers described as “R” and reading from
 registers described as “W” is not permitted and can cause
 undefined behavior.
 
+2.3.2.2.1. Legacy Interface: MMIO Device Layout
+--------------------------
+The ConfigGeneration field does not exist in legacy devices; fortunately
+it would return 0 if accessed.  Nonetheless, the workarounds in
+"2.1.3.1. Legacy Interface: Configuration Space" should still be used.
+
 100.3.2.2.1.  Virtqueue Layout
 ------------------------------
 
@@ -1631,7 +1680,9 @@ For changing configuration information, the guest may use
 CCW_CMD_WRITE_CONF, specifying the guest memory for the host to
 read from.
 
-In both cases, the complete configuration space is transmitted.
+In both cases, the complete configuration space is transmitted.  This
+allows the guest to compare the new configuration space with the old
+version, and keep a generation count internally whenever it changes.
 
 2.3.3.2.5. Setting Up Indicators
 --------------------------------
@@ -3415,12 +3466,15 @@ transmit output.
 ---------------------------------------
 
 Configuration space should only be used for initialization-time
-parameters.  It is a limited resource with no synchronization, so for
-most uses it is better to use a virtqueue to update configuration
-information (the network device does this for filtering,
+parameters.  It is a limited resource with no synchronization between
+writable fields, so for most uses it is better to use a virtqueue to update
+configuration information (the network device does this for filtering,
 otherwise the table in the config space could potentially be very
 large).
 
+Devices must not assume that configuration fields over 32 bits wide
+are atomically writable.
+
 2.7.3. What Device Number?
 --------------------------
 



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]