OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio] [OASIS Issue Tracker] Created: (VIRTIO-35) race condition with multi-dword config accesses


On Fri, Oct 11, 2013 at 09:54:12AM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Fri, Oct 04, 2013 at 01:06:43PM +0930, Rusty Russell wrote:
> >> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >> > On Tue, Sep 24, 2013 at 08:18:20PM +0930, Rusty Russell wrote:
> >> >> I don't think qemu supports changing a disk underneath a device anyway?
> >> >
> >> > QEMU supports resizing I think.
> >> >
> >> >> And in practice, I'm not sure it would ever change fast enough to
> >> >> trigger this race...
> >> >
> >> > Well it's easy to make guest very slow :)
> >> >
> >> >> So, should we try to fix it?
> >> >> Rusty.
> >> >
> >> > Not really sure but I thought I'd put it on the table.
> >> 
> >> OK, I got ambitious!  How's this look?
> >> 
> >> Cheers,
> >> Rusty.
> >> 
> >> commit 9aebe9c69be1461dd0110dfc61459a4ac7bda6f9
> >> Author: Rusty Russell <rusty@au1.ibm.com>
> >> Date:   Fri Oct 4 13:01:35 2013 +0930
> >> 
> >>     Configuration (read) atomicity.
> >>     
> >>     Aka issue VIRTIO-35.
> >>     
> >>     This is solved per transport:
> >>     1) PCI: use the 8 bit reserved field.
> >>     	Assume that if you really change that fast, you'll do it lazily on
> >>     	config space read.
> >
> > I guess it's possible ... but isn't it easier to just make it a 32 bit field
> > than add this lazy update logic to devices?
> 
> I was trying to keep it tight.

Well we always find some use for the padding space later,
or we can move some stuff around to pack it ...

>  I think the lazy update logic is
> pretty trivial:
> 
>         bool config_changed;
>         ....
> 
> 
>         on_config_generation_read()
>         {
>                 if (config_changed) {
>                         gen++;
>                         config_changed = false;
>                 }
>                 return gen;
>         }

And something similar on migration. I agree it's not too bad.
But we'll need to document this in the spec somehow.

> >>     2) MMIO: use offset 0x18.
> >>     	Qemu returns 0 when they read here anyway, so it's almost 100%
> >>     	backwards compatible (legacy drivers should still multi-read
> >>     	for any transport).
> >>     3) CCW: no transport changes.
> >>     	They always read/write the entire thing.  This just shows that
> >>     	Cornelia is smarter than I am.
> >>     
> >>     Signed-off-by: Rusty Russell <rusty@au1.ibm.com>
> >> 
> >> diff --git a/virtio-v1.0-wd01-part1-specification.txt b/virtio-v1.0-wd01-part1-specification.txt
> >> index b5a5835..2945db4 100644
> >> --- a/virtio-v1.0-wd01-part1-specification.txt
> >> +++ b/virtio-v1.0-wd01-part1-specification.txt
> >> @@ -201,11 +201,36 @@ Interface' in the section title.
> >>  -------------------------
> >>  
> >>  Configuration space is generally used for rarely-changing or
> >> -initialization-time parameters.
> >> +initialization-time parameters.  Drivers must not assume reads from
> >> +fields greater than 32 bits wide are atomic, nor or reads from
> >> +multiple fields.
> >> +
> >> +Each transport provides a generation count for the configuration
> >> +space, which must change whenever there is a possibility that two
> >
> > MUST?
> >
> >> +accesses to the configuration space can see different versions of that
> >> +space.
> >> +
> >> +Thus drivers should read configuration space fields like so:
> >
> > SHOULD?
> 
> The convention is to use lower case for OASIS.  It says in the template
> preamble.  I guess it's bold in more sophisticated formats.
> 
> >> +
> >> +	u32 before, after;
> >> +	do {
> >> +		before = get_config_generation(device);
> >> +		// read config entry/entries.
> >> +		after = get_config_generation(device);
> >> +	} while (after != before);
> >
> > BTW this implies reads must never have side
> > effects (PCI ISR is the only exception, isn't it?).
> 
> Yes.  That would be weird...
> 
> Cheers,
> Rusty.


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]