virtio-comment message


Subject: Re: [virtio-comment] Problems with VIRTIO-4 and writeback only disks


James Bottomley <jbottomley@parallels.com> writes:
> [resending to virtio-comment; it looks like I'm not subscribed to
> virtio-dev ... how do you subscribe?]

Mail to virtio-dev-subscribe@lists.oasis-open.org, or via
        https://www.oasis-open.org/mlmanage/

BTW, I've moved this to virtio@ since it's core business, with virtio-comment
cc'd.

> Sorry, I don't have a copy of the original email to reply to:
>
> https://lists.oasis-open.org/archives/virtio-comment/201308/msg00078.html
>
> The part that concerns me is this:
>
>> +5. The cache mode should be read from the writeback field of the configuration
>> +  if the VIRTIO_BLK_F_CONFIG_WCE feature if available; the driver can also
>> +  write to the field in order to toggle the cache between writethrough (0)
>> +  and writeback (1) mode.
>> +  If the feature is not available, the driver can instead look at the result
>> +  of negotiating VIRTIO_BLK_F_WCE: the cache will be in writeback mode after
>> +  reset if and only if VIRTIO_BLK_F_WCE is negotiated[30]
>
> The questions are twofold and have to do with Write Back only disks (to
> date we've seen quite a few ATA devices like this and a huge number of
> USB devices):
>
>      1. If the guest doesn't negotiate WCE, what do you do on the host
>         (flush on every write is one possible option; run unsafe and
>         hope the host doesn't crash is another).
>      2. If the guest asks to toggle the device from writeback (1) to
>         writethrough (0) mode, what do you do?  Refuse the toggle would
>         be reasonable or flip back into whatever mode you were using to
>         handle 1. is also possible.
>
> James

I thought about this more after the call.  If we look at block device
implementations on the host:

1) Dumb device (i.e. no flush support).
   - Get write request, write() to backing file.  Repeat.
   - If the guest crashes, it always sees writes in order; if the host
     crashes, you're out of luck.

2) Dumb device which tries to handle host crashes.
   - No one wants this: it requires an fdatasync() after every write.

3) Smart device.  Uses AIO/threads to service requests.
   - Needs flushes; otherwise, if the guest crashes, it can see writes
     out of order.
   - Flushes must wait for outstanding requests.

4) Smart device which tries to handle host crashes.
   - Flushes must fdatasync() after waiting.
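
A minimal sketch of the four cases above, in Python for brevity.  The
class names, the host_crash_safe flag and the thread pool are
illustrative assumptions, not taken from any real virtio backend; the
point is only where the waiting and the fdatasync() happen.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor, wait

# Assumption: fall back to fsync() on platforms without fdatasync().
_datasync = getattr(os, "fdatasync", os.fsync)


class DumbDevice:
    """Cases 1 and 2: each write completes before the next one starts."""

    def __init__(self, fd, host_crash_safe=False):
        self.fd = fd
        self.host_crash_safe = host_crash_safe  # True selects case 2

    def write(self, offset, data):
        os.pwrite(self.fd, data, offset)
        if self.host_crash_safe:
            _datasync(self.fd)  # case 2: sync after every single write

    def flush(self):
        pass  # writes are already ordered; nothing is outstanding


class SmartDevice:
    """Cases 3 and 4: writes are serviced concurrently by worker threads."""

    def __init__(self, fd, host_crash_safe=False, workers=4):
        self.fd = fd
        self.host_crash_safe = host_crash_safe  # True selects case 4
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.lock = threading.Lock()
        self.outstanding = []

    def write(self, offset, data):
        # Completion order is not guaranteed, hence the need for flush().
        fut = self.pool.submit(os.pwrite, self.fd, data, offset)
        with self.lock:
            self.outstanding.append(fut)
        return fut

    def flush(self):
        with self.lock:
            pending, self.outstanding = self.outstanding, []
        wait(pending)  # case 3: flush waits for outstanding requests
        for fut in pending:
            fut.result()  # re-raise any write error
        if self.host_crash_safe:
            _datasync(self.fd)  # case 4: sync only after waiting


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        dev = SmartDevice(fd, host_crash_safe=True)
        dev.write(0, b"aaaa")
        dev.write(4, b"bbbb")
        dev.flush()  # both writes are complete and synced after this
        dev.pool.shutdown()
    finally:
        os.close(fd)
        os.unlink(path)
```

The only difference between 3 and 4 here is the single _datasync() call
at the end of flush(), which is why the guest cannot tell them apart
unless the device advertises it.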

The interesting question is between 3 & 4:
- Do we differentiate 3 and 4 from the guest side?
- Or do we ban 3 and insist on 4, knowing that there are no guarantees
  that an implementation will actually hit the metal (e.g. crappy
  underlying device or crappy non-barrier filesystem)?

Whatever we do, I don't see why we'd want to toggle WCE after
negotiation.  If you implement a smart device, you'd need to drop to a
single thread, but you'd definitely lose host-crash reliability.

Cheers,
Rusty.
