OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

# virtio message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]

Subject: Re: [virtio] [PATCH v2] virtio-blk: restore VIRTIO_BLK_F_FLUSH and VIRTIO_BLK_F_CONFIG_WCE

• From: Paolo Bonzini <pbonzini@redhat.com>
• To: "Michael S. Tsirkin" <mst@redhat.com>
• Date: Thu, 2 Jul 2015 20:39:17 +0200


On 02/07/2015 19:12, Michael S. Tsirkin wrote:
>>>> I can't do this inside an {enumerate}.  It is long-ish, but it's just
>>>> documenting what existing hardware does.
>>>
>>> Why not? I just tried this: note nests fine within enumerate.
>>
>> I get an error from LaTeX.
>
> Do you mean XeTex? Also - how did you write it?

\item foo

\begin{note}
blah
\end{note}

\item bar

But let's discuss the text for now.

>>>>> Also, what's the reset value if it is negotiated?
>>>>
>>>> It is not specified.  The device can make it persistent or reset it to a
>>>> user-specified value.
>>>
>>> Does driver have to read it on init then?
>>
>> Read it or set it.  Linux does the latter.
>
> Is there value in specifying a non zero value on reset?

Yes: it makes the guest do what the host administrator thinks it should
do by default, effectively forwarding the host's writethrough/writeback
choice to the guest.

> Maybe we can make this 0 unconditionally?

That would be a behavior change that I'm wary of introducing.

>>>     if (!virtio_has_feature(vdev, VIRTIO_BLK_F_CONFIG_WCE)) {
>>>         aio_context_acquire(blk_get_aio_context(s->blk));
>>>         blk_set_enable_write_cache(s->blk,
>>>                                    virtio_has_feature(vdev, VIRTIO_BLK_F_WCE));
>>>         aio_context_release(blk_get_aio_context(s->blk));
>>>     }
>>>
>>>
>>> is this what this text is trying to say?
>>
>> This is part of the behavior that I want to clarify.
>
> Can't say I see how this maps into the text.
> you don't document what happens without VIRTIO_BLK_F_CONFIG_WCE
> at all, do you?

It's the "neither VIRTIO_BLK_F_CONFIG_WCE nor VIRTIO_BLK_F_FLUSH feature
were negotiated, but VIRTIO_BLK_F_FLUSH was proposed by the device" bit.

Note that the spec mentions it as part of device operation, not
implementation, because the existence of a "blk_set_enable_write_cache"
API is a QEMU detail that doesn't belong in the spec.

>> The point of the first SHOULD is simply to document reasonable behavior
>> for the case of simple drivers that do not care about data loss.
>
> Hmm why do such *drivers* exist, or should be allowed?

They exist because they're old (MS-DOS anyone?).  Because they're
writing transient data.  Because they are configured with the "there's a
UPS or a battery-backed cache somewhere" option.  Linux SCSI disks have
such an option ("echo temporary write through >
/sys/block/sda/device/scsi_disk/*/cache_type"), though virtio-blk doesn't.

Which also answers why they should be allowed: because it's not our

>> Devices that do not support VIRTIO_BLK_F_FLUSH probably would not obey
>> that SHOULD.  But devices that do support VIRTIO_BLK_F_FLUSH need to do
>> better, hence the MUST.
>>
>> Do you think the MUST prevents the trick with "-snapshot"?  I think it
>> doesn't, because the guest cannot see the difference.
>
> Well MUST means if you don't do it you violate spec.

Not necessarily.

If the guest does not see the difference (e.g., won't see the data after
a crash anyway, because of -snapshot), you can do whatever you want.
It's the same "as is" rule that governs compiler optimizations---or
pretty much any other optimization.

>>  (And I think even
>> "MUST"s can be overridden by a user's "I know what I am doing" option
>> such as QEMU's cache.no-flush=on).
>
> I think that's exactly the SHOULD. SHOULD means "do this always unless
> you really know what you are doing". Check out the RFC.

There are (at least) two levels in addition to MUST and SHOULD:

a) it is required to do X.  ~X puts you outside the spec.
No ifs, no buts

b) it is required to do X, but you can do ~X if no one notices.  If
this causes problems, it's your bug.

c) it is required to do X, but the user may force you to do ~X
instead.  If this causes problems, it's the user's bug.

d) it is recommended to do X.  ~X is a fact of life.

(a) is MUST.  (d) is SHOULD.  The question is, what is (b) and (c)?

If we use SHOULD here ("X = avoid data losses after crashes"), we're
saying that data loss after a crash is a fact of life.  I don't think
that's reasonable.

So, I believe (b) is also MUST and there are even specs that acknowledge
this, for example in RFC2616 (HTTP):

When a received protocol element is parsed, the recipient MUST be
able to parse any value of reasonable length that is applicable to
the recipient's role and that matches the grammar defined by the
corresponding ABNF rules.  Note, however, that some received protocol
elements might not be parsed.  For example, an intermediary
forwarding a message might parse a header-field into generic
field-name and field-value components, but then forward the header
field without further parsing inside the field-value.

And I strongly suspect it's more than just intermediaries who do this...

(c) means the user can put you outside the spec, but that's fair enough.
Perhaps the user knows that you're really in the (b) situation.  For
example you're installing a new VM, and if there's a crash you'll just
throw it away.  No one will notice if data isn't flushed.

>>> Above should be conformance statements for the driver.
>>
>> I'll add something, but I don't think there should be conformance