virtio-comment message


Subject: Re: [virtio-comment] Google Comments on Virtio Draft Spec

On 05/06/2014 09:45, Andrew Thornton wrote:
>> 'Configuration'
>>> Device configuration layout
> 1. max_sectors and cmd_per_lun are described as 'hints'
> 1.1. Can these become hard limits rather than 'hints'? (IE
>      devices can reject commands above the cmd_per_lun limit
>      or the max_sectors limit). If so, can we select a specific
>      error to return in that case?

First of all, I think it's not necessary to select an error for these 
cases.  These issues are not specific to virtio-scsi and the command 
will succeed at the virtio-scsi level; for cmd_per_lun the status 
could be BUSY (not VIRTIO_SCSI_S_BUSY!) or TASK SET FULL, and for 
transfer length the SCSI standard says you get INVALID FIELD IN CDB.  
These status and sense codes are defined in the appropriate SCSI 
standards.

The configuration limits are imposed by the hypervisor, so transfer 
lengths or queue depths higher than the values in the configuration 
should cause an error.  They are called hints because the issue is 
quite complex if you do SCSI passthrough: in that case even a transfer 
length or queue depth lower than the limit could trigger an error.

For example, each target or LUN could actually have its own transfer 
limit that is lower than max_sectors.  In this case the initiator 
should look for the Block Limits VPD page anyway.
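
A minimal sketch of how an initiator might combine the two limits.  It 
assumes max_sectors is counted in 512-byte units (as the Linux driver 
treats it) and that the Block Limits VPD page (B0h) carries MAXIMUM 
TRANSFER LENGTH in bytes 8-11, in logical blocks, with 0 meaning "not 
reported".  The function names and the block_size parameter are 
hypothetical.

```python
import struct

def block_limits_max_transfer(vpd_page: bytes) -> int:
    """Return MAXIMUM TRANSFER LENGTH in logical blocks (0 = not reported)."""
    if len(vpd_page) < 12 or vpd_page[1] != 0xB0:
        raise ValueError("not a Block Limits VPD page")
    # Bytes 8-11: big-endian MAXIMUM TRANSFER LENGTH.
    (max_xfer_blocks,) = struct.unpack_from(">I", vpd_page, 8)
    return max_xfer_blocks

def effective_max_bytes(max_sectors: int, vpd_page: bytes,
                        block_size: int = 512, sector_size: int = 512) -> int:
    """Smaller of the virtio-scsi hint and the per-LUN limit, in bytes."""
    limit = max_sectors * sector_size        # hint from the device configuration
    lun_blocks = block_limits_max_transfer(vpd_page)
    if lun_blocks:                           # 0 means the LUN reports no limit
        limit = min(limit, lun_blocks * block_size)
    return limit
```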

As to cmd_per_lun, you could obey cmd_per_lun and still get TASK SET 
FULL responses from the target if the host or other guests are using it 
at the same time.  Perhaps the hypervisor could change that to BUSY 
(again, not VIRTIO_SCSI_S_BUSY), but this is again a generic SCSI target 
implementation issue, not specific to virtio-scsi.
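
To make the distinction between the two layers concrete, here is a 
rough sketch of how a driver might classify a completed request.  The 
numeric values are the virtio-scsi response code and the SAM status 
codes as I understand them; the classify() helper and its return 
strings are purely illustrative.

```python
# virtio-scsi "response" field: transport-level outcome.
VIRTIO_SCSI_S_OK = 0
VIRTIO_SCSI_S_BUSY = 5

# SCSI "status" field: outcome reported by the target itself.
SAM_STAT_GOOD = 0x00
SAM_STAT_CHECK_CONDITION = 0x02
SAM_STAT_BUSY = 0x08
SAM_STAT_TASK_SET_FULL = 0x28

def classify(response: int, status: int) -> str:
    """Look at the virtio-scsi response first, then at the SCSI status."""
    if response == VIRTIO_SCSI_S_BUSY:
        return "transport-busy"      # the virtio-scsi device itself is busy
    if response != VIRTIO_SCSI_S_OK:
        return "transport-error"
    # From here on the command succeeded at the virtio-scsi level.
    if status == SAM_STAT_GOOD:
        return "ok"
    if status in (SAM_STAT_BUSY, SAM_STAT_TASK_SET_FULL):
        return "target-busy"         # e.g. queue depth exceeded on the target
    if status == SAM_STAT_CHECK_CONDITION:
        return "check-sense"         # e.g. INVALID FIELD IN CDB in the sense data
    return "other-status"
```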

> 2. cmd_per_lun describes 'the actual value to be used is the
> minimum of cmd_per_lun and the virtqueue size'.
> 2.1. Does this mean that devices can reject concurrent commands
>      above min(cmd_per_lun, virtqueue_size)?

Yes (with TASK SET FULL status).  Though if virtqueue_size < 
cmd_per_lun, the driver actually won't have room to queue more than 
virtqueue_size items.

> 2.2. Do you really mean 'virtqueue_size'? At minimum a command
>      requires at least 2 entries in the virtqueue. Should this
>      minimum be virtqueue_size / 2?

Not if you use indirect descriptors.
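
As a back-of-the-envelope sketch of the arithmetic in 2.1 and 2.2: 
without indirect descriptors each command occupies at least two ring 
entries, while with indirect descriptors it occupies one.  The helper 
name and the descriptors_per_cmd default are assumptions for 
illustration only.

```python
def max_in_flight(cmd_per_lun: int, virtqueue_size: int,
                  descriptors_per_cmd: int = 2, indirect: bool = False) -> int:
    """Upper bound on concurrently queued commands for one LUN on one queue."""
    # With indirect descriptors a command takes a single ring entry that
    # points at a separate descriptor table.
    slots_per_cmd = 1 if indirect else descriptors_per_cmd
    ring_capacity = virtqueue_size // slots_per_cmd
    return min(cmd_per_lun, ring_capacity)
```

For example, max_in_flight(128, 128) gives 64, while 
max_in_flight(128, 128, indirect=True) gives 128.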

>>> Device operation: requestq
> 1. When a transport returns VIRTIO_SCSI_S_BUSY, can we specify that a
>    guest should retry the request? This would simplify device
>    implementations in the face of resource limitations and would
>    allow guests to control I/O queueing.

That usually makes sense, but it does not have to be that way.  For 
example, under Linux you can mark a request as "failfast" and avoid the 

> 2. When a target is hotunplugged with I/O inflight, can we specify which
> error response will be returned for the now-terminated I/Os?

Either I/O is completed, or it is already documented to be ILLEGAL 

>>> Device operation: controlq
> The ordering of Task Management Function completion with
> respect to requests they are acting on is unspecified. However
> SCSI midlayers require TMF commands complete _after_ the command(s)
> they are aborting/resetting.

Venkatesh and I sorted this out on the upstream linux-scsi mailing list 
for the abort case.

The ordering of completing TMFs vs. requests is now documented, but the 
Linux driver messed up this case.

>    This requires a device ensure ordering between the controlq and
>    requestq processing; for TMF RESET, this means a reset must
>    drain all the request queues (searching for undispatched
>    commands; QEMU does not do this currently and can corrupt guest
>    memory in the worst case).

I think this cannot happen on QEMU if the commands are undispatched 
_and_ the doorbell register has been written, since QEMU is basically 
single-threaded.  If the doorbell register has not been written to, the 
driver is probably buggy (sending a reset and a command at the same time 
is probably not a good idea).

> 1. If we could have a feature flag (VIRTIO_SCSI_F_TMF_ON_REQUESTQ)
>    that allowed TMF commands to be sent down the request queue, ordering
>    would be naturally enforced and devices would save a lot of complexity.

This would be too late for 1.0.  I'm also not convinced it is a good 
idea, for if the request queue is full you cannot send TMFs to abort 
commands.  Also, the virtio-scsi standard does not document how you use 
multiple request queues, and multiple request queues would have the same 
ordering problems as the separate control queue.

> 2. If that is not possible, a guest driver can cycle a
>    no-op command through request queue(s) before aborting/resetting
>    a command. To do this, we need to codify a safe no-op command.
>    We could use a command w/ lun[0] = 0x0 as a safe no-op command.
>    This is currently the case for QEMU, vhost-scsi, and GCE. We would
>    like to have this formalized.

I think it is also too late for this.  It is a safer and smaller change, 
but I'm not sure what the properties of the no-op command would be (e.g. 
with respect to ordering) so I'm afraid of missing some important detail.

A REPORT LUNS command should work well as a no-op.  Or we could document 
that the target SHOULD implement the REPORT LUNS well-known LUN (C1/01), 
and then use a TEST UNIT READY command to that LUN.

The C1/01 well-known LUN would be allowed in addition to 01/tgt/xx/yy 
format.  Since it's a SHOULD, it doesn't even require a feature bit.  I 
sent a patch for that.
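
For illustration, a sketch of the two 8-byte lun encodings and of the 
TEST UNIT READY CDB discussed above.  The 0x40 flat-space bit in the 
third byte follows what the Linux driver does and is an assumption 
here, as are the helper and constant names.

```python
def virtio_scsi_lun(target: int, lun: int) -> bytes:
    """Encode the 01/tgt/xx/yy format: 1, target, single-level LUN, 4 zero bytes."""
    if not 0 <= target <= 0xFF:
        raise ValueError("target must fit in one byte")
    if not 0 <= lun < 0x4000:
        raise ValueError("LUN must fit in 14 bits")
    # Assumption: flat-space addressing (0x40) as used by the Linux driver.
    return bytes([1, target, 0x40 | (lun >> 8), lun & 0xFF, 0, 0, 0, 0])

# REPORT LUNS well-known LUN: C1/01 followed by six zero bytes.
REPORT_LUNS_WLUN = bytes([0xC1, 0x01, 0, 0, 0, 0, 0, 0])

# TEST UNIT READY is a 6-byte CDB with opcode 0x00 and all other bytes zero.
TEST_UNIT_READY_CDB = bytes(6)

def noop_request():
    """Return (lun, cdb) for a candidate no-op: TEST UNIT READY to C1/01."""
    return REPORT_LUNS_WLUN, TEST_UNIT_READY_CDB
```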

