virtio-comment message

Subject: Re: [virtio-dev] virtio-snd comments/questions

From: Anton Yakovlev <anton.yakovlev@opensynergy.com>
To: Radu Ocica <rocica@blackberry.com>, "virtio-comment@lists.oasis-open.org" <virtio-comment@lists.oasis-open.org>
Date: Wed, 29 Nov 2023 16:20:51 +0900

Hello Radu,

Thanks for your detailed comments on the sound device specification, and sorry
for the late reply.


On 19.10.2023 00:32, Radu Ocica wrote:

Hi,
I am working on implementing a virtio-snd vdev.During my work on differentaspects of the implementation I stumbled across the following issues with thevirtio sound device specification (that is part of the virtio 1.3 draft). Insome cases additional clarifications might be enough to address the issue, inothers additional features might be required. Below is a list of the issues,with some suggestions on how the issues could be addressed.
*PCM playback and capture*
*Unclear semantics of PCM_STOP/PCM_START operations*
The virtio-snd specification is not clear on the expectations on how thePCM_STOP control request is to be handled by the vdev, in terms of whether to:
  * drop the remaining buffered audio in the host pcm stream and return
    pending io messages to the guest driver
  * drop the remaining buffered audio in the host pcm stream and keep pending
    io messages in order to be able to prime the host pcm stream in case
    PCM_STOP is followed by PCM_START; return pending io messages only when
    receiving PCM_RELEASE
  * drain the remaining buffered audio in the host pcm stream and send a
    PCM_STOP response upon drain completion, return pending io messages to the
    guest driver as they are completed (before PCM_STOP response); drain
    capability might or might not be available for the host pcm stream
  * start draining the remaining buffered audio in the host pcm stream and
    send a PCM_STOP response immediately, return pending io messages to the
    guest driver as they are completed (after PCM_STOP response); if receiving
    PCM_RELEASE before drain completion hold off the response until all
    pending io messages are completed and returned to the guest driver;
    drainÂcapability might or might not be available for the host pcm stream
  * pause the host pcm stream and maintain buffered audio / pending io
    messages as is; pause capability might or might not be available for the
    host pcm stream
If the thinking is that the vdev on the host can decide on how to implementPCM_STOP based on the host stream capabilities, the guest driver has noinformation about the choice made by the vdev. In our opinion the choice madeby the vdev in implementing PCM_STOP dictates what the possible transitionsafter PCM_STOP can be (if PCM_STOP is implemented as pause, PCM_START andPCM_RELEASE are possible transitions, but if PCM_STOP is implemented as drain,the only valid transition is PCM_RELEASE and if PCM_STOP is implemented asdrop and all pending io messages have been returned to the guest driver, againthe only valid transition seems to be PCM_RELEASE), but the guest driver isthe one issuing the next request and it might at the best be able to make aneducated guess how the vdev implemented PCM_STOP based on whether all thepending io messages are returned or not to the guest driver. To make such aguess possible, more clarifications would be required in the spec to identifywhich of the scenarios listed above are considered by the spec.


In general, the specification describes only the interface between the device
and the driver and does not (should not) describe any implementation details.
So you can choose any of the listed options depending on your needs or backend
capabilities. Also keep in mind that the ALSA driver in Linux is only a
special case, and that there are also non-ALSA systems (Windows, etc). So it
would be wrong for a device to rely on the behavior of a particular driver
variant.

But it looks like the following points could be improved:

  1. As you proposed, make pause to be an explicit operation: add a new stream
     feature and two new control requests.

  2. Clarify the state machine: it seems that instead of allowed transitions
     (of which there can be many), it's better to list only invalid ones.

Note 1:Â The reference virtio-snd linux/android guest driver explicitlydoesn't support the ALSA SNDRV_PCM_INFO_DRAIN_TRIGGER feature flag, whichmight indicate there was no intent to support drain semantics.


Because it's not entirely clear that this would make sense.

In the current implementation, when ALSA drains a stream, it simply waits for
the device to process all pending periods, and then issues PCM_STOP. I.e. the
device continues to operate as normal, it just does not receive explicit
notification in the process (as the specified flag suggests).

But perhaps if drain was also made an explicit request, it could help
distinguish between the drain/drop case for the PCM_STOP request.

Note 2: The reference virtio-snd linux/android guest driver explicitly doessupport the ALSA SNDRV_PCM_INFO_PAUSE feature flag, which seems to indicateintent to support pause semantics, along with drop semantics

Because, when discussing the draft specification, it was decided that
PCM_STOP -> PCM_START -> ... sequence fits well in this case. But, yes, as
practice has shown, this only complicates things.

Note 3: The Android Audio HAL and tinyalsa/tinyplay don't have any notion ofdrain or pause. At the end of a playback session there is a wait forpreviously buffered data to complete playing using the tinyalsa pcm_wait()function, and then a PCM_STOP request is issued after the buffered audio hascompleted playing. Drop semantics are assumed for PCM_STOP in this context.


Not entirely true, because in the case of tinyalsa, immediately after PCM_STOP
comes PCM_RELEASE. And if PCM_STOP implements a "pause", then a "drop" can be
performed in a subsequent PCM_RELEASE. I.e. no difference with the typical
case for Linux application.

*Information on period alignment and max audio buffer size*
No information about period alignment and min/max audio buffer/period size iscurrently available in struct virtio_snd_pcm_info. Ideally the period sizeused for io messages should match the period size used on the host. That willguarantee consistency in timing, precise accounting of available room to writemore data or available data to read, across host and guest. A workaround ispossible, where the device will fail a PCM_SET_PARAMETERS request thatspecifies an unsupported period alignment or audio buffer/period size and theguest is forced to use a period and buffer size that is known to be supportedby the host device (this can be done in an Android guest, by hard coding theandroid audio HAL period size and the number of periods and thus effectivelythe audio buffer size that is used for the virtio sound devices in the guest;in android and linux guests it can be done also by setting the virtio_sndmodule parameters, to the effect of locking down the period and buffer size).


The specification, although based on ALSA, is not a para-virtualization of
ALSA. And what you describe is exactly that.

And, unless the backend implements a pure ALSA client, this has no practical
meaning. Usually the backend works through some kind of audio server, and
there is no way to find out the real size of the hardware buffer and period.
Consistency in timing for reading/writing is already guaranteed by these
servers, since they also need to get everything done on time.

*General Issues*

  * There is no statement regarding the endianness of PCM data exchanged via
    txq and rxq queues and the virtio sound PCM format definitions do not
    include endianness.


Virtio standard is little-endian. Thus, all sample formats are also LE. So, no
clarification is needed.

  * There is currently no requirement for the guest driver to enqueue an io
    message only after it has completed writing data to it (playback) or
    reading data from it (capture). The linux/android guest driver currently
    enqueues an io message for playback immediately after calling
    snd_pcm_period_elapsed(). snd_pcm_period_elapsed() makes it possible for
    the guest client app to write new data, but upon the return of this
    function there is no guarantee that the guest client app has already
    written new data. If the vdev dequeues the io message immediately and
    copies the PCM data for that io request before the guest client app has
    actually written new data to it, it results in out-of-sequence/stale data
    being played back.


Yes, these was a bug in the driver which already got a fix.

  * There is no statement how the guest driver should react to a control

request failing.

      o For PCM control requests the spec details what the "possible
        transitions" are after each control request. It doesn't specify
        whether this applies only to successful control requests, or
        irrespective of the success of the control requests. For instance for
        PCM_START only PCM_STOP is listed as a valid transition. What happens
        if PCM_START fails though? Is PCM_STOP still a valid transition? Is a
        repeat of PCM_START valid, since the previous PCM_START failed? Is a
        PCM_RELEASE required in order for the guest driver to recover the io
        request descriptors that were sent to the vdev before PCM_START? Is a
        PCM_STOP required in this case, in order to be able to send
        PCM_RELEASE (because of valid sequences considerations)? It was seen
        that the reference linux/android guest driver ignores a PCM_START
        failure and still sends io requests after the failed PCM_START.


I/O requests may be sent only prior to PCM_START. But buffers are sent via TX-
queue, while control request is sent via control-queue. Thus, depending on
timings (vmexit's, etc), these notifications may arrive in different order.

Therefore, it makes no sense to require the driver not to send I/O messages in
the case of failed PCM_START, since this requirement cannot be implemented in
practice.

Still, the following control sequence is always valid (as already stated in
the spec):

PCM_PREPARE -> PCM_START -> PCM_STOP -> PCM_RELEASE

      o Should a failed control request be retried with a maximum retry count?
        Here the danger is that retrying forever without any wait in between
        retries might result in a high CPU load. Should there be a dedicated
        status value that tells the guest not to retry a failed request (or to
        use a error handling sequence involving RELEASE and PREPARE)?


No, it's not needed. In general, applications in the userspace try to restart
the stream themselves in case of an error.


  * There is no statement how the guest driver should react to an io message

failing

      o Should it stop sending io messages and just send PCM_STOP? Or retry
        sending io requests hoping that they will succeed? Should there be a
        maximum retry count? Here the danger is that retrying forever without
        any wait in between might result in a high CPU load.


There is no statement because this is already an implementation detail that is
not in the scope of the specification. The spec describes only interface.


  * The spec seems to indicate that PCM_PREPARE after PCM_STOP or
    PCM_SET_PARAMS after PCM_STOP would be invalid, still the linux/android
    guest driver are sending controlÂ requests in these sequences (seen when
    playing back from android UI and PCM_START handling is modified to fail
    for the sake of testing the behavior). Is the spec out of date? Is the
    reference linux/guest android driver not compliant to the spec? Are these
    sequences made possible by the fact that the PCM_START previous to
    PCM_STOP actually failed? If these are valid sequences, the virtio spec
    should be updated to show them as valid.


Yes, we need to re-check the state machine once again.

  * The virtio spec mentions that for playback, between PCM_PREPARE and
    PCM_START the guest driver should send io messages to prime the host
    playback stream. There is no statement what shall be done wrt io messages,

in the following situations:

      o between PCM_STOP and PCM_START: this goes back to the semantics of
        PCM_STOP: if PCM_STOP had drop semantics, all outstanding io messages
        could be returned to the guest; if PCM_STOP had pause semantics,
        outstanding io messages could be kept on the device side and returned
        to the guest only upon the receipt of a PCM_RELEASE request from the
        guest; if PCM_STOP had drain semantics, the outstanding io messages
        could be returned to the guest as they are completed
      o after a failed PCM_START: since PCM_START failed, all io messages sent
        by the guest after PCM_PREPARE could be returned to the guest; or this
        could be deferred to a PCM_STOP or PCM_RELEASE request


No clarification for the I/O messages is need here. But we need to rethink
pause and drain semantics.


  * There is no way for the vdev to notify the guest driver that there is an
    error condition that requires a new PCM_PREPARE in order to clear. The
    vdev could send the VIRTIO_SND_EVT_PCM_XRUN, which requires PCM_PREPARE to
    clear, but this would be abusing the spec, as VIRTIO_SND_EVT_PCM_XRUN is
    meant exclusively for XRUN errors.


Why can't such an exceptional error be a variant of XRUN? If the stream must
be restarted, then it cannot continue writing/reading I/O messages. Which, in
turn, will naturally lead to XRUN state on the driver side.

  * There is no way for the vdev to notify the guest driver that a stream is
    disconnected (e.g. host audio device has been removed - think USB audio
    device - or reinserted).


Such things should be entirely within the purview of the specific device
implementation. The driver has nothing to do with this.

Even if the device could send such notifications, what could the driver do in
this case?  It can only set XRUN status for this stream. And if so, then the
device can already do this.


*Recommendations:*

  * Add explicit statement about endianness of PCM data exchanged via txq and
    rxq queues.
  * Add explicit requirement for the guest driver to enqueue an io message
    only after it has completed writing data to it (playback) or reading data
    from it (capture)
  * Add explicit requirement for control and io message failures to be handled
    on the guest driver side. It would be expected that if PCM_START fails,
    for instance, the guest driver doesn't send any further io messages. For
    io messages, at the minimum the guest driver should use a count of
    successive failures and stop retrying. If the handling for a failed
    control request is retrying the same request, there should be a limited
    number of retries. Specific to PCM_START failure, it should be specified
    whether PCM_STOP and maybe PCM_RELEASE and PCM_PREPARE is required before
    attempting any PCM_START retry. The behavior could be specified as
    dependent on whether the vdev has returned the pending io messages when
    handling PCM_STOP (if all pending io messages returned at PCM_STOP,
    perform PCM_RELEASE and PCM_PREPARE before retrying).
  * Add clarification for handling of oustanding io messages on the device
    side upon a failed PCM_START control request, or upon a PCM_STOP control
    request
  * Update valid transitions for each PCM control request, if specification
    out of date (see above for control request sequences seen with the
    reference linux/android guest driver, that are currently not specified as
    valid in the virtio sound specification)
  * Add explicit statement that the documented valid transitions for PCM
    control requests are for successful control requests and in the case of
    failed requests, the valid transitions are same as before the failed
    control request (if this statement is correct, if not clarify what the
    expectation is)
  * It would be beneficial if the virtio-snd specification did not leave the
    interpretation of PCM_STOP semantics to the vdev (drop vs pause vs drain),
    without a clear contract between vdev and guest driver
    If the spec intends to allow the host to use both pause/resume and drop/go
    (and maybe even drain/go) semantics for the START/STOP operations, then
    for clarity it should use a separate request (PCM_PAUSE) for pause and if
    drain semantics are to be included as well, then a separate request
    (PCM_DRAIN) for drain as well. PCM_START could be used to restart/resume
    in all cases. Ideally there should be the following optional stream features:

      o VIRTIO_SND_PCM_F_PAUSE - if offered by the vdev it indicates support
        for pause-resume both in the vdev and the host pcm stream; new request
        VIRTIO_SND_R_PCM_PAUSE with valid transitions START or RELEASE; START
        would have a valid transition to PAUSE;
      o VIRTIO_SND_PCM_F_DRAIN - if offered by the vdev it indicates support
        for drain both in the vdev and the host pcm stream, new request
        VIRTIO_SND_R_PCM_DRAIN with valid transitions RELEASE; START would
        have a valid transition to DRAIN;
      o VIRTIO_SND_R_PCM_STOP would retain drop semantics and have possible
        transitions RELEASE;
      o A failed VIRTIO_SND_R_PCM_START would have the only possible
        transition RELEASE;
      o Pending io messages would only need to be returned to the guest driver
        when handling VIRTIO_SND_R_PCM_RELEASE
      o For priming the host pcm stream, io messages would only need to be
        sent by the guest driver between VIRTIO_SND_R_PCM_PREPARE and
        VIRTIO_SND_R_PCM_START
      o The above approach would highly simplify the spec in terms of
        documenting possible transitions and when to send io messages to prime
        the host pcm stream and when to return pending io messages to the
        guest driver, as opposed to overloading the use of PCM_STOP for
        drop/pause(/drain), when the behavior wrt io messages would highly
        depend on the host interpretation of PCM_STOP.

  * It would be beneficial to add two new PCM notifications, via the following

stream features:

      o VIRTIO_SND_PCM_F_EVT_ERR - indicates support for new notification
        VIRTIO_SND_EVT_PCM_ERROR, that signals to the guest driver that en
        error occurred (other than structly underrun/overrun) while handling
        io messages, requiring a sequence of PCM_RELEASE, PCM_PREPARE,
        PCM_START to recover
      o VIRTIO_SND_PCM_F_EVT_CONN_STATE - indicates support for new
        notifications VIRTIO_SND_EVT_PCM_DISCONNECTED and
        VIRTIO_SND_EVT_PCM_CONNECTED, to indicate to the guest driver pcm
        stream disconnected (e.g. USB device removed) and pcm stream connected
        (e.g. USB device re-inserted)

  * It would be beneficial to include Information on period alignment and max
    buffer size in struct virtio_snd_pcm_info, available via optional feature
    flag, e.g. VIRTIO_SND_PCM_F_PERIOD_INFO and the additional info would be
    added at the end of struct struct virtio_snd_pcm_info, replacing the
    padding bytes and adding to the length of the struct - new le32 fields
    period_align_bytes, min_period_bytes, max_period_bytes, max_buffer_bytes,
    where min_period_bytes, max_period_bytes and max_buffer_bytes are all
    multiples of period_align_bytes. With this addition, the guest driver
    would have the information required to match the period/buffer size
    capabilities of the host pcm stream and advertise them to clients in the
    guest space, without the need to hard code these using the guest
    virtio_snd module parameters.
  * Alternatively a new control request to query the refined PCM parameters
    could be added (e.g. PCM_GET_PARAMS, via PCM feature flag
    VIRTIO_SND_PCM_F_GET_PARAMS), to be used by the guest driver after it has
    successfully sent PCM_SET_PARAMS, in order to query the PCM parameters
    effectively used by the device.

*Audio Controls*

The original virtio-snd device specification as added to virtio 1.1 did notinclude audio control support, however the virtio 1.3 draft includes this.

*Audio Control Roles*

The roles describe standard uses for controls, like volume, mute and gain.There is also an undefined role definition that according to the spec must notbe used.


  * VIRTIO_SND_CTL_ROLE_UNDEFINED
  * VIRTIO_SND_CTL_ROLE_VOLUME
  * VIRTIO_SND_CTL_ROLE_MUTE
  * VIRTIO_SND_CTL_ROLE_GAIN

Problems identified with the role definitions:

  * Not all audio controls can be described as volume, mute or gain. The
    undefined role must be allowed for any control that is not for volume,
    mute or gain.


The role field is intended to support non-ALSA drivers. Usually such drivers

only need hardware volume control (and do not need complex ALSA controlmachinery), then instead of parsing control names they can quickly check the

value of this field.

ALSA driver can completly ignore it.

  * What is the difference between VIRTIO_SND_CTL_ROLE_VOLUME and
    VIRTIO_SND_CTL_ROLE_GAIN, is the first supposed to be used for playback
    and the second for capture? Or is the first supposed to be used for
    digital volume and the second for analog gain? Is a distinction between
    volume and gain necessary?


Yes, ROLE_VOLUME is for playback and ROLE_GAIN is for capture.

  * If certain data types are assumed for controls with role
    VIRTIO_SND_CTL_ROLE_VOLUME, VIRTIO_SND_CTL_ROLE_MUTE or
    VIRTIO_SND_CTL_ROLE_GAIN, this should be explicitly specified (e.g.
    BOOLEAN for VIRTIO_SND_CTL_ROLE_MUTE, INTEGER for
    VIRTIO_SND_CTL_ROLE_VOLUME and VIRTIO_SND_CTL_ROLE_GAIN?).


No, there is no such restriction.

*Mute control value meaning*
The virtio spec is not clear whether a value of 1 for an audio control withrole VIRTIO_SND_CTL_ROLE_MUTE means muted, or rather switched on (unmuted).Testing end-to-end with the reference virtio-snd linux/android guest driver itwas inferred that 1 means switched on, i.e. unmuted. In our opinion thisshould be clarified within the spec.


The spec describes only the interface for providing controls to the guest.

The device implementation may provide a control that will work like one of the
standard ALSA controls (for example, 'Master Playback Switch'). In this case,
the author of the device should go read the relevant ALSA documentation.

Or the device implementation may provide some kind of non-standard control
with a custom âprotocolâ. Then the author of the device should adapt the guest
applications to use such a control.

*TLV support*
The TLV struct and possible type values are considered well known from ALSA.
There are only a few common/standard TLV type definitions (for specific metadata types, like db scale, db min/max plus a handful more or the container TLVtype that allows packing multiple TLVs into one containing TLV); they shouldbe defined in virtio-snd.h for the completeness of the specification.Otherwise a non ALSA based host that supports reporting metadata would need toredefine these to match the ALSA definitions


As mentioned in the previous comment, the spec refers to the ALSA docs for
this, that's enough.

One of the main problems with redefining from external sources is that if
there are any changes, then each such change will have to be reflected in this
sound device specification as well.

The audio controls extension proposal draft includes the definition of structvirtio_snd_ctl_tlv. It is removed however in the virtio_snd.h header file ofthe linux/android virtio-snd guest driver implementation for this extension -the guest driver just performs memcpy to/from the equivalent ALSA struct. Forthe completeness of the specification all data structures used for dataexchanges within the virtio spec should be part of the public header file.This is particularly important when the host or guest don't use ALSA and don'thave access to the ALSA TLV struct definition.Other than for metadata of controls for volume/gain, in ALSA audio controlscan use TLVs for representing the actual control data (such controls use typeVIRTIO_SND_CTL_TYPE_BYTES and access VIRTIO_SND_CTL_ACCESS_TLV_READ,VIRTIO_SND_CTL_ACCESS_TLV_WRITE or a combination ofVIRTIO_SND_CTL_ACCESS_TLV_READ and VIRTIO_SND_CTL_ACCESS_TLV_WRITE). Sincethis use seems to be supported by the reference linux/android virtio-snd guestdriver, it should be documented in the spec.


The same comment as above.

*virtio_snd_ctl_info struct alignment*
Field "name" of the virtio_snd_ctl_info struct has a length of 44 bytes. Withthis field length, the alignment of the following value union field with__le64 data members depends on packing and/or alignment compiler directivesthat can differ between host and guest and thus result in a differentalignment of "value". Klocwork detects this issue as a PORTING.STORAGE.STRUCTguideline violation for this struct, which can be fixed by modifying thelength of field "name" to 48, or by adding a padding field of length 4 bytesbetween the "name" and "value" fields (e.g. __u8 padding[4]).


To be honest, I don't understand where the problem is here.

In the current definition, __le64 fields are already located on an 8-byte
boundary, the preceding "index" field is located on a 4-byte boundary, which
should already satisfy the requirements of native alignment. And "name" fits
fine between them as a byte array.

I mean, if you make it 48-byte size, then there should be a 4-byte hole
between "index" and "name", no? And if so, it would just reproduce the same
issue.


Kind regards,

Looking forward to clarifications and/or responses.
Radu Ocica

------------------------------------------------------------------------------
This transmission (including any attachments) may contain confidentialinformation, privileged material (including material protected by thesolicitor-client or other applicable privileges), or constitute non-publicinformation. Any use of this information by anyone other than the intendedrecipient is prohibited. If you have received this transmission in error,please immediately reply to the sender and delete this information from yoursystem. Use, dissemination, distribution, or reproduction of this transmissionby unintended recipients is not authorized and may be unlawful.


--
Anton Yakovlev
Senior Software Engineer

OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin