Implement VirtIO sound device

Open Cuda-Chen opened this issue 1 year ago • 10 comments

Currently, semu lacks a sound playback feature.

To implement it, we can use VirtIO sound with the ALSA architecture.

Cuda-Chen avatar Jul 19 '24 14:07 Cuda-Chen

we can use VirtIO sound with ALSA architecture.

Can you illustrate the progress and the potential integration considerations?

jserv avatar Aug 22 '24 15:08 jserv

Hi @jserv ,

For the progress:

  • The system boots and the ALSA driver is registered (using make check rather than sudo make check, which complains about incorrect-parameter errors).
  • The host system can see the ALSA driver in system settings (named ALSA plug-in [semu] in Volume Levels).

For potential integration considerations:

  • We should install the ALSA utilities such as alsa-utils for testing the sound device.
  • TinyALSA can be used if semu aims to be a lightweight emulator.

Cuda-Chen avatar Aug 22 '24 23:08 Cuda-Chen

Hi @jserv ,

Regarding the supported operations mentioned in https://github.com/sysprog21/semu/pull/53: for semu to play sound, I think it needs to support more operations (those described in the official VirtIO document). Should we investigate and then list the operations that have to be implemented to support common sound operations (e.g., playing sound, querying sound device information, etc.)?

Cuda-Chen avatar Sep 08 '24 09:09 Cuda-Chen

should we investigate then list the operations that have to be implemented to support the common sound operation (e.g., playing sound, querying sound device information, etc.)?

Yes, go ahead.

jserv avatar Sep 08 '24 14:09 jserv

For this issue, I am going to implement a VirtIO sound device that supports these operations:

  • querying the device information
    • VIRTIO_SND_R_PCM_INFO
    • VIRTIO_SND_R_CHMAP_INFO
    • VIRTIO_SND_R_JACK_INFO
  • playing sound (PCM, specifically)
    • VIRTIO_SND_R_PCM_SET_PARAMS
    • VIRTIO_SND_R_PCM_PREPARE
    • VIRTIO_SND_R_PCM_RELEASE
    • VIRTIO_SND_R_PCM_START
    • VIRTIO_SND_R_PCM_STOP
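
As a rough illustration of the list above, the request codes could be dispatched as in the sketch below. The numeric values follow the request and status codes listed in the VirtIO sound device specification; the function name snd_ctl_status() is hypothetical and is not taken from the semu source. The sketch only decides whether a request is in the planned set; it does not perform the operation.

```c
#include <stdint.h>

/* Request and status codes from the VirtIO sound device specification.
 * Only the requests planned in this issue are listed. */
enum {
    VIRTIO_SND_R_JACK_INFO = 1,
    VIRTIO_SND_R_PCM_INFO = 0x0100,
    VIRTIO_SND_R_PCM_SET_PARAMS,
    VIRTIO_SND_R_PCM_PREPARE,
    VIRTIO_SND_R_PCM_RELEASE,
    VIRTIO_SND_R_PCM_START,
    VIRTIO_SND_R_PCM_STOP,
    VIRTIO_SND_R_CHMAP_INFO = 0x0200,

    VIRTIO_SND_S_OK = 0x8000,
    VIRTIO_SND_S_BAD_MSG,
    VIRTIO_SND_S_NOT_SUPP,
    VIRTIO_SND_S_IO_ERR,
};

/* Hypothetical helper: return the status the device would report for a
 * control request code, based only on whether the code is in the planned
 * set. A real handler would run the operation before answering. */
static uint32_t snd_ctl_status(uint32_t code)
{
    switch (code) {
    case VIRTIO_SND_R_JACK_INFO:
    case VIRTIO_SND_R_PCM_INFO:
    case VIRTIO_SND_R_CHMAP_INFO:
    case VIRTIO_SND_R_PCM_SET_PARAMS:
    case VIRTIO_SND_R_PCM_PREPARE:
    case VIRTIO_SND_R_PCM_RELEASE:
    case VIRTIO_SND_R_PCM_START:
    case VIRTIO_SND_R_PCM_STOP:
        return VIRTIO_SND_S_OK;
    default:
        /* Anything else is reported as unsupported. */
        return VIRTIO_SND_S_NOT_SUPP;
    }
}
```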

Cuda-Chen avatar Sep 10 '24 07:09 Cuda-Chen

Update:

  • Sound Card
    • After investigation, the VirtIO SoundCard at platform/f4400000.virtio/virtio2 is actually attached to the kernel.
    • However, I can only see controlCX (X stands for the card number of VirtIO SoundCard) in /dev/snd/.
    • In comparison, Loopback creates the following files in /dev/snd/:
      • controlCX
      • pcmCXDYc
      • pcmCXDYp
  • PCM
    • As the VirtIO SoundCard does not expose any endpoint for PCM playback, there is no way to play sound via VirtIO SoundCard.

I am going to solve the sound card endpoint issue first.

Cuda-Chen avatar Oct 06 '24 03:10 Cuda-Chen

Update:

  • Sound Card
    • The VirtIO Sound Card can be brought up after initialization.
    • I can see the pcmCXDYc appears in /dev/snd/, and aplay -l can list the VirtIO Sound Card.
  • PCM
    • speaker-test and aplay exit with a non-zero return value because the driver sends an unknown type (0x0) to virtio-snd after the pcm_prepare state.

Cuda-Chen avatar Oct 10 '24 13:10 Cuda-Chen

Status update:

  • PCM
    • After investigation, the semu VirtIO device receives data from the TX queue after the VIRTIO_SND_R_PCM_PREPARE state.
    • I am going to implement receiving the PCM frames from the TX queue into a buffer so that we can play the sound.

Cuda-Chen avatar Oct 26 '24 03:10 Cuda-Chen

  • I am going to implement to receive the PCM frames from TX queue to buffer so that we can play the sound.

Do you think single-threaded queue manipulation is enough? I am not sure that such a TX queue can be operated without extra threads.

jserv avatar Oct 26 '24 15:10 jserv

Do you think whether if single-threaded queue manipulation is enough. I am not sure that such TX queue can be operated without extra threads.

From my current findings, QEMU and rust-vmm do not use any extra threads to operate the TX queue. However, we may consider using extra threads, as it seems we have to notify the device to complete the transmission once it gets a PCM frame from the TX queue.

Cuda-Chen avatar Oct 27 '24 09:10 Cuda-Chen

Status update:

  • PCM
    • Findings: the driver sends an arbitrary number of PCM frames when transferring data.
    • I will implement a data structure that holds this arbitrary number of frames (e.g., a queue or a scatter list).

Cuda-Chen avatar Nov 02 '24 07:11 Cuda-Chen

Status update:

  • PCM
    • The queue holding an arbitrary number of PCM frames is implemented.
    • The driver (guest Linux OS) keeps sending PCM frames after the device has received them. It seems that we have to send some signal to inform the driver that the device has received the PCM frames and that it should send pcm_start to play the sound.

Cuda-Chen avatar Nov 18 '24 02:11 Cuda-Chen

Update:

  • PCM
    • Create a dedicated thread to handle TX.
    • The pcm_release signal is sent asynchronously.

Actions:

  • PCM
    • Create a lock in pcm_start/pcm_stop state.
    • Check the possible implementation to handle pcm_release state.

Cuda-Chen avatar Dec 02 '24 02:12 Cuda-Chen

  • PCM
    • Create a lock in pcm_start/pcm_stop state.
    • Check the possible implementation to handle pcm_release state.

@idoleat, can you comment on the recent work in #53?

jserv avatar Dec 02 '24 03:12 jserv

  • PCM

    • Create a lock in pcm_start/pcm_stop state.
    • Check the possible implementation to handle pcm_release state.

@idoleat, can you comment the recent work of #53 ?

Hi @jserv ,

I think you may have tagged the wrong person. So should I summarize the recent work of #53?

Cuda-Chen avatar Dec 02 '24 07:12 Cuda-Chen

I thought you may tag the wrong person.

No, I was intentionally referring to him.

should I conclude the recent work of #53?

Go ahead.

jserv avatar Dec 02 '24 07:12 jserv

should I conclude the recent work of https://github.com/sysprog21/semu/pull/53?

Go ahead.

Hi @jserv , I am going to summarize the recent work in this comment.

Past

  • Initialize the VirtIO sound device.
  • Implement a thread for receiving the PCM frames sent from the driver.

Ongoing

  • Implement pcm_release state.
  • Implement a ring buffer to store the PCM frames for playback.
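
A ring buffer of the kind mentioned in the last item could look like the minimal sketch below. The names pcm_ring_t, pcm_ring_write(), pcm_ring_read() and the capacity PCM_RING_SIZE are hypothetical and are not taken from PR #53. The sketch is single-threaded; with a dedicated TX thread, the producer and the playback side would additionally need a lock or atomics around these calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCM_RING_SIZE 4096 /* hypothetical capacity in bytes */

typedef struct {
    uint8_t data[PCM_RING_SIZE];
    size_t head; /* index of the next byte to read */
    size_t len;  /* number of bytes currently stored */
} pcm_ring_t;

/* Copy up to n bytes of PCM data into the ring.
 * Returns the number of bytes actually stored (less than n when full). */
static size_t pcm_ring_write(pcm_ring_t *r, const uint8_t *src, size_t n)
{
    size_t space = PCM_RING_SIZE - r->len;
    if (n > space)
        n = space;
    size_t tail = (r->head + r->len) % PCM_RING_SIZE;
    size_t first = PCM_RING_SIZE - tail; /* bytes before the wrap point */
    if (first > n)
        first = n;
    memcpy(r->data + tail, src, first);
    memcpy(r->data, src + first, n - first); /* wrapped remainder, may be 0 */
    r->len += n;
    return n;
}

/* Copy up to n bytes out of the ring for playback.
 * Returns the number of bytes actually read (less than n when empty). */
static size_t pcm_ring_read(pcm_ring_t *r, uint8_t *dst, size_t n)
{
    if (n > r->len)
        n = r->len;
    size_t first = PCM_RING_SIZE - r->head;
    if (first > n)
        first = n;
    memcpy(dst, r->data + r->head, first);
    memcpy(dst + first, r->data, n - first);
    r->head = (r->head + n) % PCM_RING_SIZE;
    r->len -= n;
    return n;
}
```

Returning the number of bytes stored, rather than failing, lets the caller decide whether to drop, retry, or block when the ring is full.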

Cuda-Chen avatar Dec 02 '24 09:12 Cuda-Chen

For my own clarity, I would like to confirm that the workflow is as follows:

  • The driver (guest Linux OS) prepares PCM frame data in memory as a descriptor
  • The driver (guest Linux OS) signals the TX thread to take the descriptor and, as an audio client, tell the audio server (PulseAudio/PipeWire) where the PCM frame data is
  • The TX thread interrupts the driver (guest Linux OS) to signal that it has instructed the audio server to play the PCM frame data, so the data can be cleaned up/overwritten in memory

I have some concerns about the third step, since I see the TX thread send the interrupt right after finishing step 2 (correct me if I'm wrong). Typically, a sound card sends an interrupt every period, and the ALSA driver polls that interrupt to update the hardware pointer for the buffer. If the interrupt is sent right after step 2, the audio server may have nothing to play because the driver (guest Linux OS) thinks the data has already been played. The driver may also conclude that an audio under-run is happening and try to address it.

From my understanding, the period length normally depends on the capability of the sound card hardware and the buffer size. So we may use a length that reflects the real hardware, or adjust it for other reasons such as batch or fine-grained submission, as long as it does not cause under-runs, over-runs, or long latency. To send an interrupt every period, could we use SIGALRM? Hooking into real hardware interrupts could be too tedious. I've read comments above saying that the driver may prepare arbitrary size of PCM frame data. So to introduce periods, we may need to split the PCM frame data into one chunk per period.
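
The period arithmetic discussed above can be sketched as follows. This is only an illustration of the relationship between sample rate, period size, and the interval at which a period notification would be due; the struct pcm_params_t and the helper names are hypothetical and do not come from semu or ALSA.

```c
#include <stdint.h>

/* Hypothetical stream parameters, as would be negotiated before playback. */
typedef struct {
    uint32_t rate;          /* frames per second, e.g. 48000 */
    uint32_t channels;      /* e.g. 2 for stereo */
    uint32_t sample_bytes;  /* bytes per sample, e.g. 2 for 16-bit PCM */
    uint32_t period_frames; /* frames per period */
} pcm_params_t;

/* Size of one period in bytes. */
static uint32_t period_bytes(const pcm_params_t *p)
{
    return p->period_frames * p->channels * p->sample_bytes;
}

/* Time one period takes to play, in microseconds (rounded down).
 * A periodic timer with this interval would pace the notifications. */
static uint64_t period_usec(const pcm_params_t *p)
{
    return (uint64_t) p->period_frames * 1000000u / p->rate;
}

/* Number of period-sized chunks needed to hold nbytes of PCM data;
 * the last chunk may be only partially filled. */
static uint32_t periods_in(const pcm_params_t *p, uint32_t nbytes)
{
    uint32_t pb = period_bytes(p);
    return (nbytes + pb - 1) / pb;
}
```

For example, 1024 frames per period at 48000 Hz, stereo, 16-bit gives 4096 bytes per period and roughly 21.3 ms between period notifications.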

Edit: fix typo

idoleat avatar Dec 18 '24 17:12 idoleat

Hi @idoleat , I would like to comment on the workflow steps you are concerned about, all based on my understanding of the VirtIO standard. If you think there is still some room for uncertainty, just let me know. For each item in your list, and for some of the statements you made, I will give my observations from working with the VirtIO standard and the VirtIO sound driver (namely, the guest Linux OS with Linux kernel version 6.1).

Before we begin, let's look at the virtqueues that a VirtIO sound device uses to communicate with the VirtIO sound driver (from the Virtqueues section for the sound device):

  1. controlq: sending control messages from driver to device.
  2. eventq: sending notifications from device to driver.
  3. txq: sending PCM frames for output streams (i.e., playback).
  4. rxq: sending PCM frames for input streams (i.e., capture).

Then, let me reply to the items you listed.

  • The driver (guest Linux OS) prepares PCM frame data in memory as a descriptor

By the VirtIO standard, the driver prepares PCM frames in memory, then sends these PCM frames via txq to the device.

  • The driver (guest Linux OS) signals the TX thread to take the descriptor and tell audio server (PulseAudio/Pipewire) where the PCM frame data is as a audio client

This is not mentioned in the VirtIO standard. In my implementation, as we are using MMIO, I use a TX thread to receive the PCM frames sent from the driver so that the device's main thread does not hang merely to receive PCM frames. As far as I can tell, other publicly available implementations such as QEMU and rust-vmm do not use a TX thread for receiving PCM frames, and I guess the reason is that they use the PCI interrupt mechanism. What's more, the PCM features can be selected according to 5.14.6.6.2 VIRTIO_SND_R_PCM_INFO. Although the PCM frames I receive seem to have values within the range of the short datatype, I guess the host playback plays nothing because it may just be sending the memory area (using VIRTIO_SND_PCM_F_SHMEM_*).

  • TX thread interrupts the driver (guest Linux OS) that it has instructed audio server to play the PCM frame data so it can be cleaned up/overwritten in memory

As I understand it, there are two ways to interrupt the driver:

  1. PCM I/O messages: in 5.14.6.8 PCM I/O Messages, the standard mentions: the completion of such an I/O request can be considered an elapsed period notification.
  2. PCM notifications: using eventq, as mentioned in 5.14.6.7 PCM Notifications.

I've read comments above saying that the driver may prepare arbitrary size of PCM frame data.

Sorry for my unclear writing. It should read as follows: the driver sends an arbitrary number of virtqueue descriptors. For instance, when the device receives PCM frames from the driver, the driver may send one request descriptor, followed by two payload descriptors consisting of PCM frames (the sizes of these payloads sum to the buffer_bytes set by VIRTIO_SND_R_PCM_SET_PARAMS), and finally one response descriptor.
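
The descriptor layout described above (one request, some payloads, one response) could be walked as in the sketch below. The two structs mirror the PCM I/O header and status layouts from the VirtIO sound specification; the desc_t type and pcm_xfer_payload_bytes() are hypothetical simplifications that assume the descriptors were already translated to host pointers and that the host is little-endian. This is not the semu implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* PCM I/O message header and status, laid out as in the VirtIO sound
 * specification (fields are little-endian on the wire). */
struct virtio_snd_pcm_xfer {
    uint32_t stream_id;
};
struct virtio_snd_pcm_status {
    uint32_t status;
    uint32_t latency_bytes;
};

/* Hypothetical view of one descriptor after address translation. */
typedef struct {
    void *addr;
    uint32_t len;
    int device_writable; /* nonzero if the device may write to it */
} desc_t;

/* Validate a TX chain of the form [request][payload...][response] and
 * return the total payload size in bytes, or -1 if the chain is malformed
 * or the stream ID is out of range. On success *stream_id is filled in. */
static int64_t pcm_xfer_payload_bytes(const desc_t *chain,
                                      size_t n,
                                      uint32_t nstreams,
                                      uint32_t *stream_id)
{
    if (n < 3)
        return -1;
    const desc_t *req = &chain[0], *resp = &chain[n - 1];
    if (req->device_writable || req->len < sizeof(struct virtio_snd_pcm_xfer))
        return -1;
    if (!resp->device_writable ||
        resp->len < sizeof(struct virtio_snd_pcm_status))
        return -1;

    struct virtio_snd_pcm_xfer hdr;
    memcpy(&hdr, req->addr, sizeof(hdr)); /* assumes a little-endian host */
    if (hdr.stream_id >= nstreams)
        return -1; /* reject stream IDs the device never advertised */

    int64_t total = 0;
    for (size_t i = 1; i + 1 < n; i++)
        total += chain[i].len; /* middle descriptors carry PCM frames */
    *stream_id = hdr.stream_id;
    return total;
}
```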

Cuda-Chen avatar Dec 19 '24 02:12 Cuda-Chen

Commit 940a749dcb6aec9508f8a68728fab3e81c0a1fc2 brings a preliminary implementation of the VirtIO sound device. However, macOS support is absent. It might be better to switch to SDL_mixer. See https://github.com/sysprog21/rv32emu/pull/551

jserv avatar Feb 02 '25 02:02 jserv

How about using PortAudio?

jserv avatar Feb 04 '25 07:02 jserv

Hi @jserv ,

As I am going to take a one-week break, I would like to write down what I have learned from the implementation about choosing the sound backend:

  1. cross-platform compatibility (NOQA)
  2. A backend that supports synchronous writes to the buffer will be considered first, because the VirtIO sound device and driver are isochronous when transferring PCM frames.
  3. For asynchronous writes to the buffer, I prefer a backend that supports start/stop actions (which cnfa currently lacks).

I will test the backends you suggested once I am back.

Cuda-Chen avatar Feb 04 '25 13:02 Cuda-Chen

Hi @jserv ,

As I noted before, I will survey and test these two libraries (SDL-mixer and PortAudio). I break the tasks down as follows:

  • [x] test PortAudio
    • works like a charm
    • I will choose this as the cross-platform sound backend.
  • [ ] test SDL-mixer

Cuda-Chen avatar Feb 11 '25 12:02 Cuda-Chen

Hi @jserv ,

I found two inconsistencies in the pcm_release state with respect to section 5.14.6.6.5.1 of the VirtIO 1.3 standard:

  • The device MUST complete all pending I/O messages for the specified stream ID.
  • The device MUST NOT complete the control request while there are pending I/O messages for the specified stream ID.
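
The ordering implied by these two requirements can be sketched as below: every pending I/O message is completed first, and only then is the release request answered. The stream_t type, MAX_PENDING, and complete_io() are hypothetical stand-ins (complete_io() here only counts completions instead of touching a used ring), so this is an illustration of the ordering rather than the actual semu code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8 /* hypothetical upper bound on queued I/O messages */

/* Hypothetical per-stream state. */
typedef struct {
    uint16_t pending[MAX_PENDING]; /* descriptor heads of queued I/O messages */
    size_t npending;               /* how many entries of pending[] are valid */
    size_t completed;              /* I/O messages handed back to the driver */
    int released;                  /* set once the release request is answered */
} stream_t;

/* Stand-in for returning one I/O message to the driver; a real device
 * would place `head` on the used ring and raise an interrupt. */
static void complete_io(stream_t *s, uint16_t head)
{
    (void) head;
    s->completed++;
}

/* Handle a release request: flush all pending I/O messages first, then
 * complete the control request. */
static void pcm_release(stream_t *s)
{
    for (size_t i = 0; i < s->npending; i++)
        complete_io(s, s->pending[i]);
    s->npending = 0;

    /* The control request may only complete with nothing left pending. */
    assert(s->npending == 0);
    s->released = 1;
}
```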

Should we fix these issues in the commit that changes the sound backend, or in another PR? I expect fixing them will take more time.

Cuda-Chen avatar Feb 14 '25 06:02 Cuda-Chen

I find that there are two inconsistencies in pcm_release state according to the VirtIO 1.3 standard in section 5.14.6.6.5.1:

Clarify via GitHub issues: https://github.com/oasis-tcs/virtio-spec/issues .

jserv avatar Feb 14 '25 07:02 jserv

I find that there are two inconsistencies in pcm_release state according to the VirtIO 1.3 standard in section 5.14.6.6.5.1:

Clarify via GitHub issues: https://github.com/oasis-tcs/virtio-spec/issues .

Hi @jserv , They concern our implementation. What I am going to do is make the implementation comply with the spec (i.e., flush the I/O messages in the pcm_release state).

Cuda-Chen avatar Feb 14 '25 07:02 Cuda-Chen

Hi @jserv , Let me report the current progress.

I have just implemented the flush feature, and it does flush the I/O queue. I also observe that the sound device can play sound only once; if you try to play sound again, semu crashes in one of two situations:

  1. The device receives an invalid stream ID.
  2. The device hangs at pcm_stop state as the sound backend cannot release the lock.

I will try to work out a solution this week. If I still can't resolve these, I will add a limitation note and create a PR for merging.

Cuda-Chen avatar Feb 19 '25 04:02 Cuda-Chen

Hi @jserv , I am going to report current progress in this comment.

Findings

The device always fails at pcm_release state after the second round of playing with the following findings:

  1. The driver sends certain invalid stream_id values, such as the following, when writing PCM frames to the device (each shown between its bracketing powers of 2):
     • $2^{24} < 24,314,128 < 2^{25}$
     • $2^{27} < 143,198,513 < 2^{28}$
     • $2^{29} < 545,791,863 < 2^{30}$
     • $2^{31} < 4,293,132,252 < 2^{32}$
     • $2^{28} < 327,488,265 < 2^{29}$
  2. The virtq fails to flush with the following messages:
[   38.595499] virtio_snd virtio3: virtsnd-tx:id 0 is not a head!
[   39.611520] virtio_snd virtio3: SID 0: failed to flush I/O queue

Analysis

In this section, I trace the origin of the errors in Linux kernel v6.1.107 (commit 311d8503ef9fa25932825c5342de7213738aad8e).

virtio_snd virtio3: SID 0: failed to flush I/O queue

The error originates from virtsnd_pcm_sync_stop() in sound/virtio/virtio_pcm_ops.c. Let's take a look at a portion of the function:

rc = wait_event_interruptible_timeout(vss->msg_empty,
                          !virtsnd_pcm_msg_pending_num(vss),
                          js);
if (rc <= 0) {
        dev_warn(&snd->vdev->dev, "SID %u: failed to flush I/O queue\n",
             vss->sid);

        return !rc ? -ETIMEDOUT : rc; 
}

My rough guess is that we need to flush the virtq before the timeout passed to wait_event_interruptible_timeout() expires.

virtio_snd virtio3: virtsnd-tx:id 0 is not a head!

The error originates from virtqueue_get_buf_ctx_split() in drivers/virtio/virtio_ring.c. Let's take a look at a portion of the function:

if (unlikely(!vq->split.desc_state[i].data)) {
        BAD_RING(vq, "id %u is not a head!\n", i);
        return NULL;
}

Actions

  1. I will try upgrading the Linux kernel to check whether the problem is resolved.
  2. I will look for the root cause of the driver sending a weird stream_id after the second round of playing.

Cuda-Chen avatar May 04 '25 10:05 Cuda-Chen

Hi @jserv , I am going to report current progress in this comment.

Upgrade Linux Kernel to v6.7.12

In v6.7.12 (commit fe981e67568c41de6caae25d70b5f203b94452cc), an ack callback is introduced to fulfill the spec. After upgrading to this version, however, playback plays only one period of sound and then gets stuck with aplay: pcm_write:2178: write error: Input/output error. I guess that either the driver does not send any subsequent PCM frames or there is some kind of timer issue in the guest OS.

Cuda-Chen avatar May 07 '25 07:05 Cuda-Chen

Hi @jserv , I will report the progress of the past few days in this comment.

aplay: pcm_write:2178: write error: Input/output error

After checking, I confirm that something odd happens after ALSA (in the guest OS) receives one period of PCM frames via pcm_write() (one such oddity: aplay occasionally behaves normally, at random). I therefore changed the buffer size to more than four times the period size, and aplay now exits without any error, though parts of the sound are missing because of the buffer size setting.

I will wait for two days, and if you think nothing needs further investigation, I will continue working on PR #76.

Cuda-Chen avatar May 12 '25 02:05 Cuda-Chen