cpal
cpal copied to clipboard
ALSA: invalid handling of capture overruns
I'm working on an application using cpal
over ALSA to continuously capture audio stream on Raspberry Pi. I use both RPI and PC for testing. Note: I'm using a version from master branch, as I need both i16
and i32
sample formats, with i32
not being supported in stable version.
Unfortunately I'm experiencing odd behavior upon capture overruns (which I trigger setting buffer size too low).
- The most severe case is when
poll()
returnEPOLERR
which is not handled properly.
476794 1669022562.665530 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476794 1669022562.666440 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f6b10495f88) = 0
476794 1669022562.666532 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = 0
476794 1669022562.666577 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES, 0x7f6b10496760) = 0
476794 1669022562.666707 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476794 1669022562.669590 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476794 1669022562.669765 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
And then I no longer get any more callbacks from cpal
, but my application is eating up 100% of CPU core, as poll()
is executed in a tight loop.
The problem is here:
https://github.com/RustAudio/cpal/blob/7776c6634a8e21e2a585358545141fe091c38320/src/host/alsa/mod.rs#L746-L753
This match doesn't recognize POLLIN|POLLERR
and returns PollDescriptorsFlow::Continue
, causing the worker thread to poll()
again immediately, leading to a tight loop.
The problem seems obvious, I'll try to come with a pull request.
- Another scenario is when the second
ioctl
call fails.
476936 1669022662.246764 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476936 1669022662.247721 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f70e2810f88) = 0
476936 1669022662.247884 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = -1 EPIPE (Broken pipe)
476936 1669022662.248019 ioctl(5, SNDRV_PCM_IOCTL_PREPARE, 0) = 0
476936 1669022662.248112 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.448722 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.449074 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.649849 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.650342 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.851166 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
In this scenario I no longer get any data callbacks, instead I get error callback periodically (poll timeout).
I believe the problem is here:
https://github.com/RustAudio/cpal/blob/7776c6634a8e21e2a585358545141fe091c38320/src/host/alsa/mod.rs#L604-L609
The call to prepare()
passes OK, but the stream is stopped - according to ALSA, this call changes the state to SND_PCM_STATE_PREPARED
, there also need to be a call to start()
to transition to SND_PCM_STATE_RUNNING
and re-enable data flow.
I'm a bit confused here about what is the intended cpal
behavior here:
- should I assume that stream is broken when my error callback is called for the first time, drop the
Stream
and create a new one? In such case I believecpal
should do some cleanup (e.g. stop the polling loop). - should I assume that the error is temporary and the data flow will continue? In such case
start()
call is missing.
- The third scenario is when
ioctl(SNDRV_PCM_IOCTL_READI_FRAMES)
fails withEPIPE
:
476728 1669022521.635527 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476728 1669022521.636362 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f8b2368af88) = 0
476728 1669022521.636521 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = 0
476728 1669022521.636681 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES, 0x7f8b2368b760) = -1 EPIPE (Broken pipe)
476728 1669022521.636777 write(1, "Die! A backend-specific error has occurred: ALSA function 'snd_pcm_readi' failed with error 'EPIPE: Broken pipe'\n", 113) = 113
476728 1669022521.636823 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476728 1669022521.636860 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476728 1669022521.636892 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
Here, the error from process_input()
is properly delivered to my error callback, but then we enter the tight loop scenario, due to POLLERR
not being recognized.
Just a quick note that I ran in to this issue as-well. I was getting constant ALSA lib pcm.c:8545:(snd_pcm_recover) "overrun" occurred
errors. I changed the code to add the suggested:
PollDescriptorsFlow::XRun => {
if let Err(err) = stream.channel.prepare() {
error_callback(err.into());
}
if let Err(err) = stream.channel.start() {
error_callback(err.into());
}
continue;
}
And it seems to work fine now. I still get the overruns, but it gracefully continues. I am running on a raspberry PI zero w with the respeaker 2mic board attached.
I have moved away from CPAL altogether. My application only needs to work on Linux (Windows was a nice-to-have feature), and I only need a simple record capability (and I also use Respeaker!).
I decided to switch to raw alsa, which solved all my problems. I was initially afraid of the low-level interface, but it turned out surprisingly easy.