Sporadically reading sped-up HFP/SCO (oFono, mSBC) audio samples using BlueALSA 3.1.0
First of all, a huge thanks to everyone for your contributions to BlueALSA! 👏 And thank you in advance to anyone who takes the time to read this issue.
I've checked troubleshooting documentation and reviewed all current and past issues. I was unable to find anything that directly applies. I don't expect much explicit help with BlueALSA 3.1.0, but I was mainly hoping this issue may ring a bell for someone -- perhaps it's a known issue?
Problem
We are using BlueALSA 3.1.0 for a GNU/Linux-based HFP and A2DP headset using the Infineon CYW4373E chipset. In most cases, HFP audio (SCO/mSBC) works flawlessly. In some cases, audio samples read from the ALSA device are "sped up", as if the sample rate were 2-3x higher than normal.
On our embedded device, an update to the latest master branch would unfortunately not be a trivial matter, so I cannot be sure if the issue persists on the master branch.
Reproduction steps
We configure our Bluetooth chip and start BlueALSA as follows:
hciattach /dev/ttymxc0 bcm43xx 3000000 &
sleep 1
hciconfig hci0 up
# Configure wide-band audio for UART I/O
# see https://www.cypress.com/file/298741/download
hcitool -i hci0 cmd 0x3f 0x001 0x01 0x02 0x00
hcitool -i hci0 cmd 0x3f 0x01c 0x01 0x00 0x00 0x1 0x01
hciconfig hci0 piscan
hciconfig hci0 noencrypt
bluealsa -S -p hfp-ofono -p a2dp-sink &
This is our BlueALSA configuration file.
The following steps are performed during testing:
- connect to the phone (HFP+A2DP), in this case a OnePlus 8T
- initiate a call
- listen to the ringing tone via the ALSA device to see if it sounds "sped up"
- terminate the call
- disconnect the phone
- repeat
It often requires ~100 iterations of this test procedure to reproduce the issue.
Setup
GNU/Linux (Buildroot) Linux 5.4.81 BlueALSA 3.1.0 BlueZ 5.58 ALSA 1.2.4
BlueALSA has been configured as follows:
## ----------- ##
## confdefs.h. ##
## ----------- ##
/* confdefs.h */
#define PACKAGE_NAME "BlueALSA"
#define PACKAGE_TARNAME "bluez-alsa"
#define PACKAGE_VERSION "v3.1.0"
#define PACKAGE_STRING "BlueALSA v3.1.0"
#define PACKAGE_BUGREPORT "[email protected]"
#define PACKAGE_URL "https://github.com/Arkq/bluez-alsa"
#define PACKAGE "bluez-alsa"
#define VERSION "v3.1.0"
#define STDC_HEADERS 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_MEMORY_H 1
#define HAVE_STRINGS_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_STDINT_H 1
#define HAVE_UNISTD_H 1
#define __EXTENSIONS__ 1
#define _ALL_SOURCE 1
#define _GNU_SOURCE 1
#define _POSIX_PTHREAD_SEMANTICS 1
#define _TANDEM_SOURCE 1
#define HAVE_DLFCN_H 1
#define LT_OBJDIR ".libs/"
#define HAVE_LIBSEGFAULT 1
#define DEBUG 1
#define HAVE_EXECINFO_H 1
#define HAVE_STDATOMIC_H 1
#define HAVE_EVENTFD 1
#define HAVE_SPLICE 1
#define ENABLE_AAC 1
#define ENABLE_MSBC 1
#define ENABLE_OFONO 1
#define ENABLE_PAYLOADCHECK 1
#define ALSA_CONF_DIR "/etc/alsa/conf.d"
#define ALSA_PLUGIN_DIR "/usr/lib/alsa-lib"
Additional context
Here are two log files with connect/dial/hangup test runs:
- "good" call (audio quality as expected)
- "bad" call (audio sounds as if it is sped up 2-3x)
reading sped-up
The rate that audio is played is determined by the playback device, not the source (bluealsa) device.
Perhaps audio frames are being dropped within the bluealsa server or bluetooth driver which would cause the remaining frames to create a "too fast" sound; but then the playback device would also suffer regular underruns because it is consuming frames faster than they are being produced. There should also be warning messages in the bluealsa log because mSBC frames include a sequence number and so a missing frame is easy to detect. Your report does not mention playback underruns, and there are no missing frame warnings in the log you supplied. So at present this is a mystery.
@borine Thanks so much for taking a look at this.
The rate that audio is played is determined by the playback device, not the source (bluealsa) device.
Yes, true. To rule out the playback device itself, I also streamed raw samples from the input device into a file and could basically see that everything was compressed by a factor of 2-3 with respect to time. Playback in Audacity made it sound like a typical too-high sample rate. Our usual playback processing then fills the gaps with silence so that the result is essentially a high-pitched robot voice coming from our device.
...There should also be warning messages in the bluealsa log because mSBC frames include a sequence number and so a missing frame is easy to detect. Your report does not mention playback underruns, and there are no missing frame warnings in the log you supplied. So at present this is a mystery.
I also wondered the same thing, but there are no underruns being reported anywhere (not that I've seen, anyway). And unless I'm mistaken, -S should be giving me the most verbose log level.
Any ideas of additional things I could debug or places I could increase verbosity to get more clues?
streamed raw samples from the input device into a file and could basically see that everything was compressed by a factor of 2-3 with respect to time
I don't know what "everything was compressed" means. Can you explain what command line did you use to create the file, and what application did you use to "see" that "everything was compressed" ? Does the file contain approximately 16000 samples (ie 32000 bytes) per 1 second of recorded duration?
I think that either 50% of samples are being lost (that seems to me unlikely, but bluealsa would then write thousands of "missing frame" messages to syslog); or else the playback device is not playing the samples at the same rate as the capture device is producing them. I cannot imagine any other cause of the effect you are seeing.
I don't know what "everything was compressed" means. Can you explain what command line did you use to create the file, and what application did you use to "see" that "everything was compressed" ? Does the file contain approximately 16000 samples (ie 32000 bytes) per 1 second of recorded duration?
Sorry, I guess my wording was a little weird here. I meant "compressed" along the time axis, so I could basically see that the waveform had been shortened or squeezed together with regard to time (thus playback "sped up").
For testing, I was reading directly from the ALSA device in our C++ application (via snd_pcm_readi) and writing the contents to a file. 99 times out of 100, the raw audio data written to the file is exactly as expected. It sounds fine, and viewing the wave form in Audacity results in no surprises. But when the error occurs, it is obvious in Audacity that the waveform is "shorter"/"sped up", which playback confirms.
If it would help, I could try to dump the raw data to files again for the purposes of comparison.
the waveform had been shortened
unfortunately that does not help to diagnose the underlying problem. The bluealsa plugin does not create any waveform, it outputs a stream of PCM samples. The waveform you get from that depends on how the application interprets it. If the Bluetooth Profile is HFP using the mSBC codec, then the application needs to interpret the stream as a single channel of signed 16-bit samples to be played at 16000 samples per second. Now if the application configures the bluealsa pcm as something different from that, then ALSA will insert the plug plugin to convert from the codec format to whatever was configured by the application. So the actual samples in your "raw audio data" file will not necessarily be the same as the samples output by the bluealsa daemon, but they should approximate to the same waveform if interpreted correctly.
So, if the "waveform" you are seeing is not the one expected, then there may be several underlying problems:
- the application is configuring the bluealsa pcm differently to its output pcm (in this case, perhaps the input is configured as 1 channel, but the application generating the waveform treats it as if it is 2 channels; or maybe 16000 rate is interpreted as 32000 or 48000 rate. Either of these would make the waveform appear "compressed" in the way you describe.
- the bluetooth system is "losing" 50% of the samples, with the losses regularly spaced throughout the stream so that the speech is still recognizable but "speeded up"
- bluealsa is reporting the rate to be 16000 or the channels to be 2, but in reality the stream sent by the remote device is only 8000 or 1 channel; or equivalently the application is ignoring the values reported by bluealsa and assuming the wrong rate or channels.
As to why this is only happening in 1 run out of 100, the only suggestion I can make is that there is a race somewhere in the system. I do not remember any similar reports for BlueALSA 3.1.0 (although that was 4 years ago, so I may have forgotten), and there are no current issues that appear comparable.
You can see the bluetooth audio parameters using
bluealsa-cli info <PCM-PATH>
or
bluealsa-aplay -L
Next time you get the "compressed" output, run one of those commands while the stream is still running just to check that you really are receiving mSBC 16000Hz 1channel from the remote device and use whatever debugging is available in your application to see how it has configured its output device (and what conversions it is making internally, if any).