mumble
mumble copied to clipboard
Crash in Audio-Wizard
Describe the bug
Mumble crashes when opening Audio-Wizard and pressing "next".
Affects 1.4rc1 and later (tested on tagged 1.4rc1 and master
at the time of writing)
Steps to Reproduce Steps to reproduce the behavior:
- From the main menu: Configure → Audio Wizard
- Press next
- Application crashes
The issue can be reproduced with either ALSA or PulseAudio.
ALSA
<W>2021-11-10 11:35:14.960 AudioInput: Opus encoder set for high quality speech
<W>2021-11-10 11:35:14.960 AudioInput: 40000 bits/s, 48000 hz, 480 sample
<W>2021-11-10 11:35:14.960 ALSAAudioOutput: Initialized
<W>2021-11-10 11:35:14.960 ALSAAudioInput: Initing audiocapture default.
<W>2021-11-10 11:35:14.961 ALSAAudioInput: Actual buffer 48000 hz, 1 channel 3840 samples [480 per period]
<W>2021-11-10 11:35:14.961 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 0 channel 48000 hz echo
<W>2021-11-10 11:35:14.961 ALSAAudioInput: Releasing ALSA Mic.
<W>2021-11-10 11:35:14.962 ALSAAudioOutput: ALSA reports 32 output channels. Clamping to 2.
<W>2021-11-10 11:35:14.962 ALSAAudioOutput: Actual buffer 48000 hz, 2 channel 2881 samples [480 per period]
<W>2021-11-10 11:35:14.962 ALSAAudioOutput: Initializing 2 channel, 48000 hz mixer
<W>2021-11-10 11:35:14.962 AudioOutput: Initialized 2 channel 48000 hz mixer
<W>2021-11-10 11:35:14.963 ALSAAudioOutput: Destroyed
<W>2021-11-10 11:35:14.963 AudioInput: Opus encoder set for high quality speech
<W>2021-11-10 11:35:14.963 AudioInput: 40000 bits/s, 48000 hz, 480 sample
<W>2021-11-10 11:35:14.963 ALSAAudioOutput: Initialized
<W>2021-11-10 11:35:14.963 ALSAAudioInput: Initing audiocapture default.
<W>2021-11-10 11:35:14.963 ALSAAudioInput: Actual buffer 48000 hz, 1 channel 3840 samples [480 per period]
<W>2021-11-10 11:35:14.963 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 0 channel 48000 hz echo
<W>2021-11-10 11:35:14.964 ALSAAudioOutput: ALSA reports 32 output channels. Clamping to 2.
<W>2021-11-10 11:35:14.964 ALSAAudioOutput: Actual buffer 48000 hz, 2 channel 2881 samples [480 per period]
<W>2021-11-10 11:35:14.966 ALSAAudioOutput: Initializing 2 channel, 48000 hz mixer
<W>2021-11-10 11:35:14.966 AudioOutput: Initialized 2 channel 48000 hz mixer
<W>2021-11-10 11:35:14.974 AudioInput: Noise canceller disabled
warning: The VAD has been replaced by a hack pending a complete rewrite
Fatal (internal) error in /home/user/code/mumble/3rdparty/opus/celt/celt_decoder.c, line 118: assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)
Aborted (core dumped)
Pulse
<W>2021-11-10 11:34:40.925 AudioInput: Opus encoder set for high quality speech
<W>2021-11-10 11:34:40.925 AudioInput: 40000 bits/s, 48000 hz, 480 sample
<W>2021-11-10 11:34:40.925 PulseAudio: Starting input alsa_input.usb-Solid_State_Logic_SSL_2-00.analog-stereo
<W>2021-11-10 11:34:40.925 PulseAudio: Starting output: alsa_output.usb-Solid_State_Logic_SSL_2-00.analog-stereo
<W>2021-11-10 11:34:40.926 AudioOutput: Initialized 2 channel 48000 hz mixer
<W>2021-11-10 11:34:40.926 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 0 channel 48000 hz echo
<W>2021-11-10 11:34:40.934 Updating application palette
<W>2021-11-10 11:34:40.936 AudioInput: Noise canceller disabled
warning: The VAD has been replaced by a hack pending a complete rewrite
<W>2021-11-10 11:34:45.569 AudioInput: Opus encoder set for high quality speech
<W>2021-11-10 11:34:45.569 AudioInput: 40000 bits/s, 48000 hz, 480 sample
<W>2021-11-10 11:34:45.569 AudioInput: Opus encoder set for high quality speech
<W>2021-11-10 11:34:45.569 AudioInput: 40000 bits/s, 48000 hz, 480 sample
<W>2021-11-10 11:34:45.571 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 0 channel 48000 hz echo
<W>2021-11-10 11:34:45.576 AudioInput: Noise canceller disabled
warning: The VAD has been replaced by a hack pending a complete rewrite
<W>2021-11-10 11:34:46.577 AudioOutput: Initialized 2 channel 48000 hz mixer
Fatal (internal) error in /home/user/code/mumble/3rdparty/opus/celt/celt_decoder.c, line 118: assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)
Aborted (core dumped)
Expected behavior No crash.
Screenshots N/A
Desktop (please complete the following information):
- OS: Debian 11
- Version: 1.4rc1++
Additional context
-
There is only one audio device connected to the system. In-build audio was specifically disabled.
-
Build flags:
cmake -DRELEASE_ID=$(python "../scripts/mumble-version.py") -Dserver=OFF -Doverlay=OFF -Dcoreaudio=OFF -Ddbus=OFF -Dbundle-qt-translations=OFF -Dasio=OFF -Dgkey=OFF -Dg15=OFF -Djackaudio=OFF -Doss=OFF -Dpipewire=OFF -Dportaudio=OFF -Dsymbols=OFF -Dwasapi=OFF -Dxboxinput=OFF -Dxinput2=OFF -Dzeroconf=OFF -Dpulseaudio=ON -Dalsa=OFF -DCMAKE_BUILD_TYPE=Release
Working set of parameters for PA-only build:
cmake
-DCMAKE_BUILD_TYPE=Release
-DOPUS_STACK_PROTECTOR=ON
-DOPUS_X86_PRESUME_AVX=ON
-DOPUS_X86_PRESUME_SSE4_1=ON
-DRELEASE_ID=$(python "../scripts/mumble-version.py")
-Dalsa=OFF
-Dasio=OFF
-Dbundle-qt-translations=OFF
-Dcoreaudio=OFF
-Dcrash-report=OFF
-Ddbus=OFF
-Dg15=OFF
-Dgkey=OFF
-Djackaudio=OFF
-Doptimize=ON
-Doss=OFF
-Doverlay=OFF
-Dpipewire=OFF
-Dplugins=OFF
-Dportaudio=OFF
-Dpulseaudio=ON
-Dserver=OFF
-Dspeechd=OFF
-Dstatic=OFF
-Dsymbols=OFF
-Dtranslations=OFF
-Dupdate=OFF
-Dwasapi=OFF
-Dxboxinput=OFF
-Dxinput2=OFF
-Dzeroconf=OFF ..
Still trying to narrow-down the root cause of https://github.com/mumble-voip/mumble/issues/5300.
Why did you close this again? :thinking:
FWIW I can't reproduce this on my machine. However, I did not explicitly disable any audio devices (I don't know how that would work). With these compile options, Mumble was set to use PulseAudio (with no option to switch to anything else)
Why did you close this again? thinking
Cost/benefit. As noted in https://github.com/mumble-voip/mumble/issues/5302#issuecomment-964833310 the application works fine with ALSA removed from the equation.
I don't want to waste your time if this is something relevant solely to my combination of Linux distribution and/or hardware.
This happened with me too. Using -Dalsa=OFF
didn't help. The error points out this line in the opus repo.
Please don't close issues in haste when you got it working but the bug/problem has not been narrowed down.
Unrelated: Also, I can't see PipeWire
in Interface options even though the flag is ON by default.
@siddhpant do you happen to have a full backtrace that you could share with us?
@siddhpant do you happen to have a full backtrace that you could share with us? ~ @Krzmbrzl
Built using these commands:
$ cmake \
-DCMAKE_UNITY_BUILD=ON \
-Dcrash-report=OFF \
-Doptimize=ON \
-Dplugins=OFF \
-Dqtspeech=ON \
-Dtranslations=OFF \
-Dwasapi=OFF \
-Dxboxinput=OFF \
..
$ make
$ make install
Then started Mumble and ran the audio wizard, which expectedly crashes again on clicking Next
.
Terminal output:
<D>2021-11-13 14:53:37.265 libopus 1.3.1-97-g6b6035ae from /usr/local/lib/mumble/libopus.so.0
<W>2021-11-13 14:53:37.276 CELT bitstream 8000000b from /usr/local/lib/mumble/libcelt0.so.0.7.0
<D>2021-11-13 14:53:37.282 PulseAudio 15.0.0 from libpulse.so.0
ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:877:(find_matching_chmap) Found no matching channel map
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_oss.c:397:(_snd_pcm_oss_open) Cannot open device /dev/dsp
ALSA lib pcm_a52.c:835:(_snd_pcm_a52_open) a52 is only for playback
ALSA lib confmisc.c:160:(snd_config_get_card) Invalid field card
ALSA lib pcm_usb_stream.c:482:(_snd_pcm_usb_stream_open) Invalid card 'card'
ALSA lib confmisc.c:160:(snd_config_get_card) Invalid field card
ALSA lib pcm_usb_stream.c:482:(_snd_pcm_usb_stream_open) Invalid card 'card'
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
<D>2021-11-13 14:53:37.470 PortAudio V19.6.0-devel, revision 396fe4b6699ae929d3a685b3ef8a7e97396139a4 from libportaudio.so.2
<D>2021-11-13 14:53:37.471 JACK 1.9.19 from libjack.so.0
<W>2021-11-13 14:53:37.498 Theme: "Mumble"
<W>2021-11-13 14:53:37.498 Style: "Dark"
<W>2021-11-13 14:53:37.498 --> qss: ":themes/Default/Dark.qss"
<W>2021-11-13 14:53:37.499 Locale is "en_IN" (System: "en_IN")
<W>2021-11-13 14:53:37.499 Unable to find bundled translations for locale "en_IN"
<W>2021-11-13 14:53:37.677 Database SQLite: "3.36.0"
<W>2021-11-13 14:53:37.711 GlobalShortcutX: Using XI2 2.3
<W>2021-11-13 14:53:37.994 QMetaObject::connectSlotsByName: No matching signal for on_qtvUsers_customContextMenuRequested(QPoint,bool)
<W>2021-11-13 14:53:41.572 AudioInput: Opus encoder set for low delay
<W>2021-11-13 14:53:41.574 AudioInput: 192000 bits/s, 48000 hz, 480 sample
<W>2021-11-13 14:53:41.574 PulseAudio: Starting input alsa_input.pci-0000_00_1b.0.analog-stereo
<W>2021-11-13 14:53:41.574 PulseAudio: Starting echo: alsa_output.pci-0000_00_1b.0.analog-stereo.monitor
<W>2021-11-13 14:53:41.597 Public list disabled
<W>2021-11-13 14:53:41.636 Zeroconf: Third-party browser API reports error -65537
<W>2021-11-13 14:53:41.666 PulseAudio: Starting output: alsa_output.pci-0000_00_1b.0.analog-stereo
<W>2021-11-13 14:53:41.677 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 0 channel 48000 hz echo
<W>2021-11-13 14:53:41.682 AudioInput: Using RNNoise as noise canceller
warning: The VAD has been replaced by a hack pending a complete rewrite
<W>2021-11-13 14:53:41.695 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 1 channel 48000 hz echo
<W>2021-11-13 14:53:41.723 AudioOutput: Initialized 2 channel 48000 hz mixer
<W>2021-11-13 14:53:41.727 AudioInput: Using RNNoise as noise canceller
warning: The VAD has been replaced by a hack pending a complete rewrite
<W>2021-11-13 14:53:41.727 AudioInput: ECHO CANCELLER ACTIVE
<W>2021-11-13 14:53:42.100 Updating application palette
<W>2021-11-13 14:53:42.156 Updating application palette
<W>2021-11-13 14:53:45.421 AudioInput: Opus encoder set for low delay
<W>2021-11-13 14:53:45.421 AudioInput: 192000 bits/s, 48000 hz, 480 sample
<W>2021-11-13 14:53:45.426 AudioInput: Opus encoder set for low delay
<W>2021-11-13 14:53:45.426 AudioInput: 192000 bits/s, 48000 hz, 480 sample
<W>2021-11-13 14:53:45.427 AudioInput: Initialized mixer for 0 channel 48000 hz mic and 1 channel 48000 hz echo
<W>2021-11-13 14:53:45.427 AudioOutput: Initialized 2 channel 48000 hz mixer
<W>2021-11-13 14:53:45.431 AudioInput: Initialized mixer for 1 channel 48000 hz mic and 1 channel 48000 hz echo
<W>2021-11-13 14:53:45.454 AudioInput: Using RNNoise as noise canceller
warning: The VAD has been replaced by a hack pending a complete rewrite
<W>2021-11-13 14:53:45.454 AudioInput: ECHO CANCELLER ACTIVE
Fatal (internal) error in /tmp/mumble/3rdparty/opus/celt/celt_decoder.c, line 118: assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)
Aborted
System details:
$ uname -a
Linux peesee 5.14.0-kali2-amd64 #1 SMP Debian 5.14.9-2kali1 (2021-10-04) x86_64 GNU/Linux
$ pactl info | grep "^Server Name"
Server Name: PulseAudio (on PipeWire 0.3.39)
Surprisingly, when I build a deb package and then install, the client works and the Audio Wizard has ran successfully.
Commands used to build:
$ release_id=1.5.0~2021-11-11~ge54798cbd~snapshot
$ version=1.5.0.0
$ cmake \
-DBUILD_NUMBER=0 \
-DRELEASE_ID=$release_id \
-DCMAKE_UNITY_BUILD=ON \
-Dalsa=OFF \
-Dcrash-report=OFF \
-Doptimize=ON \
-Dpackaging=ON \
-Dplugins=OFF \
-Dqtspeech=ON \
-Dtranslations=OFF \
-Dwasapi=OFF \
-Dxboxinput=OFF \
..
$ sudo checkinstall \
--install=yes \
--pkgname=mumble \
--pkgversion=$version \
--pkgrelease=$release_id
I get the same crash and the same output on: OpenSUSE Tumbleweed Linux 5.14.14-2-default
I'm on the master branch, latest commit.
Can you try building the package in the way I mentioned above?
For the reference, here is my build recipe:
git clone https://github.com/mumble-voip/mumble/
git submodule update --init --recursive
git pull --all --recurse-submodules
cd mumble && mkdir build && cd build
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DOPUS_STACK_PROTECTOR=ON \
-DOPUS_X86_PRESUME_AVX=ON \
-DOPUS_X86_PRESUME_SSE4_1=ON \
-DRELEASE_ID=$(python "../scripts/mumble-version.py") \
-Dalsa=OFF \
-Dasio=OFF \
-Dbundle-qt-translations=OFF \
-Dcoreaudio=OFF \
-Dcrash-report=OFF \
-Ddbus=OFF \
-Dg15=OFF \
-Dgkey=OFF \
-Djackaudio=OFF \
-Doptimize=ON \
-Doss=OFF \
-Doverlay=OFF \
-Dpipewire=OFF \
-Dplugins=OFF \
-Dportaudio=OFF \
-Dpulseaudio=ON \
-Dserver=OFF \
-Dspeechd=OFF \
-Dstatic=OFF \
-Dsymbols=OFF \
-Dtranslations=OFF \
-Dupdate=OFF \
-Dwasapi=OFF \
-Dxboxinput=OFF \
-Dxinput2=OFF \
-Dzeroconf=OFF ..
cmake --build . -j 8
This is the error I usually get after a crash:
Fatal (internal) error in /home/powerjungle/Desktop/mumble/3rdparty/opus/celt/celt_decoder.c, line 118: assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)
I noticed this in the cmake logs.
-- The following features have been disabled:
* OPUS_CUSTOM_MODES, enable non-Opus modes, e.g. 44.1 kHz & 2^n frames.
So I decided to add this to my cmake command:
-DOPUS_CUSTOM_MODES=ON
and it stopped crashing for me at least for now.
I'm not sure what this mode is for, because I don't know how opus works, but this is where I'm at now.
I think it would be helpful to get an actual backfrace from a debugger like gdb.
@powerjungle @savchenko have you been connected to a server when using the audio wizard (and seeing the crash)?
This might be a race condition between regular Mumble receiving audio (using the Opus context object) and the audio wizard restarting the audio system (destroys the context object).
@powerjungle @savchenko have you been connected to a server when using the audio wizard (and seeing the crash)?
No even without being connected to a server, it just stops the moment audio has to be output to the speakers.
This might be a race condition between regular Mumble receiving audio (using the Opus context object) and the audio wizard restarting the audio system (destroys the context object).
I don't think it's a problem in the audio wizard only, it's every time audio is output anywhere.
Ah okay - too bad
https://github.com/mumble-voip/mumble/issues/5302#issuecomment-964833310
I disabled OPUS_X86_PRESUME_AVX, because my CPU doesn't support it, but it didn't help, same result.
I got a hint through IRC at Libera from the #opus channel that we might be dealing with memory corruption overwriting the mode. Although trying to run mumble using valgrind gives a segmentation fault. No idea why and how this can be fixed, I've tried running other apps, they don't fault with my version. I've tried different builds of Mumble as well running through it, same result. Runs fine on gdb, but I didn't get anywhere using it so far. When it crashes this is the backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff54b1323 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff5461366 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff544a7d3 in __GI_abort () at abort.c:79
#4 0x00007fffeaf3418e in celt_fatal
(str=0x7fffeaf6d800 "assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)", file=0x7fffeaf6d7b8 "/home/jungleman/Desktop/mumble/3rdparty/opus/celt/celt_decoder.c", line=118) at /home/jungleman/Desktop/mumble/3rdparty/opus/celt/arch.h:79
#5 0x00007fffeaf3f379 in validate_celt_decoder (st=0x7fffe40a1d10) at /home/jungleman/Desktop/mumble/3rdparty/opus/celt/celt_decoder.c:118
#6 0x00007fffeaf41a6a in celt_decode_with_ec
(st=0x7fffe40a1d10, data=0x7fffe4063c7b "\377\376", len=219, pcm=0x7fffe40acdd0, frame_size=960, dec=0x7fffebffc100, accum=0)
at /home/jungleman/Desktop/mumble/3rdparty/opus/celt/celt_decoder.c:873
#7 0x00007fffeaef99ef in opus_decode_frame (st=0x7fffe409fb20, data=0x7fffe4063c7b "\377\376", len=219, pcm=0x7fffe40acdd0, frame_size=960, decode_fec=0)
at /home/jungleman/Desktop/mumble/3rdparty/opus/src/opus_decoder.c:518
#8 0x00007fffeaefa7b5 in opus_decode_native
(st=0x7fffe409fb20, data=0x7fffe4063c7b "\377\376", len=442, pcm=0x7fffe40acdd0, frame_size=5760, decode_fec=0, self_delimited=0, packet_offset=0x0, soft_clip=0) at /home/jungleman/Desktop/mumble/3rdparty/opus/src/opus_decoder.c:721
#9 0x00007fffeaefaaf9 in opus_decode_float
(st=0x7fffe409fb20, data=0x7fffe4063c78 "\373B\001\377\376", len=442, pcm=0x7fffe40acdd0, frame_size=5760, decode_fec=0)
at /home/jungleman/Desktop/mumble/3rdparty/opus/src/opus_decoder.c:828
#10 0x00000000004fa6af in AudioOutputSpeech::prepareSampleBuffer(unsigned int) (this=0x7fffe4061b70, frameCount=570)
at /home/jungleman/Desktop/mumble/src/mumble/AudioOutputSpeech.cpp:370
#11 0x00000000004f2e97 in AudioOutput::mix(void*, unsigned int) (this=0x13e6b40, outbuff=0x7fffebffd650, frameCount=570)
at /home/jungleman/Desktop/mumble/src/mumble/AudioOutput.cpp:385
#12 0x0000000000718f49 in PulseAudioSystem::write_callback(pa_stream*, unsigned long, void*) (s=0x7fffe400c980, bytes=4560, userdata=0xd20000)
at /home/jungleman/Desktop/mumble/src/mumble/PulseAudio.cpp:641
#13 0x00007ffff017586e in pa_command_request (pd=<optimized out>, command=<optimized out>, tag=<optimized out>, t=0x7fffe40761d0, userdata=0xd82520)
at ../src/pulse/stream.c:847
#14 0x00007ffff010173c in pa_pdispatch_run
(pd=0x7fffe40513a0, packet=packet@entry=0x7fffe4021fd0, ancil_data=ancil_data@entry=0x7fffe4005f58, userdata=userdata@entry=0xd82520)
at ../src/pulsecore/pdispatch.c:349
#15 0x00007ffff015a7af in pstream_packet_callback (p=<optimized out>, packet=0x7fffe4021fd0, ancil_data=0x7fffe4005f58, userdata=0xd82520)
at ../src/pulse/context.c:364
#16 0x00007ffff01065d7 in do_read (p=p@entry=0x7fffe4005cc0, re=re@entry=0x7fffe4005e88) at ../src/pulsecore/pstream.c:1023
#17 0x00007ffff0107318 in do_pstream_read_write (p=0x7fffe4005cc0) at ../src/pulsecore/pstream.c:254
#18 0x00007ffff01076d6 in srb_callback (srb=0x7fffe4021da0, userdata=0x7fffe4005cc0) at ../src/pulsecore/pstream.c:296
#19 0x00007ffff010abaa in srbchannel_rwloop (sr=0x7fffe4021da0) at ../src/pulsecore/srbchannel.c:190
#20 0x00007ffff016e013 in dispatch_pollfds (m=0xd54d20) at ../src/pulse/mainloop.c:676
#21 pa_mainloop_dispatch (m=m@entry=0xd54d20) at ../src/pulse/mainloop.c:917
#22 0x00007ffff016e666 in pa_mainloop_iterate (m=m@entry=0xd54d20, block=block@entry=1, retval=retval@entry=0x0) at ../src/pulse/mainloop.c:948
#23 0x00007ffff016e700 in pa_mainloop_run (m=0xd54d20, retval=0x0) at ../src/pulse/mainloop.c:963
#24 0x00007ffff017e5a9 in thread (userdata=0xd58e40) at ../src/pulse/thread-mainloop.c:101
#25 0x00007ffff011a53f in internal_thread_func (userdata=0xd574a0) at ../src/pulsecore/thread-posix.c:81
#26 0x00007ffff54af427 in start_thread (arg=<optimized out>) at pthread_create.c:435
#27 0x00007ffff5538810 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
I'm not familiar with any of this, but is it possible we need to build opus/celt with whatever defines the "CUSTOM_MODES" thing?
Someone linked to the opus/celt source location, but here it is for reference:
#if defined(ENABLE_HARDENING) || defined(ENABLE_ASSERTIONS)
/* Make basic checks on the CELT state to ensure we don't end
up writing all over memory. */
void validate_celt_decoder(CELTDecoder *st)
{
#ifndef CUSTOM_MODES
celt_assert(st->mode == opus_custom_mode_create(48000, 960, NULL));
celt_assert(st->overlap == 120);
celt_assert(st->end <= 21);
#else
/* From Section 4.3 in the spec: "The normal CELT layer uses 21 of those bands,
though Opus Custom (see Section 6.2) may use a different number of bands"
Check if it's within the maximum number of Bark frequency bands instead */
celt_assert(st->end <= 25);
#endif
celt_assert(st->channels == 1 || st->channels == 2);
celt_assert(st->stream_channels == 1 || st->stream_channels == 2);
celt_assert(st->downsample > 0);
celt_assert(st->start == 0 || st->start == 17);
celt_assert(st->start < st->end);
The main point here is that it shouldn't ever need these custom modes as we actually don't use them. My suspicion is that we are seeing the effect of some sort of memory/data corruption...
After a discussion with @Krzmbrzl and @davidebeatrici we came to the following conclusions:
- The crash is triggered when another user talks, not by sound simply playing.
- This issue is most likely caused by a long-standing, dormant bug that was unearthed by commit a4fdc46a80d7da23ba02fa08c3361503b5cdc33c, which replaces a custom Opus repo with the upstream one.
- It is not clear what the bug is as bisecting Opus is blocked by CMake support being added to Opus after the bug was introduced. It may be possible to backport the CMake patches and bisect past when they were added.
- Mumble works fine on Ubuntu 20.04, but fails on Debian Testing and Arch. All of these have Opus 1.3.1 as a system library.
- A working build can be produced despite encountering this bug by passing
-Dbundled-opus=OFF
to CMake, avoiding building Opus and instead using the system library. Alternatively, commit a4fdc46a80d7da23ba02fa08c3361503b5cdc33c can be reverted (git revert a4fdc46a
) before building. Take care to clean the build directory before trying this though as Opus.so
names differ and may cause issues.
Valgrind output at the time of crash:
==7576== Thread 33:
==7576== Invalid read of size 16
==7576== at 0x154575DC: memcpy (string_fortified.h:29)
==7576== by 0x154575DC: silk_decode_core (3rdparty/opus/silk/decode_core.c:235)
==7576== by 0x1545775D: silk_decode_frame (3rdparty/opus/silk/decode_frame.c:85)
==7576== by 0x154592E5: silk_Decode (3rdparty/opus/silk/dec_API.c:300)
==7576== by 0x154410A1: opus_decode_frame (3rdparty/opus/src/opus_decoder.c:402)
==7576== by 0x15440871: opus_decode_native (3rdparty/opus/src/opus_decoder.c:721)
==7576== by 0x154427A4: opus_decode_float (3rdparty/opus/src/opus_decoder.c:828)
==7576== by 0x4DD68A: AudioOutputSpeech::prepareSampleBuffer(unsigned int) (src/mumble/AudioOutputSpeech.cpp:370)
==7576== by 0x4D7F33: AudioOutput::mix(void*, unsigned int) (src/mumble/AudioOutput.cpp:385)
==7576== by 0x68857C: PipeWireOutput::processCallback(void*) (src/mumble/PipeWire.cpp:410)
==7576== by 0x13E8666A: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== by 0x30C58AA6: ??? (in /usr/lib/x86_64-linux-gnu/spa-0.2/audioconvert/libspa-audioconvert.so)
==7576== by 0x13E63580: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== Address 0x31cf3d20 is on thread 33's stack
==7576== 3904 bytes below stack pointer
==7576==
==7576== Invalid read of size 16
==7576== at 0x154575E0: memcpy (string_fortified.h:29)
==7576== by 0x154575E0: silk_decode_core (3rdparty/opus/silk/decode_core.c:235)
==7576== by 0x1545775D: silk_decode_frame (3rdparty/opus/silk/decode_frame.c:85)
==7576== by 0x154592E5: silk_Decode (3rdparty/opus/silk/dec_API.c:300)
==7576== by 0x154410A1: opus_decode_frame (3rdparty/opus/src/opus_decoder.c:402)
==7576== by 0x15440871: opus_decode_native (3rdparty/opus/src/opus_decoder.c:721)
==7576== by 0x154427A4: opus_decode_float (3rdparty/opus/src/opus_decoder.c:828)
==7576== by 0x4DD68A: AudioOutputSpeech::prepareSampleBuffer(unsigned int) (src/mumble/AudioOutputSpeech.cpp:370)
==7576== by 0x4D7F33: AudioOutput::mix(void*, unsigned int) (src/mumble/AudioOutput.cpp:385)
==7576== by 0x68857C: PipeWireOutput::processCallback(void*) (src/mumble/PipeWire.cpp:410)
==7576== by 0x13E8666A: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== by 0x30C58AA6: ??? (in /usr/lib/x86_64-linux-gnu/spa-0.2/audioconvert/libspa-audioconvert.so)
==7576== by 0x13E63580: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== Address 0x31cf3d30 is on thread 33's stack
==7576== 3888 bytes below stack pointer
==7576==
==7576== Invalid read of size 16
==7576== at 0x154575E5: memcpy (string_fortified.h:29)
==7576== by 0x154575E5: silk_decode_core (3rdparty/opus/silk/decode_core.c:235)
==7576== by 0x1545775D: silk_decode_frame (3rdparty/opus/silk/decode_frame.c:85)
==7576== by 0x154592E5: silk_Decode (3rdparty/opus/silk/dec_API.c:300)
==7576== by 0x154410A1: opus_decode_frame (3rdparty/opus/src/opus_decoder.c:402)
==7576== by 0x15440871: opus_decode_native (3rdparty/opus/src/opus_decoder.c:721)
==7576== by 0x154427A4: opus_decode_float (3rdparty/opus/src/opus_decoder.c:828)
==7576== by 0x4DD68A: AudioOutputSpeech::prepareSampleBuffer(unsigned int) (src/mumble/AudioOutputSpeech.cpp:370)
==7576== by 0x4D7F33: AudioOutput::mix(void*, unsigned int) (src/mumble/AudioOutput.cpp:385)
==7576== by 0x68857C: PipeWireOutput::processCallback(void*) (src/mumble/PipeWire.cpp:410)
==7576== by 0x13E8666A: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== by 0x30C58AA6: ??? (in /usr/lib/x86_64-linux-gnu/spa-0.2/audioconvert/libspa-audioconvert.so)
==7576== by 0x13E63580: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== Address 0x31cf3d40 is on thread 33's stack
==7576== 3872 bytes below stack pointer
==7576==
==7576== Invalid read of size 16
==7576== at 0x154575EA: memcpy (string_fortified.h:29)
==7576== by 0x154575EA: silk_decode_core (3rdparty/opus/silk/decode_core.c:235)
==7576== by 0x1545775D: silk_decode_frame (3rdparty/opus/silk/decode_frame.c:85)
==7576== by 0x154592E5: silk_Decode (3rdparty/opus/silk/dec_API.c:300)
==7576== by 0x154410A1: opus_decode_frame (3rdparty/opus/src/opus_decoder.c:402)
==7576== by 0x15440871: opus_decode_native (3rdparty/opus/src/opus_decoder.c:721)
==7576== by 0x154427A4: opus_decode_float (3rdparty/opus/src/opus_decoder.c:828)
==7576== by 0x4DD68A: AudioOutputSpeech::prepareSampleBuffer(unsigned int) (src/mumble/AudioOutputSpeech.cpp:370)
==7576== by 0x4D7F33: AudioOutput::mix(void*, unsigned int) (src/mumble/AudioOutput.cpp:385)
==7576== by 0x68857C: PipeWireOutput::processCallback(void*) (src/mumble/PipeWire.cpp:410)
==7576== by 0x13E8666A: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== by 0x30C58AA6: ??? (in /usr/lib/x86_64-linux-gnu/spa-0.2/audioconvert/libspa-audioconvert.so)
==7576== by 0x13E63580: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576== Address 0x31cf3d50 is on thread 33's stack
==7576== 3856 bytes below stack pointer
==7576==
Fatal (internal) error in 3rdparty/opus/celt/celt_decoder.c, line 118: assertion failed: st->mode == opus_custom_mode_create(48000, 960, NULL)
==7576==
==7576== Process terminating with default action of signal 6 (SIGABRT)
==7576== at 0x7181891: raise (raise.c:50)
==7576== by 0x716B535: abort (abort.c:79)
==7576== by 0x15474C1B: celt_fatal (3rdparty/opus/celt/arch.h:79)
==7576== by 0x1547E156: validate_celt_decoder (3rdparty/opus/celt/celt_decoder.c:118)
==7576== by 0x1547E481: celt_decode_with_ec (3rdparty/opus/celt/celt_decoder.c:873)
==7576== by 0x15441644: opus_decode_frame (3rdparty/opus/src/opus_decoder.c:518)
==7576== by 0x15440871: opus_decode_native (3rdparty/opus/src/opus_decoder.c:721)
==7576== by 0x154427A4: opus_decode_float (3rdparty/opus/src/opus_decoder.c:828)
==7576== by 0x4DD68A: AudioOutputSpeech::prepareSampleBuffer(unsigned int) (src/mumble/AudioOutputSpeech.cpp:370)
==7576== by 0x4D7F33: AudioOutput::mix(void*, unsigned int) (src/mumble/AudioOutput.cpp:385)
==7576== by 0x68857C: PipeWireOutput::processCallback(void*) (src/mumble/PipeWire.cpp:410)
==7576== by 0x13E8666A: ??? (in /usr/lib/x86_64-linux-gnu/libpipewire-0.3.so.0.345.0)
==7576==
==7576== HEAP SUMMARY:
==7576== in use at exit: 25,283,167 bytes in 132,219 blocks
==7576== total heap usage: 884,069 allocs, 751,850 frees, 310,362,964 bytes allocated
- Still fails with my Debian Stable system if Opus is included. How do we get Opus to fix their stuff (assuming we're not sending them invalid memory they eventually memcpy)? :)
- Building with opus disabled works.
I'm currently using this cmake line:
cmake .. -Doverlay-xcompile=OFF -Dbundled-opus=OFF
Valgrind output at the time of crash:
==7576== Thread 33: ==7576== Invalid read of size 16 ==7576== at 0x154575DC: memcpy (string_fortified.h:29) ==7576== by 0x154575DC: silk_decode_core (3rdparty/opus/silk/decode_core.c:235)
...
How do we get Opus to fix their stuff (assuming we're not sending them invalid memory they eventually memcpy)? :)
Well we'd have to figure out the exact cause of the issue. I still think it might be possible that IR is Mumble that somehow is causing a memory corruption :thinking:
Looking at the commits that were made to upstream Opus since we updated our submodule to point to their repo, the following commits catch my eyes:
- https://github.com/xiph/opus/commit/ec64b3c5b7abd621dfddee6b4cc115298e5d6803
- https://github.com/xiph/opus/commit/12a356e431d1b2d3531d3d73de330bf9ee9be48b
- https://github.com/xiph/opus/commit/ccb42e05cc6eb98a15874b9695361025b3ee17ab
If we're lucky, it is enough to update our Opus submodule to avoid these issues :eyes:
Could some of those who could reproduce the crash with the bundled Opus version please check out #5879 (and execute git submodule update --init --recursive
before building - which should note that it updated the Opus module) to see if incorporating these fixes is enough to fix the issue at hand?
@Krzmbrzl Sorry for the late response! I built your tree using the branch "update-opus". Ran mumble, and tested the audio wizard using ALSA and PulseAudio. Everything seems to work fine. No errors, no crashes so far.
Thanks :+1: In the meantime the "update" is actually a downgrade as Davide has already proven that the upgrade didn't fix the issue. It's good to hear that the downgrade fixes it, though.
Yeah, glad to know the root cause (not found yet) appears to be the same for everyone.
Hm.... I had slightly modified the now merged PR thinking that the code was still the same. However, now I can reliably reproduce this crash on the master branch. Could someone please re-test whether this issue exists on master
(potentially after having updated the submodule) or not?
So as it turns out that downgrade didn't fix this issue after all...