src/devices/sound/discrete.cpp bad address and threading issues
MAME version
0.255 mame0255-62-g9ff8931fa-50-dirty
System information
odroid n2l 6 core arm ubuntu 22.04 kern 5.15 en with osd sdl alsa via hdmi audio out
INI configuration details
all defaults except
audiodriver alsa
samplerate 48000
samples 1
waitvsync 1
syncrefresh 1
Emulated system/software
galaxian,mooncrst
Incorrect behaviour
Few moments into emulation, negative samples exception is thrown. Turning on debug, isnan exceptions also thrown. I suspect this is a threading/lock issue and use of invalid pointers (I am backed up by two fixme warnings re ptr math done before input list is actually initialized). I am trying to fix it, but I am not an expert (yet). (For example, I've forced number of samples to always have correct pointers, but it has side effects.)
Expected behaviour
No pointer math should result in invalid (out of bounds) access resulting in a segfault or underflow. The glitch is "audible" on x86 linux as a chirp on emu launch. Unless someone beats me to it, I will fix this.
Steps to reproduce
- launch mame (debug is good) with mooncrst on arm arch64 kern 5.15 smp preemp
- hear the first chirp after rom check (chirp is due to uninit data - infrequently isnan exception thrown)
- wait (usually within 5 minutes) and samples negative is thrown
Additional details
No response
So far, I am having fun replicating the issue. I've rebuilt a tiny version without optimization and full symbolics. This build does not crash - at least it ran 3 days without crashing. Its hard to debug when something doesn't crash. I'm going to test by rebuilding with optimization disabled (via pragma) to see if it makes a difference. I've had the crash with O2 or size. Haven't tried O1...
I found the bug. The pointers to the buffers were going invalid thanks to optimization. I fixed the bug. Anything const dynamic should be validated before use. This only appears to be an issue on arm. I am trying my fix on x86 now (to verify nothing breaks).
Issues needs to be closed after they are fixed in upstream, there's also an option that does it automatically once a PR gets merged under "Development" right pane.
This is not an osd sdl issue. The code is not thread safe "enough". https://github.com/mamedev/mame/blame/0636e54bb60192652efeeeb7732103b9134a31eb/src/devices/sound/discrete.cpp#L246
Sending a patch.
Patch sent, explanation of why I treat process as "critical section" https://www.octavianit.com/2024/02/fixing-galaxian-sound-on-arm-8-core.html
Patch sent, explanation of why I treat process as "critical section" https://www.octavianit.com/2024/02/fixing-galaxian-sound-on-arm-8-core.html
Consider the code where discrete_task::process is called from discrete_task::callback:
/* try to lock */
if (task->lock_threadid(threadid))
{
if (!task->process())
return nullptr;
task->unlock();
}
It locks the task to the current thread before it calls process (i.e. discrete_task::process is already treated as a critical section). If that’s working, process can only run on a single thread at a time for any given task. If that isn’t working, there’s an issue with discrete_task::lock_threadid that needs to be fixed properly.
Since it’s already guarding against running discrete_task::process on multiple threads concurrently for the same task, the mutex is redundant and just adds overhead.
Not sure why you reopened it; acquire instead of release appears to have fixed the issue. But thanks for your insight.
@seleuco tested it after #12034 was merged and can no longer reproduce the issue.
Not sure why you reopened it; acquire instead of release appears to have fixed the issue. But thanks for your insight.
It was automatically closed by GitHub because the issue was mentioned in the pull request description. There's no way to stop it from doing this.
I reopened it until and more testing was completed (by @balr0g and @seleuco).