Random crash on macOS (arm64)

Open raysan5 opened this issue 1 year ago • 15 comments

It seems there could be some issue with miniaudio (or maybe CoreAudio) on macOS Ventura (arm64).

It's a random issue that crashes the program.

Here is the crash trace:

Process 5979 launched: '/Users/oskar/Desktop/rfxgen.app/Contents/MacOS/rfxgen' (arm64)
2023-09-19 20:14:21.663618+0200 rfxgen[5979:514122] [Window] Warning: Window GLFWWindow 0x1004058d0 ordered front from a non-active application and may order beneath the active application's windows.
2023-09-19 20:14:21.665867+0200 rfxgen[5979:514122] [Window] Warning: Window GLFWWindow 0x1004058d0 ordered front from a non-active application and may order beneath the active application's windows.
2023-09-19 20:14:21.701028+0200 rfxgen[5979:514122] [plugin] AddInstanceForFactory: No factory registered for id <CFUUID 0x600000204dc0> F8BB1C28-BAE8-11D6-9C31-00039315CD46
Process 5979 stopped
* thread #8, name = 'com.apple.audio.IOThread.client', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001000a3688 rfxgen`ma_linear_resampler_process_pcm_frames_f32_upsample + 352
rfxgen`ma_linear_resampler_process_pcm_frames_f32_upsample:
->  0x1000a3688 <+352>: ldr    s2, [x12, x9]
    0x1000a368c <+356>: ldr    s3, [x22, x9]
    0x1000a3690 <+360>: fmul   s3, s1, s3
    0x1000a3694 <+364>: fmul   s2, s0, s2
(lldb) bt
* thread #8, name = 'com.apple.audio.IOThread.client', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001000a3688 rfxgen`ma_linear_resampler_process_pcm_frames_f32_upsample + 352
    frame #1: 0x0000000100075fa4 rfxgen`ma_linear_resampler_process_pcm_frames_f32 + 40
    frame #2: 0x0000000100075f38 rfxgen`ma_linear_resampler_process_pcm_frames + 56
    frame #3: 0x00000001000a390c rfxgen`ma_resampling_backend_process__linear + 32
    frame #4: 0x00000001000764a8 rfxgen`ma_resampler_process_pcm_frames + 80
    frame #5: 0x0000000100078924 rfxgen`ma_data_converter_process_pcm_frames__resample_only + 48
    frame #6: 0x0000000100078420 rfxgen`ma_data_converter_process_pcm_frames + 100
    frame #7: 0x00000001000be34c rfxgen`ReadAudioBufferFramesInMixingFormat + 224
    frame #8: 0x0000000100099f00 rfxgen`OnSendAudioDataToDevice + 224
    frame #9: 0x00000001000a2958 rfxgen`ma_device__on_data_inner + 104
    frame #10: 0x00000001000a28a8 rfxgen`ma_device__on_data + 500
    frame #11: 0x00000001000a261c rfxgen`ma_device__handle_data_callback + 280
    frame #12: 0x000000010006b030 rfxgen`ma_device__read_frames_from_client + 116
    frame #13: 0x000000010006aac0 rfxgen`ma_device_handle_backend_data_callback + 192
    frame #14: 0x00000001000a0390 rfxgen`ma_on_output__coreaudio + 200
    frame #15: 0x000000013480a59c
    frame #16: 0x00000001349065ac
    frame #17: 0x000000013480e89c
    frame #18: 0x00000001a6c8294c
    frame #19: 0x00000001a6c8088c
    frame #20: 0x00000001a6de3574
    frame #21: 0x00000001a49a7fa8
(lldb)

@mackron any idea about this issue?

raysan5 avatar Sep 19 '23 19:09 raysan5

This is a strange one. I'm not entirely sure what's going on here. So you've not had any reports from your normal raylib users about this? And it's not happening on other platforms? And the crash is seemingly random?

The error is EXC_BAD_ACCESS, which indicates some kind of erroneous memory access, but from what I can see in the raylib code and this call stack, the input and output buffers in question are allocated on the stack by raylib, which suggests to me that it's unlikely to be a backend-related issue. But then it seems very strange that your users wouldn't have reported this a long time ago. I'm at a bit of a loss on that.

At the time of the crash there's a couple of variables called inputFramesProcessedThisIteration and outputFramesProcessedThisIteration (both in ReadAudioBufferFramesInMixingFormat()). Is it by chance possible to get the values of these variables at the time of the crash? That'll tell us whether or not it's something simple like trying to read beyond the input and/or output buffers.

One thing that popped into my head is maybe there's some kind of data alignment error that arm64 doesn't like? I don't have a huge amount of experience with it, so I'm not sure if that idea is just me being stupid.

mackron avatar Sep 20 '23 01:09 mackron

(original crash reporter here)

I'm not that familiar with LLDB but wouldn't BAD_ACCESS with address=0x0 imply a simple NULL dereference?

It is not easily reproducible but I can try again from a source build and see.

oskarnp avatar Sep 20 '23 08:09 oskarnp

One thing that popped into my head is maybe there's some kind of data alignment error that arm64 doesn't like? I don't have a huge amount of experience with it, so I'm not sure if that idea is just me being stupid.

Data alignment errors can certainly be an issue on some ARM platforms, but I don't think Apple M1 running macOS is one of them. Modern ARM cores handle unaligned access just fine; it depends more on what the kernel is configured to do. However, in C or C++ it is undefined behaviour to access unaligned memory unless you tell the compiler about it somehow, e.g. with __attribute__((packed)).

But anyway. I'll look into this crash some more when I have some time.

oskarnp avatar Sep 20 '23 08:09 oskarnp

@oskarnp In case you try to compile it, please note that a VS2022 project is provided in the projects directory, as well as CMake (although I have not tested it). There is also a plain Makefile in the src directory. You should call make with:

make PLATFORM=PLATFORM_DESKTOP PROJECT_NAME=rfxgen PROJECT_SOURCE_FILES=rfxgen.c -B

raysan5 avatar Sep 20 '23 09:09 raysan5

I did a quick test using CMake and generated an Xcode project. I got what is maybe a different crash this time, and here it is definitely a NULL pointer:

[Image: Screenshot 2023-09-20 at 12 26 11]

But the question, of course, is why that entire struct is zero.

Here are the values asked for at the time of this crash:

inputFramesToProcessThisIteration	480
outputFramesProcessedThisIteration	480

Some more values if that helps (pResampler->lpf):

[Image: Screenshot 2023-09-20 at 12 38 05]

oskarnp avatar Sep 20 '23 10:09 oskarnp

Thanks. This looks suspicious - the ma_lpf1 object should certainly not be all zero. Also, I can see with your break point that the local variables, which are declared prior to the loop, contain valid values. It almost looks like at some point the ma_lpf1 object has been either corrupted or uninitialized somehow.

Is it possible to go up one level and show us the state of the ma_linear_resampler object at the time of the crash?

@raysan5 Does rfxgen do a full device init/reinit cycle each time you play a new sound, or does it just keep the audio device running for the life of the program?

mackron avatar Sep 20 '23 21:09 mackron

Does rfxgen do a full device init/reinit cycle each time you play a new sound, or does it just keep the audio device running for the life of the program?

No, it calls InitAudioDevice() at init and CloseAudioDevice() at closing. It regenerates the sound when a button is pressed or a slider is modified (https://github.com/raysan5/rfxgen/blob/master/src/rfxgen.c#L583).

Here is the implementation for sound loading:

// Load sound from wave data
// NOTE: Wave data must be unallocated manually
Sound LoadSoundFromWave(Wave wave)
{
    Sound sound = { 0 };

    if (wave.data != NULL)
    {
        // When using miniaudio we need to do our own mixing.
        // To simplify this we need convert the format of each sound to be consistent with
        // the format used to open the playback AUDIO.System.device. We can do this two ways:
        //
        //   1) Convert the whole sound in one go at load time (here).
        //   2) Convert the audio data in chunks at mixing time.
        //
        // First option has been selected, format conversion is done on the loading stage.
        // The downside is that it uses more memory if the original sound is u8 or s16.
        ma_format formatIn = ((wave.sampleSize == 8)? ma_format_u8 : ((wave.sampleSize == 16)? ma_format_s16 : ma_format_f32));
        ma_uint32 frameCountIn = wave.frameCount;

        ma_uint32 frameCount = (ma_uint32)ma_convert_frames(NULL, 0, AUDIO_DEVICE_FORMAT, AUDIO_DEVICE_CHANNELS, AUDIO.System.device.sampleRate, NULL, frameCountIn, formatIn, wave.channels, wave.sampleRate);
        if (frameCount == 0) TRACELOG(LOG_WARNING, "SOUND: Failed to get frame count for format conversion");

        AudioBuffer *audioBuffer = LoadAudioBuffer(AUDIO_DEVICE_FORMAT, AUDIO_DEVICE_CHANNELS, AUDIO.System.device.sampleRate, frameCount, AUDIO_BUFFER_USAGE_STATIC);
        if (audioBuffer == NULL)
        {
            TRACELOG(LOG_WARNING, "SOUND: Failed to create buffer");
            return sound; // early return to avoid dereferencing the audioBuffer null pointer
        }

        frameCount = (ma_uint32)ma_convert_frames(audioBuffer->data, frameCount, AUDIO_DEVICE_FORMAT, AUDIO_DEVICE_CHANNELS, AUDIO.System.device.sampleRate, wave.data, frameCountIn, formatIn, wave.channels, wave.sampleRate);
        if (frameCount == 0) TRACELOG(LOG_WARNING, "SOUND: Failed format conversion");

        sound.frameCount = frameCount;
        sound.stream.sampleRate = AUDIO.System.device.sampleRate;
        sound.stream.sampleSize = 32;
        sound.stream.channels = AUDIO_DEVICE_CHANNELS;
        sound.stream.buffer = audioBuffer;
    }

    return sound;
}

raysan5 avatar Sep 20 '23 21:09 raysan5

OK, thanks. I was just vaguely wondering if maybe there was some kind of syncing error where the device was being pulled out from under itself and corrupting something. I'm at a bit of a loss with this one so far.

mackron avatar Sep 20 '23 21:09 mackron

@mackron Not sure if it is related, but a change was recently made to the mutex handling at initialization...

Also note that raylib is not using the latest miniaudio; it's using v0.11.16.

raysan5 avatar Sep 20 '23 23:09 raysan5

OK it might actually be worth updating that. This code is related to the resampler, and I did make this change in 0.11.18:

* Fix erroneous output with the linear resampler when in/out rates are the same.

And in the breakpoint screenshot above, I can indeed see that the sample rate is 1, which means the in/out rate is the same in this particular case. So maybe worth updating to 0.11.18 just to eliminate that as a possibility? It should be API compatible.

mackron avatar Sep 20 '23 23:09 mackron

I sat down to try to reproduce it again now but got this instead:

rfxgen(17153,0x1ffa2a080) malloc: Heap corruption detected, free list is damaged at 0x600002c0adc0
*** Incorrect guard value: 4462942231035088464
rfxgen(17153,0x1ffa2a080) malloc: *** set a breakpoint in malloc_error_break to debug
rfxgen(17153,0x1ffa2a080) malloc: Heap corruption detected, free list is damaged at 0x600002c0adc0
*** Incorrect guard value: 4462942231035088464

So yeah... corruption.

It should be relatively easy to find with address sanitizer, but I think at that point it is not really macOS or ARM specific anymore.

oskarnp avatar Sep 21 '23 04:09 oskarnp

Is it possible for you to update miniaudio.h to the version currently in its master branch and see if that changes anything? It should just be a matter of dropping it in and recompiling. That's in raylib - I'm not sure how the rfxgen build system is set up. https://github.com/mackron/miniaudio/blob/master/miniaudio.h

mackron avatar Sep 21 '23 21:09 mackron

@mackron The rFXGen Continuous Integration system automatically syncs and compiles the latest raylib from the master branch.

Just updated raylib miniaudio to v0.11.18.

raysan5 avatar Sep 21 '23 21:09 raysan5

Unfortunately I got the same malloc heap corruption crash with the latest version. But that could be caused by anything, not necessarily miniaudio.

NOTE: If anyone uses the CMake build system, keep in mind that it automatically downloads raylib as part of the build. To have it use a different version of raylib, "FindRaylib.cmake" needs to be edited:

diff --git a/projects/CMake/cmake/FindRaylib.cmake b/projects/CMake/cmake/FindRaylib.cmake
index efe290b..98f0c04 100644
--- a/projects/CMake/cmake/FindRaylib.cmake
+++ b/projects/CMake/cmake/FindRaylib.cmake
@@ -1,10 +1,10 @@
-find_package(raylib 4.2.0 QUIET CONFIG)
+find_package(raylib 4.5.0 QUIET CONFIG)
 if (NOT raylib_FOUND)
     include(FetchContent)
     FetchContent_Declare(
         raylib
         GIT_REPOSITORY https://github.com/raysan5/raylib.git
-        GIT_TAG cb085a1b50324315ec77f134be3447107c52cf2d
+        GIT_TAG 477f5e5436e57f4dc4cadd4b8a2c9d78ff3d0c0e
     )
     FetchContent_GetProperties(raylib)
     if (NOT raylib_POPULATED) # Have we downloaded raylib yet?

Since I don't have raylib installed on my system, the first line fails and it falls back to fetching a specific git commit instead.

oskarnp avatar Sep 22 '23 05:09 oskarnp

@oskarnp @mackron I enabled the Visual Studio Address Sanitizer, ran rFXGen, and got the following output:

INFO: Initializing raylib 4.6-dev
INFO: Supported raylib modules:
INFO:     > rcore:..... loaded (mandatory)
INFO:     > rlgl:...... loaded (mandatory)
INFO:     > rshapes:... loaded (optional)
INFO:     > rtextures:. loaded (optional)
INFO:     > rtext:..... loaded (optional)
INFO:     > rmodels:... not loaded (optional)
INFO:     > raudio:.... loaded (optional)
INFO: DISPLAY: Device initialized successfully
INFO:     > Display size: 1920 x 1080
INFO:     > Screen size:  540 x 580
INFO:     > Render size:  540 x 580
INFO:     > Viewport offsets: 0, 0
INFO: GLAD: OpenGL extensions loaded successfully
INFO: GL: Supported extensions count: 241
INFO: GL: OpenGL device information:
INFO:     > Vendor:   Intel
INFO:     > Renderer: Intel(R) Iris(R) Xe Graphics
INFO:     > Version:  3.3.0 - Build 30.0.100.9864
INFO:     > GLSL:     3.30 - Build 30.0.100.9864
INFO: GL: VAO extension detected, VAO functions loaded successfully
INFO: GL: NPOT textures extension detected, full NPOT textures supported
INFO: GL: DXT compressed textures supported
INFO: GL: ETC2/EAC compressed textures supported
INFO: TEXTURE: [ID 1] Texture loaded successfully (1x1 | R8G8B8A8 | 1 mipmaps)
INFO: TEXTURE: [ID 1] Default texture loaded successfully
INFO: SHADER: [ID 1] Vertex shader compiled successfully
INFO: SHADER: [ID 2] Fragment shader compiled successfully
INFO: SHADER: [ID 3] Program shader loaded successfully
INFO: SHADER: [ID 3] Default shader loaded successfully
INFO: RLGL: Render batch vertex buffers loaded successfully in RAM (CPU)
INFO: RLGL: Render batch vertex buffers loaded successfully in VRAM (GPU)
INFO: RLGL: Default OpenGL state initialized successfully
INFO: TEXTURE: [ID 2] Texture loaded successfully (128x128 | GRAY_ALPHA | 1 mipmaps)
INFO: FONT: Default font loaded successfully (224 glyphs)
INFO: AUDIO: Device initialized successfully
INFO:     > Backend:       miniaudio / WASAPI
INFO:     > Format:        32-bit IEEE Floating Point -> 32-bit IEEE Floating Point
INFO:     > Channels:      2 -> 2
INFO:     > Sample rate:   48000 -> 48000
INFO:     > Periods size:  1440
INFO: TEXTURE: [ID 3] Texture loaded successfully (1032x128 | R8G8B8A8 | 1 mipmaps)
INFO: TEXTURE: [ID 1] Depth renderbuffer loaded successfully (32 bits)
INFO: FBO: [ID 1] Framebuffer object created successfully
INFO: TEXTURE: [ID 4] Texture loaded successfully (540x580 | R8G8B8A8 | 1 mipmaps)
INFO: TEXTURE: [ID 2] Depth renderbuffer loaded successfully (32 bits)
INFO: FBO: [ID 2] Framebuffer object created successfully
=================================================================
==22284==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7ff6a669f9ae at pc 0x7ffe300e18fb bp 0x009b02cf9000 sp 0x009b02cf8790
READ of size 8 at 0x7ff6a669f9ae thread T0
==22284==WARNING: Failed to use and restart external symbolizer!
    #0 0x7ffe300e18fa in _asan_wrap_GlobalSize+0x4e226 (C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\HostX64\x64\clang_rt.asan_dynamic-x86_64.dll+0x1800518fa)
    #1 0x7ff6a62b7a5f in sinfl_read64 C:\GitHub\raylib\src\external\sinfl.h:183
    #2 0x7ff6a62b7f8a in sinfl_refill C:\GitHub\raylib\src\external\sinfl.h:213
    #3 0x7ff6a62baf89 in sinfl_decompress C:\GitHub\raylib\src\external\sinfl.h:467
    #4 0x7ff6a62b76ba in sinflate C:\GitHub\raylib\src\external\sinfl.h:570
    #5 0x7ff6a6292510 in DecompressData C:\GitHub\raylib\src\rcore.c:3613
    #6 0x7ff6a62679f2 in GuiLoadStyleCyber C:\GitHub\rfxgen\src\styles\style_cyber.h:559
    #7 0x7ff6a626d18a in main C:\GitHub\rfxgen\src\rfxgen.c:414
    #8 0x7ff6a6586b78 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
    #9 0x7ff6a6586acd in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #10 0x7ff6a658698d in __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:330
    #11 0x7ff6a6586bed in mainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:16
    #12 0x7ffee5bc7343 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180017343)
    #13 0x7ffee63a26b0 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x1800526b0)

0x7ff6a669f9ae is located 0 bytes to the right of global variable 'cyberFontData' defined in 'style_cyber.h:42:21' (0x7ff6a669f0c0) of size 2286
SUMMARY: AddressSanitizer: global-buffer-overflow (C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\HostX64\x64\clang_rt.asan_dynamic-x86_64.dll+0x1800518fa) in _asan_wrap_GlobalSize+0x4e226
Shadow bytes around the buggy address:
  0x120414753ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x120414753ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x120414753f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x120414753f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x120414753f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x120414753f30: 00 00 00 00 00[06]f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x120414753f40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x120414753f50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x120414753f60: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x120414753f70: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x120414753f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
Address Sanitizer Error: Global buffer overflow

As per that output, there is some problem in DecompressData(), specifically in the sinflate() function of the DEFLATE library (sinfl.h): an 8-byte read starts exactly at the end of the 2286-byte cyberFontData global.

I've been investigating it for a couple of hours and couldn't determine the reason for the crash. I'm opening an issue on the raylib repo with a minimal code sample for further investigation.

raysan5 avatar Sep 25 '23 11:09 raysan5

This issue has been reviewed/fixed in https://github.com/raysan5/raylib/issues/3349

raysan5 avatar Mar 19 '24 09:03 raysan5