obs-studio icon indicating copy to clipboard operation
obs-studio copied to clipboard

libobs: Add memory leak trace

Open exeldro opened this issue 1 year ago • 7 comments

Description

Add command-line --bmem-trace to enable keeping stacktraces for each bmalloc which includes bzalloc On shutdown the stack traces are printed in the OBS log file like this:

Number of memory leaks: 1 memory leak[0] 00000263DF6C2C80 memory leak trace[0]: bmalloc memory leak trace[1]: bzalloc memory leak trace[2]: obs_context_init_control memory leak trace[3]: obs_source_init memory leak trace[4]: obs_source_create_internal memory leak trace[5]: obs_source_create_private memory leak trace[6]: CanvasDock::CanvasDock memory leak trace[7]: obs_module_post_load memory leak trace[8]: obs_post_load_modules

It also does a simple memory overrun test for each allocation. It can only be enabled when there are no allocations, that is why obs_set_cmdline_args moved down.

Based on https://github.com/norihiro/obs-studio/tree/bmem-trace-memory-leak by @norihiro

Motivation and Context

Want an easy way to find the cause of memory leaks that users have that can not be easily replicated.

How Has This Been Tested?

On windows 64 bit

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • [x] My code has been run through clang-format.
  • [x] I have read the contributing document.
  • [x] My code is not on the master branch.
  • [x] The code has been tested.
  • [x] All commit messages are properly formatted and commits squashed where appropriate.
  • [x] I have included updates to all appropriate documentation.

exeldro avatar Jan 17 '24 10:01 exeldro

This being a runtime argument makes me wonder if there's a performance impact even on non-traced runs here. I'd imagine it could lead to less aggressive compiler optimisations, did you measure this?

gxalpha avatar Jan 17 '24 10:01 gxalpha

Tested performance with RelWithDebug build on windows 11 without a debugger attached on master, this pr non-traced and this pr traced.

Mainly looking at the audio and video thread stats as other things my pc is doing averages out on that. It looks to me like to small of a difference to tell from these numbers.

audio thread:

min median max 99th percentile
master 0.015 0.117 6.902 3.094
non-traced 0.015 0.117 21.701 2.982
traced 0.015 0.148 7.605 4.361

graphics thread:

min median max 99th percentile below 33.333 ms
master 0.097 2.892 1123.38 14.39 99.6283%
non-traced 0.104 2.969 1188.4 10.007 99.6552%
traced 0.102 3.606 1237.3 17.995 99.3322%
More of the logs here

Master:

12:09:50.166: == Profiler Results ============================= 12:09:50.166: run_program_init: 3078.6 ms 12:09:50.166: ┣OBSApp::AppInit: 16.561 ms 12:09:50.166: ┃ ┗OBSApp::InitLocale: 1.089 ms 12:09:50.166: ┗OBSApp::OBSInit: 3030.46 ms 12:09:50.166: ┣obs_startup: 11.814 ms 12:09:50.166: ┗OBSBasic::OBSInit: 2768.93 ms 12:09:50.166: ┣OBSBasic::InitBasicConfig: 0.375 ms 12:09:50.166: ┣OBSBasic::ResetAudio: 0.188 ms 12:09:50.166: ┣OBSBasic::ResetVideo: 186.755 ms 12:09:50.166: ┃ ┗obs_init_graphics: 184.736 ms 12:09:50.166: ┃ ┗shader compilation: 77.127 ms 12:09:50.166: ┣OBSBasic::InitOBSCallbacks: 0.007 ms 12:09:50.166: ┣OBSBasic::InitHotkeys: 0.022 ms 12:09:50.166: ┣obs_load_all_modules2: 514.14 ms ... 12:09:50.167: ┣OBSBasic::InitService: 1.347 ms 12:09:50.167: ┣OBSBasic::ResetOutputs: 0.444 ms 12:09:50.167: ┣OBSBasic::CreateHotkeys: 0.032 ms 12:09:50.167: ┣OBSBasic::InitPrimitives: 0.105 ms 12:09:50.167: ┗OBSBasic::Load: 1222.97 ms 12:09:50.167: obs_hotkey_thread(25 ms): min=0.004 ms, median=0.009 ms, max=0.586 ms, 99th percentile=0.317 ms, 100% below 25 ms 12:09:50.167: audio_thread(Audio): min=0.015 ms, median=0.117 ms, max=6.902 ms, 99th percentile=3.094 ms 12:09:50.167: obs_graphics_thread(33.3333 ms): min=0.097 ms, median=2.892 ms, max=1123.38 ms, 99th percentile=14.39 ms, 99.6283% below 33.333 ms 12:09:50.167: ┣tick_sources: min=0.001 ms, median=0.041 ms, max=1120.83 ms, 99th percentile=8.424 ms 12:09:50.167: ┣output_frame: min=0.09 ms, median=2.556 ms, max=17.759 ms, 99th percentile=6.286 ms 12:09:50.167: ┃ ┗gs_context(video->graphics): min=0.015 ms, median=0.132 ms, max=17.72 ms, 99th percentile=5.643 ms, 1.88662 calls per parent call 12:09:50.167: ┃ ┣render_video: min=0.002 ms, median=0.013 ms, max=17.336 ms, 99th percentile=5.393 ms 12:09:50.167: ┃ ┃ ┗render_main_texture: min=0.002 ms, median=0.011 ms, max=17.334 ms, 99th percentile=5.389 ms 12:09:50.167: ┃ ┗gs_flush: min=0.01 ms, median=0.105 ms, max=1.845 ms, 99th percentile=0.51 ms 12:09:50.167: ┗render_displays: min=0 ms, median=0.263 ms, max=3.768 ms, 99th percentile=0.786 ms 12:09:50.167: ================================================= 12:09:50.167: == Profiler Time Between Calls ================== 12:09:50.167: obs_hotkey_thread(25 ms): min=24.13 ms, median=25.37 ms, max=27.807 ms, 57.1429% within ±2% of 25 ms (0.393185% lower, 42.464% higher) 12:09:50.167: obs_graphics_thread(33.3333 ms): min=9.931 ms, median=33.333 ms, max=1123.4 ms, 99.2551% within ±2% of 33.333 ms (0.372439% lower, 0.372439% higher)

non-traced:

12:01:07.187: == Profiler Results ============================= 12:01:07.187: run_program_init: 4372.29 ms 12:01:07.187: ┣OBSApp::AppInit: 27.963 ms 12:01:07.187: ┃ ┗OBSApp::InitLocale: 7.929 ms 12:01:07.187: ┗OBSApp::OBSInit: 4314.32 ms 12:01:07.187: ┣obs_startup: 4.247 ms 12:01:07.187: ┗OBSBasic::OBSInit: 4076.81 ms 12:01:07.187: ┣OBSBasic::InitBasicConfig: 0.483 ms 12:01:07.187: ┣OBSBasic::ResetAudio: 0.187 ms 12:01:07.187: ┣OBSBasic::ResetVideo: 240.276 ms 12:01:07.187: ┃ ┗obs_init_graphics: 238.259 ms 12:01:07.187: ┃ ┗shader compilation: 123.483 ms 12:01:07.187: ┣OBSBasic::InitOBSCallbacks: 0.012 ms 12:01:07.187: ┣OBSBasic::InitHotkeys: 0.03 ms 12:01:07.187: ┣obs_load_all_modules2: 1714.06 ms ... 12:01:07.187: ┣OBSBasic::InitService: 1.473 ms 12:01:07.187: ┣OBSBasic::ResetOutputs: 0.47 ms 12:01:07.187: ┣OBSBasic::CreateHotkeys: 0.035 ms 12:01:07.187: ┣OBSBasic::InitPrimitives: 0.103 ms 12:01:07.187: ┗OBSBasic::Load: 1253.48 ms 12:01:07.187: obs_hotkey_thread(25 ms): min=0.003 ms, median=0.009 ms, max=0.95 ms, 99th percentile=0.245 ms, 100% below 25 ms 12:01:07.187: audio_thread(Audio): min=0.015 ms, median=0.117 ms, max=21.701 ms, 99th percentile=2.982 ms 12:01:07.187: obs_graphics_thread(33.3333 ms): min=0.104 ms, median=2.969 ms, max=1188.4 ms, 99th percentile=10.007 ms, 99.6552% below 33.333 ms 12:01:07.187: ┣tick_sources: min=0.001 ms, median=0.04 ms, max=1186.43 ms, 99th percentile=0.359 ms 12:01:07.187: ┣output_frame: min=0.091 ms, median=2.615 ms, max=25.674 ms, 99th percentile=6.764 ms 12:01:07.187: ┃ ┗gs_context(video->graphics): min=0.019 ms, median=0.124 ms, max=25.563 ms, 99th percentile=5.72 ms, 1.84828 calls per parent call 12:01:07.187: ┃ ┣render_video: min=0.003 ms, median=0.013 ms, max=24.99 ms, 99th percentile=5.439 ms 12:01:07.187: ┃ ┃ ┗render_main_texture: min=0.002 ms, median=0.01 ms, max=24.986 ms, 99th percentile=5.434 ms 12:01:07.187: ┃ ┗gs_flush: min=0.014 ms, median=0.103 ms, max=1.632 ms, 99th percentile=0.527 ms 12:01:07.187: ┗render_displays: min=0 ms, median=0.251 ms, max=7.321 ms, 99th percentile=0.868 ms 12:01:07.187: ================================================= 12:01:07.187: == Profiler Time Between Calls ================== 12:01:07.187: obs_hotkey_thread(25 ms): min=24.573 ms, median=25.326 ms, max=28.409 ms, 57.6456% within ±2% of 25 ms (0% lower, 42.3544% higher) 12:01:07.187: obs_graphics_thread(33.3333 ms): min=11.355 ms, median=33.333 ms, max=1188.41 ms, 99.3092% within ±2% of 33.333 ms (0.345423% lower, 0.345423% higher)

traced:

12:02:30.147: == Profiler Results ============================= 12:02:30.147: run_program_init: 3738.62 ms 12:02:30.147: ┣OBSApp::AppInit: 21.184 ms 12:02:30.147: ┃ ┗OBSApp::InitLocale: 5.285 ms 12:02:30.147: ┗OBSApp::OBSInit: 3686.18 ms 12:02:30.147: ┣obs_startup: 5.637 ms 12:02:30.147: ┗OBSBasic::OBSInit: 3418.15 ms 12:02:30.147: ┣OBSBasic::InitBasicConfig: 1.212 ms 12:02:30.147: ┣OBSBasic::ResetAudio: 0.176 ms 12:02:30.147: ┣OBSBasic::ResetVideo: 216.352 ms 12:02:30.147: ┃ ┗obs_init_graphics: 214.335 ms 12:02:30.147: ┃ ┗shader compilation: 104.593 ms 12:02:30.147: ┣OBSBasic::InitOBSCallbacks: 0.017 ms 12:02:30.147: ┣OBSBasic::InitHotkeys: 0.138 ms 12:02:30.147: ┣obs_load_all_modules2: 555.888 ms ... 12:02:30.147: ┣OBSBasic::InitService: 1.645 ms 12:02:30.147: ┣OBSBasic::ResetOutputs: 1.176 ms 12:02:30.147: ┣OBSBasic::CreateHotkeys: 0.17 ms 12:02:30.147: ┣OBSBasic::InitPrimitives: 0.13 ms 12:02:30.147: ┗OBSBasic::Load: 1354.54 ms 12:02:30.147: obs_hotkey_thread(25 ms): min=0.005 ms, median=0.012 ms, max=1.657 ms, 99th percentile=0.372 ms, 100% below 25 ms 12:02:30.147: audio_thread(Audio): min=0.015 ms, median=0.148 ms, max=7.605 ms, 99th percentile=4.361 ms 12:02:30.147: obs_graphics_thread(33.3333 ms): min=0.102 ms, median=3.606 ms, max=1237.3 ms, 99th percentile=17.995 ms, 99.3322% below 33.333 ms 12:02:30.147: ┣tick_sources: min=0.001 ms, median=0.052 ms, max=1235.21 ms, 99th percentile=6.633 ms 12:02:30.147: ┣output_frame: min=0.09 ms, median=3.193 ms, max=18.014 ms, 99th percentile=7.991 ms 12:02:30.147: ┃ ┗gs_context(video->graphics): min=0.027 ms, median=0.193 ms, max=17.951 ms, 99th percentile=6.894 ms, 1.90651 calls per parent call 12:02:30.147: ┃ ┣render_video: min=0.005 ms, median=0.019 ms, max=17.578 ms, 99th percentile=6.552 ms 12:02:30.147: ┃ ┃ ┗render_main_texture: min=0.003 ms, median=0.013 ms, max=17.574 ms, 99th percentile=6.545 ms 12:02:30.147: ┃ ┗gs_flush: min=0.017 ms, median=0.116 ms, max=1.549 ms, 99th percentile=0.551 ms 12:02:30.147: ┗render_displays: min=0 ms, median=0.297 ms, max=3.429 ms, 99th percentile=1.008 ms 12:02:30.147: ================================================= 12:02:30.147: == Profiler Time Between Calls ================== 12:02:30.147: obs_hotkey_thread(25 ms): min=24.672 ms, median=25.26 ms, max=42.414 ms, 60.3995% within ±2% of 25 ms (0% lower, 39.6005% higher) 12:02:30.147: obs_graphics_thread(33.3333 ms): min=10.824 ms, median=33.333 ms, max=1237.34 ms, 98.6622% within ±2% of 33.333 ms (0.668896% lower, 0.668896% higher)

exeldro avatar Jan 17 '24 11:01 exeldro

When I started to implement this idea, base_set_allocator was an interesting feature for a 3rd party plugin to replace the behavior of bmalloc and bfree to have the memory leak trace. However, even before the deprecation, any 3rd party plugins cannot call base_set_allocator because some memory spaces are allocated by the default bmalloc before the plugin was loaded so that bfree and brealloc cannot identify the space was allocated by the default bmalloc or the overwritten bmalloc. IMHO, I want to have bmem.c in the master branch as simple as possible to avoid potential bugs in future.

norihiro avatar Jan 17 '24 14:01 norihiro

I always wonder what the benefit of this is over using built-in tools of Xcode or MSVC for this? Are leaks generated by our bmem implementation not caught by MSVC or Instruments?

Instruments_Leaks_Tool

PatTheMav avatar Jan 17 '24 16:01 PatTheMav

The benefit of this is that users can run it without installing extra tools to find issues I am not able find local

exeldro avatar Jan 17 '24 16:01 exeldro

The benefit of this is that users can run it without installing extra tools to find issues I am not able find local

Got it - yeah makes sense, though one could make the argument that users should not receive code that leaks in the first place and we can avoid that by running static code analysis and leak checks before releasing software (and the same applies to plugin authors).

The entire leak reporting stuff we have in code is something that should never ship in a release version of software (it's fine in the debug version of it), but I guess that's not a battle I'll win.

From the perf numbers I posted I'm not sure if the deviations aren't within expected bounds and there's no big impact in practice, though I'm still unhappy about shipping what is effectively debug builds to users.

PatTheMav avatar Jan 17 '24 16:01 PatTheMav

I think it would make sense to only enable this in debug mode.

cg2121 avatar Jan 29 '24 19:01 cg2121