"Interactive Fractal" example is leaking memory on Windows

Open Mesoptier opened this issue 3 years ago • 8 comments

Template

  • Version of vulkano: 0.27.1 (I checked out master at bb2671785d7de4924e2aece17516de91f30e3c65)
  • OS: Windows 11 Home 64-bit
  • GPU (the selected PhysicalDevice): NVIDIA GeForce RTX 3080
  • GPU Driver: NVIDIA driver 496.76

Issue

I'm running cargo run --bin interactive_fractal and seeing the used memory rise at around 2.7MB/s in the task manager (~27 MB in ~10 seconds). The same occurs if I run with the --release flag, but all of the following information is without it.

I used the Windows Performance Recorder program to record about 20 seconds of the example running and you can clearly see the memory rising at a regular rate. All these VirtualAlloc commits seem to stay alive until the program is manually terminated after ~24 seconds.

image

When looking at the rows in the table, I'm noticing a lot (90%+) of commits have the exact same size (0.262 MB):

image

Here are the contents of the table in the above screenshot: virtualalloc commit lifetimes.txt (666KB)
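For anyone who wants to check the "90%+ have the same size" observation on their own trace, here is a minimal sketch of how the share of the most common commit size could be computed. It assumes the sizes have already been parsed into bytes; the function name `dominant_size` and the sample values are made up for illustration and are not taken from the attached file.

```rust
use std::collections::HashMap;

/// Returns the most common value in `sizes` and the fraction of entries
/// that have that value, or `None` if `sizes` is empty.
fn dominant_size(sizes: &[u64]) -> Option<(u64, f64)> {
    let mut counts: HashMap<u64, usize> = HashMap::new();
    for &size in sizes {
        *counts.entry(size).or_insert(0) += 1;
    }
    counts
        .into_iter()
        // Highest count wins; ties are broken by the larger size so the
        // result is deterministic.
        .max_by_key(|&(size, count)| (count, size))
        .map(|(size, count)| (size, count as f64 / sizes.len() as f64))
}

fn main() {
    // Made-up sample values, NOT the data from the attached table.
    let sizes: Vec<u64> = vec![262_144, 262_144, 4_096, 262_144, 65_536, 262_144];
    if let Some((size, share)) = dominant_size(&sizes) {
        println!(
            "most common commit size: {} bytes ({:.0}% of commits)",
            size,
            share * 100.0
        );
    }
}
```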

The commit stack for each of these is exactly the same:

[Root]
ntdll.dll!RtlUserThreadStart
kernel32.dll!BaseThreadInitThunk
interactive_fractal.exe!__scrt_common_main_seh
interactive_fractal.exe!main
interactive_fractal.exe!std::rt::lang_start<tuple$<> >
interactive_fractal.exe!std::rt::lang_start_internal
interactive_fractal.exe!std::rt::lang_start::closure$0<tuple$<> >
interactive_fractal.exe!std::sys_common::backtrace::__rust_begin_short_backtrace<void (*)(),tuple$<> >
interactive_fractal.exe!core::ops::function::FnOnce::call_once<void (*)(),tuple$<> >
interactive_fractal.exe!interactive_fractal::main
interactive_fractal.exe!interactive_fractal::compute_then_render
interactive_fractal.exe!interactive_fractal::app::FractalApp::compute
interactive_fractal.exe!interactive_fractal::fractal_compute_pipeline::FractalComputePipeline::compute
interactive_fractal.exe!vulkano::command_buffer::auto::AutoCommandBufferBuilder<vulkano::command_buffer::auto::PrimaryAutoCommandBuffer<vulkano::command_buffer::pool::standard::StandardCommandPoolAlloc>,vulkano::command_buffer::pool::standard::StandardCommandPoolBuilder>::build<vulkano::command_buffer::pool::standard::StandardCommandPoolBuilder>
interactive_fractal.exe!vulkano::command_buffer::synced::builder::SyncCommandBufferBuilder::build
interactive_fractal.exe!vulkano::command_buffer::synced::builder::commands::impl$0::bind_pipeline_compute::impl$0::send
interactive_fractal.exe!vulkano::command_buffer::sys::UnsafeCommandBufferBuilder::bind_pipeline_compute
interactive_fractal.exe!ash::vk::features::DeviceFnV1_0::cmd_bind_pipeline
nvoglv64.dll!<PDB not found>
nvoglv64.dll!<PDB not found>
nvoglv64.dll!<PDB not found>
KernelBase.dll!GlobalAlloc
ntdll.dll!RtlpAllocateHeapInternal
ntdll.dll!RtlpLowFragHeapAllocFromContext
ntdll.dll!RtlpAllocateUserBlock
ntdll.dll!RtlpAllocateUserBlockFromHeap
ntdll.dll!RtlpAllocateHeapInternal
ntdll.dll!RtlpAllocateHeap
ntdll.dll!RtlpExtendHeap
ntdll.dll!RtlpFindAndCommitPages
ntdll.dll!NtAllocateVirtualMemory
ntoskrnl.exe!KiSystemServiceCopyEnd
ntoskrnl.exe!NtAllocateVirtualMemory
ntoskrnl.exe!MiAllocateVirtualMemory

If I'm reading this correctly, cmd_bind_pipeline calls into nvoglv64.dll (an NVIDIA DLL?), which allocates memory for some reason, and that memory doesn't get cleaned up until the program exits.


Context: I'm writing my own program based on this specific example (I'm toying with ray marching) and am experiencing the exact same issue there, including the repeated allocations of exactly 0.262MB. In that program I am limiting the FPS to 60, which reduces the amount of memory allocated per second from ~2.7MB/s to ~0.2MB/s. If I disable the frame rate limiting, the memory usage also starts rising at a rate similar to the interactive fractal example.
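As a rough sanity check on these numbers, the sketch below converts the observed growth rates into commits per second and average bytes per frame. It assumes that "0.262 MB" means exactly 256 KiB (262,144 bytes) and that the growth rates are in decimal megabytes; both are assumptions, and the function names are only illustrative. Under those assumptions the figures work out to roughly ten commits per second uncapped and a bit under one per second at 60 FPS, i.e. a few KB per frame on average. That would be consistent with a small per-frame allocation that only occasionally forces the heap to grow (the RtlpExtendHeap frame in the stack above), though this is a guess rather than a diagnosis.

```rust
/// Size of one repeated commit, ASSUMING "0.262 MB" means 256 KiB.
const COMMIT_BYTES: f64 = 256.0 * 1024.0; // 262,144 bytes

/// Commits per second implied by a growth rate in MB/s (1 MB = 1e6 bytes).
fn commits_per_second(growth_mb_per_s: f64) -> f64 {
    growth_mb_per_s * 1_000_000.0 / COMMIT_BYTES
}

/// Average number of bytes retained per frame at a given frame rate.
fn bytes_per_frame(growth_mb_per_s: f64, fps: f64) -> f64 {
    growth_mb_per_s * 1_000_000.0 / fps
}

fn main() {
    // Rates reported above: ~2.7 MB/s uncapped, ~0.2 MB/s at 60 FPS.
    println!("uncapped:   {:.1} commits/s", commits_per_second(2.7));
    println!("60 FPS cap: {:.2} commits/s", commits_per_second(0.2));
    println!("60 FPS cap: {:.0} bytes/frame", bytes_per_frame(0.2, 60.0));
}
```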


I really hope this is enough information for someone with more experience with Vulkano/Vulkan (or graphics programming in general... or memory leaks... I'm not in my element here :P) to figure out what is happening here. If more information is required, I'm happy to spend more time on this!

Mesoptier, Dec 28 '21 22:12

It's hard to say whether this is a problem in Vulkano or your graphics driver. Are you able to test it with another driver, and a non-Vulkano program that uses Vulkan?

Rua, Dec 29 '21 09:12

I'm not sure about another graphics driver. I know there's Nouveau for Ubuntu, but are there alternatives for Windows?

I will have a look for a similar Vulkan program that doesn't use Vulkano.

Mesoptier, Dec 29 '21 10:12

I have cloned this repository, containing many Vulkan examples. I've tried a bunch of these examples, and none of them seem to leak any memory. In particular I tried the computeraytracing example, which I felt had the most similarity to the interactive_fractal example, due to its use of a compute shader for computing the ray tracing.

I've also run all the Vulkano examples that use an EventLoop (so they stay alive long enough to observe the memory leak):

  • buffer-pool - no leak
  • clear_attachments - no leak
  • gl-interop - (did not run)
  • indirect - LEAK!
  • instancing - no leak
  • multi-window - no leak
  • occlusion-query - no leak
  • tessellation - no leak
  • triangle - no leak
  • deferred - no leak
  • image - no leak
  • immutable-sampler - no leak
  • interactive_fractal - LEAK!
  • push-descriptors - no leak
  • runtime-shader - (did not run)
  • runtime_array - no leak
  • teapot - no leak

So only the indirect and interactive_fractal examples seem to leak memory for me. If I'm not mistaken, those are also the only ones that use a compute shader, which further suggests the problem is in that area.

Mesoptier, Dec 29 '21 11:12

cmd_bind_pipeline is a Vulkan API call; it is just the Ash binding for the vkCmdBindPipeline library function. So if the memory allocation is happening within that function (is it?) then it would seem that the driver is the problem. On the other hand, it may be behaving badly because of an error elsewhere in Vulkano that gives the driver incorrect data. Very hard to track down...

Rua, Dec 29 '21 11:12

While I'm also not experienced with leaks, I ran the following and found no leaks: valgrind --tool=memcheck --leak-check=yes cargo run --bin interactive_fractal

  • Vulkano: master
  • OS: Linux Mint 20.2
  • GPU: AMD Radeon RX 580
  • Driver: Mesa/RADV 21.3.2 (Vulkan 1.2.195)

Rua, Dec 29 '21 12:12

When I ran the example on my Ubuntu machine (a laptop without a dedicated graphics card), I didn't see any memory leaks either. That would incline me to think it's just a driver issue too, except that the non-Vulkano Vulkan examples did run without memory leaks...

I certainly agree that it's very hard to track down. Do you have any other ideas for what I might try to get this figured out?

Mesoptier, Dec 29 '21 13:12

I asked in a Rust group if anyone else can try it out and see if they can reproduce.

Rua, Dec 29 '21 13:12

I have also been seeing memory leaks on Windows. The same examples leak and don't leak on my machine as in the list above, and multi_window_game_of_life leaks for me as well. I suppose the unifying theme is that the leaking examples all use a compute shader in their rendering process.

In my own experiments, I've been able to get compute and rendering to work without leaking memory each frame by following the template in https://vulkano.rs/guide/windowing/event-handling. Specifically, each frame in the swapchain is given its own future in an array, and synchronization is based around that. I haven't been able to identify exactly why this prevents the leak, or what is leaking in the examples. My current suspicion is that a future that is dropped before it completes loses track of its resources, based partly on the phrasing of "If possible, checks whether the submission has finished. If so, gives up ownership of the resources used by these submissions." (emphasis is mine, https://docs.rs/vulkano/0.29.0/vulkano/sync/trait.GpuFuture.html#tymethod.cleanup_finished ).
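To make the pattern concrete, here is a minimal model of "one future per swapchain slot" in plain Rust. It does not use vulkano at all: `MockFrameFuture`, `FrameSlots`, and `simulate` are hypothetical stand-ins, and the "resources" are just a shared counter. The sketch only illustrates the bookkeeping (wait on a slot's previous future and release it before reusing the slot), which keeps the number of live futures bounded by the number of slots. It is not a claim about what vulkano does internally or about where the leak comes from.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for a submitted frame's future. It "owns" resources (modeled as
/// a shared live-counter) and gives them up when dropped.
/// Hypothetical type, NOT vulkano's GpuFuture.
struct MockFrameFuture {
    live: Rc<Cell<usize>>,
}

impl MockFrameFuture {
    fn submit(live: &Rc<Cell<usize>>) -> Self {
        live.set(live.get() + 1);
        MockFrameFuture { live: Rc::clone(live) }
    }

    /// Stand-in for blocking until the GPU has finished this frame.
    fn wait(&self) {}
}

impl Drop for MockFrameFuture {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

/// One optional future per swapchain slot.
struct FrameSlots {
    slots: Vec<Option<MockFrameFuture>>,
}

impl FrameSlots {
    fn new(slot_count: usize) -> Self {
        FrameSlots { slots: (0..slot_count).map(|_| None).collect() }
    }

    /// Before reusing a slot, wait for the work previously submitted for it
    /// and drop that future, then store the future of the new submission.
    fn submit_frame(&mut self, index: usize, live: &Rc<Cell<usize>>) {
        if let Some(previous) = self.slots[index].take() {
            previous.wait();
            // `previous` is dropped at the end of this block.
        }
        self.slots[index] = Some(MockFrameFuture::submit(live));
    }
}

/// Simulates `frames` frames and returns the peak number of live futures.
fn simulate(slot_count: usize, frames: usize) -> usize {
    let live = Rc::new(Cell::new(0));
    let mut slots = FrameSlots::new(slot_count);
    let mut peak = 0usize;
    for frame in 0..frames {
        slots.submit_frame(frame % slot_count, &live);
        peak = peak.max(live.get());
    }
    peak
}

fn main() {
    println!("peak live futures: {}", simulate(3, 1000));
}
```

In this model the peak never exceeds the slot count, no matter how many frames are simulated.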

I'll keep investigating, but thought I'd share what I've learned so far.

ryco117, May 02 '22 01:05

Tried to reproduce on Windows 10 with NV 1080 Ti (driver 527.56), but on my machine using https://github.com/vulkano-rs/vulkano/commit/10d734955633aad8fe816d5cd12e6f3728749539 I do not experience a memory leak running either interactive_fractal or indirect.

However the theory by @ryco117 sounds interesting to me and could imply that leaks could happen due to race conditions. @ryco117 do you still experience the leaks using current master?

trevex, Dec 10 '22 21:12

@trevex I did not experience a memory leak with either example, in either debug or release mode. I think it is fair to say that the recent refactors and cleanup work may be helping 🙂

ryco117, Dec 10 '22 22:12

I just tested the latest master on my machine and I no longer experienced any memory leaks in the interactive_fractal example!

I did a quick git bisect and found that the memory leak was fixed in 91dc54413507511bbb9df260ea9b984a5b1dcb67 (nice work @marc0246!)

Mesoptier, Dec 11 '22 10:12