SwapchainKHR auto-finalized before its time (segfault)

Open dmillard opened this issue 1 year ago • 3 comments

Thank you for the carefully designed wrapper, it's really nice.

I can't seem to preserve my swapchain object throughout the life of my program. Although Base.GC.enable_logging(true) tells me that no GC is being run, very consistently Vulkan.set_preferences!("LOG_REFCOUNT" => "true") shows the refcount of my swapchain going to zero before trying to create framebuffers, and then I get a segfault from the validation layers upon trying to access the ImageView.

I've tried adding a GC.@preserve swapchain nothing at the end of my main function and tried to save the swapchain in a struct for later, but neither keeps the refcount above zero. Currently the only thing that works consistently is manually setting swapchain.destructor = _ -> nothing.

Package versions:

    [052768ef] CUDA v5.2.0
    [f7f18e0c] GLFW v3.4.1
    [90137ffa] StaticArrays v1.9.1
    [9f14b124] Vulkan v0.6.14
    [6fd5517d] glslang_jll v11.7.0+0

I'd try to submit an MWE but Vulkan doesn't seem very amenable to short reproducible scripts. I can try if necessary. Thanks!

dmillard avatar Jan 27 '24 08:01 dmillard

Hi, is the program of the following form?

function main()
  instance, device = init(...)
  swapchain = create_swapchain(...)
  view = get_image_view(swapchain, ...)
  # do something with `view`
end

Because swapchain is not used after its creation, finalizers for the swapchain may run at any time, even when the view is still being used.

To circumvent such issues, and fairly similarly to how you would do things in C++, you can insert a finalize(swapchain) at the end of the function. My understanding is that because swapchain is then used as an argument to finalize, finalizers won't run before that call, and since that call runs the finalizers directly, you can be certain that they run exactly at that moment. This allows you to clean up the swapchain (or any other object, for that matter) at a known, specific point in the program. Things generally work without it, but this is one particular case where one must be careful about implicit dependencies between variables.
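The behavior described above can be tried without Vulkan. The following is a minimal sketch, assuming a toy mutable struct stands in for a real handle; FakeHandle, destroyed, and main are hypothetical names for illustration and are not part of Vulkan.jl.

```julia
# Toy stand-in for a Vulkan handle: a mutable struct with a finalizer.
# `FakeHandle` and `destroyed` are hypothetical names, not Vulkan.jl API.
const destroyed = String[]

mutable struct FakeHandle
    name::String
    function FakeHandle(name)
        handle = new(name)
        # Record the destruction instead of calling into a driver.
        finalizer(h -> push!(destroyed, h.name), handle)
        return handle
    end
end

function main()
    swapchain = FakeHandle("swapchain")
    # ... work that implicitly depends on `swapchain` would go here ...
    alive_during_work = isempty(destroyed)
    # `swapchain` is used as an argument here, so it stays reachable up to
    # this call, and `finalize` runs its finalizer immediately.
    finalize(swapchain)
    return alive_during_work
end

alive_during_work = main()
```

After main returns, destroyed contains the handle's name, and the finalizer will not run a second time when the object is later collected.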

I guess this is what you tried with GC.@preserve swapchain nothing. I believe this GC.@preserve call only ensured that the swapchain remained alive while the enclosed expression (which is nothing here) was being evaluated. What you might have needed to do would be

GC.@preserve swapchain begin
  # do something with `view`
end

to make sure the swapchain remained alive specifically when you needed it. But I find such a solution to be generally inferior to finalize(swapchain).
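As a rough illustration of the scope of GC.@preserve, here is a small sketch that does not depend on Vulkan. Resource, finalized_ids, and use_resource are hypothetical names; the point is only that the object cannot be finalized while the enclosed block runs, even if a collection is forced.

```julia
# Hypothetical toy type; any mutable struct with a finalizer would do.
mutable struct Resource
    id::Int
end

const finalized_ids = Int[]

function use_resource()
    res = Resource(1)
    finalizer(r -> push!(finalized_ids, r.id), res)
    # `res` is guaranteed to stay alive for the whole `begin ... end` block,
    # and only for that block.
    return GC.@preserve res begin
        GC.gc()                      # force a collection inside the preserved region
        !(1 in finalized_ids)        # the finalizer has not run yet
    end
end

still_alive = use_resource()
```

Once the block ends, nothing in this sketch keeps res alive, so its finalizer may run at any later point.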

For short scripts, something you can also do to avoid inserting many finalize calls is to use GC.@preserve with an expression where you can be sure that if the expression terminates execution, all variables can be cleaned up. Such an example is available in the docs: https://juliagpu.github.io/Vulkan.jl/dev/tutorial/minimal_working_compute/#Getting-the-data where a bunch of resources needed for the execution of command buffers remain available until queue operations finish.

I hope this helps!

EDIT: the source code for the doc example is available at https://github.com/JuliaGPU/Vulkan.jl/blob/main/docs/src/tutorial/minimal_working_compute.jl#L301-L303, which is probably more convenient for reading through the code.

serenity4 avatar Jan 27 '24 12:01 serenity4

Adding a finalize(swapchain) call at the end did indeed help! Thank you.

Also I have a

struct VulkanInfo
    swapchain::SwapchainKHR
    ...
end

which I return at the end of my function - so long as it contains everything that's an implicit dependency, it's enough to keep it alive.

I wonder if there's some lightweight way for Julia GC to know to keep the implicit dependency alive so users don't have to think about it? Coming from a C++ background, the GC auto-finalization is much harder to predict than the neat lexical scoping rules of C++. Writing out a bunch of finalize statements is definitely workable but also requires the Vulkan.jl users to know which objects need to be kept alive for which events, which is decidedly a complex system for a new user.

dmillard avatar Jan 30 '24 05:01 dmillard

Good to hear the issue was solved! Note that if you return a struct, you are not guaranteed that its members are preserved; if for example you do something like

function main()
  info = init_vulkan()
  nothing
end

then the compiler is allowed to not even construct the structure in memory (and therefore may free early what would have been its members otherwise), as that operation wouldn't change semantics because info is unused. The structure might be constructed in a given version of Julia, but compiler internals may decide to do otherwise in the future.

The way I usually get around the issue is to create a structure, just like this VulkanInfo, and pass it along a program that eventually disposes of everything in due time, with a function that waits for GPU operations to finish before cleaning things up.
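A minimal sketch of that pattern, using toy handles rather than real Vulkan objects: Handle, AppResources, wait_idle, cleanup!, and run_app are all hypothetical names, and wait_idle is only a placeholder for whatever wait on GPU work the real program would perform.

```julia
# All names here are hypothetical and only illustrate the pattern.
const destruction_order = Symbol[]

mutable struct Handle
    kind::Symbol
    function Handle(kind::Symbol)
        h = new(kind)
        # Record the destruction instead of calling into a driver.
        finalizer(x -> push!(destruction_order, x.kind), h)
        return h
    end
end

# Everything with an implicit dependency lives in one structure.
struct AppResources
    swapchain::Handle
    views::Vector{Handle}
end

# Placeholder for waiting until GPU operations have finished.
wait_idle() = nothing

function cleanup!(app::AppResources)
    wait_idle()
    foreach(finalize, app.views)   # dependents first
    finalize(app.swapchain)        # then the object they depend on
    return nothing
end

function run_app()
    swapchain = Handle(:swapchain)
    app = AppResources(swapchain, [Handle(:view), Handle(:view)])
    # ... the program passes `app` along and uses it here ...
    cleanup!(app)
end

run_app()
```

Because app is passed to cleanup!, its members remain in use until that call, and the explicit finalize calls fix the destruction order instead of leaving it to the GC.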

I don't know of any way to declare extra dependencies between objects to the GC, and we can't rely on the GC to finalize things in the proper order anyway (when several objects are cleaned up in the same batch, the order in which they are finalized is undefined). But we can probably rely on the reference counting mechanism already in place and extend it so that extra dependencies between handles are declared within that mechanism. Something like

function main()
  swapchain = create_swapchain()
  views = get_swapchain_views(swapchain)

  depends_on(views, swapchain)
  # same as
  # for view in views
  #   depends_on(view, swapchain)
  # end

  # do anything with `views`
  # ...
  # `views` gets out of scope, each eventually gets finalized
  # `swapchain` may have been finalized before, but refcounting ensures the swapchain
  # is destroyed only after each view has been finalized
  nothing
end
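To make the idea concrete, here is a toy model of reference counting with declared dependencies. It is only a sketch of the concept: Counted, retain!, release!, and this depends_on are hypothetical and are not meant to reflect how Vulkan.jl implements its refcounting.

```julia
# Toy model of the idea; not Vulkan.jl's implementation.
const destroy_log = String[]

mutable struct Counted
    name::String
    refcount::Int
    deps::Vector{Counted}
end
Counted(name) = Counted(name, 1, Counted[])

retain!(x::Counted) = (x.refcount += 1; x)

function release!(x::Counted)
    x.refcount -= 1
    if x.refcount == 0
        push!(destroy_log, x.name)   # "destroy" the underlying handle
        foreach(release!, x.deps)    # then let go of what it depended on
    end
    return x
end

# `dependent` keeps `dependency` alive until `dependent` itself is destroyed.
function depends_on(dependent::Counted, dependency::Counted)
    retain!(dependency)
    push!(dependent.deps, dependency)
    return dependent
end

swapchain = Counted("swapchain")
image_view = Counted("view")
depends_on(image_view, swapchain)

release!(swapchain)    # finalized first: refcount 2 -> 1, not destroyed yet
release!(image_view)   # view destroyed, which releases the swapchain: 1 -> 0
```

In this model the swapchain is destroyed only after the view, regardless of which of the two is released first.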

serenity4 avatar Jan 31 '24 13:01 serenity4

I implemented depends_on as mentioned above here: https://github.com/JuliaGPU/Vulkan.jl/commit/29e253aa15074f591d26e9687f56880c1217a134

It is unexported, but public (though not marked as such with the new public keyword, because that keyword does not yet appear to be well supported: it causes VSCode to crash when opening the project).

One small difference to note though is that I didn't define it on iterables, so depends_on(views, swapchain) will have to be substituted by

for view in views
  depends_on(view, swapchain)
end

serenity4 avatar Jul 06 '24 13:07 serenity4