[replay] Possible to crash replay of trimmed capture with orphaned debug utils messenger

Open bradgrantham-lunarg opened this issue 4 months ago • 0 comments

Describe the replay bug:

If an instance is created and a debug utils messenger is created on that instance (in CreateVkDevice) during tracking (before WriteState), and DestroyVkDevice is then called without the debug messenger having been destroyed, replay will crash. Although the crash shows up at replay, the underlying issue is that a command was recorded to the capture file as part of WriteState with a handle for an object that was not created.
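To make the failure mode concrete, here is a minimal sketch of a state table that forgets a parent but keeps its child. All names (`StateTable`, `TrackedObject`, `FindOrphans`) are hypothetical and are not taken from the GFXReconstruct source; the sketch only assumes that tracked objects record the handle they were created from.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a capture layer's tracked state.
// None of these names come from GFXReconstruct.
using HandleId = std::uint64_t;

struct TrackedObject {
    std::string type;    // e.g. "VkInstance", "VkDebugUtilsMessengerEXT"
    HandleId    parent;  // 0 means "no parent"
};

class StateTable {
public:
    void Add(HandleId id, const std::string& type, HandleId parent) {
        assert(id != 0);  // 0 is reserved for "no parent"
        TrackedObject obj;
        obj.type   = type;
        obj.parent = parent;
        objects_[id] = obj;
    }

    // Models the problematic behavior: only the destroyed object is
    // removed; objects created from it stay in the table.
    void Remove(HandleId id) { objects_.erase(id); }

    // Returns ids of tracked objects whose parent is no longer tracked.
    // Writing a create command for any of these would reference a
    // handle that the replayer never created.
    std::vector<HandleId> FindOrphans() const {
        std::vector<HandleId> orphans;
        for (const auto& entry : objects_) {
            const HandleId parent = entry.second.parent;
            if (parent != 0 && objects_.count(parent) == 0) {
                orphans.push_back(entry.first);
            }
        }
        return orphans;
    }

private:
    std::map<HandleId, TrackedObject> objects_;
};
```

In this model, adding an instance and a messenger, then removing only the instance, leaves the messenger as the single orphan. A check like `FindOrphans()` before writing state would at least flag the condition instead of emitting a command that replay cannot satisfy.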

While the messenger handle is orphaned, it looks to my naive eyes like this is not invalid usage. The only thing the spec says is:

VkInstance objects can be destroyed once all VkDevice objects created from any of its VkPhysicalDevice objects have been destroyed.

Presumably the implementation can walk the children of the VkInstance and mark anything orphaned as "invalid" or "don't use" or something.

Anything created from a VkInstance that isn't a VkDevice could conceivably be leaked in similar circumstances and GFXR would crash attempting to replay that object. I have found these in vulkan_core.h:

  • VkSurfaceKHR from vkCreateDisplayPlaneSurfaceKHR
  • VkDebugReportCallbackEXT from vkCreateDebugReportCallbackEXT
  • VkDebugUtilsMessengerEXT from vkCreateDebugUtilsMessengerEXT
  • VkSurfaceKHR from vkCreateHeadlessSurfaceEXT

It's possible to create an instance, do a lot of work with the instance and devices created from it, then destroy the devices and the instance, and none of those commands show up in the capture file. So I am not sure whether this is a large omission, in which case we need to add some new kind of tracking of things to delete from the state vector when the instance is destroyed, or just a minor omission for one or more of these handle types. Needs a little investigation.
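As a rough illustration of the "new kind of tracking" option, the sketch below removes everything created from an object when that object is destroyed. It is a hypothetical model only: `CascadingStateTable`, `Node`, and `RemoveWithChildren` are invented names, and the real state tracker may organize parent/child relationships quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model; not GFXReconstruct's actual state tracker.
using HandleId = std::uint64_t;

class CascadingStateTable {
public:
    void Add(HandleId id, const std::string& type, HandleId parent) {
        assert(id != 0);  // 0 is reserved for "no parent"
        Node node;
        node.type   = type;
        node.parent = parent;
        nodes_[id]  = node;
        auto it = nodes_.find(parent);
        if (it != nodes_.end()) {
            it->second.children.push_back(id);  // remember who to clean up
        }
    }

    // Removes the object and, transitively, everything created from it.
    void RemoveWithChildren(HandleId id) {
        auto root = nodes_.find(id);
        if (root == nodes_.end()) return;

        // Detach from the parent so no stale child id is left behind.
        auto parent = nodes_.find(root->second.parent);
        if (parent != nodes_.end()) {
            std::vector<HandleId>& siblings = parent->second.children;
            siblings.erase(std::remove(siblings.begin(), siblings.end(), id),
                           siblings.end());
        }

        // Iterative walk over the subtree rooted at id.
        std::vector<HandleId> pending(1, id);
        while (!pending.empty()) {
            const HandleId current = pending.back();
            pending.pop_back();
            auto it = nodes_.find(current);
            if (it == nodes_.end()) continue;
            pending.insert(pending.end(), it->second.children.begin(),
                           it->second.children.end());
            nodes_.erase(it);
        }
    }

    bool Contains(HandleId id) const { return nodes_.count(id) != 0; }
    std::size_t Size() const { return nodes_.size(); }

private:
    struct Node {
        std::string           type;
        HandleId              parent;
        std::vector<HandleId> children;
    };
    std::map<HandleId, Node> nodes_;
};
```

With this shape, destroying an instance that still owns a messenger and a surface drops all three entries, while unrelated instances stay tracked. Whether silently dropping leaked children is the right policy, as opposed to warning about them, is a separate design question.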

Verify before submission:

  • Was trimming enabled? Yes
  • Was replayer renamed if necessary? No
  • Was --sync used if title is known to need forced synchronization? NA

Build Environment: Please include the SHA and PR or branch name used in capture and also used to build the replayer.

9c295b4

To Reproduce: Steps to reproduce the behavior:

In an app, create an instance, create a debug utils messenger on that instance, and then destroy the instance. Capture with trimming after frame 1. Sample was tested with GFXRECON_CAPTURE_FRAMES=100-200.

System environment:

  • GPU and driver version on which capture was taken
  • GPU and driver version on which capture file was replayed with issue: Demonstrated on an Android sample and also a different Windows sample

bradgrantham-lunarg, Oct 07 '24 23:10