
Potential synchronization bug (flashes in Halo 3) in the Nvidia driver on Ampere

Open Triang3l opened this issue 2 years ago • 3 comments

Validation

  • [X] I've read the FAQ.
  • [X] The Xenia build used is from the master branch (not MLBS/AlexVS/Canary/pull requests, etc.)
  • [X] This issue isn't for tech support (help with Xenia).
  • [X] If this issue occurs in a specific game, I've done analysis to locate the faulty subsystem of the emulator and a potential reason in it.
  • [X] I've checked if this issue hasn't already been reported.
  • [X] My device meets the minimum requirements: https://github.com/xenia-project/xenia/wiki/Quickstart#system-requirements
  • [ ] (If building) I have read the building doc: https://github.com/xenia-project/xenia/blob/master/docs/building.md

Describe what's going wrong

This is most likely not a bug on our side, but rather a reminder to report this issue to Nvidia.

In Halo 3, as of 55a91afcc7d192f1794e10bed9adbd3c072e2d6d (as well as the earliest version available on GitHub Releases), on the Direct3D 12 backend, with the RTV/DSV render target implementation, and resolution scaling disabled, flashes of various colors appear on the screen randomly, both in the main menu and the gameplay.

Screenshot of a pink flash

Sometimes this issue takes the form of large bloom blobs (usually purple, but sometimes pixels of other colors are taken), sometimes there are white rectangles with defined edges on the screen. These seem to leak the contents of some render targets (especially ones related to bloom) from before some draw (the purple color appears in the EDRAM tile padding on the right side during certain bloom passes), which suggests that some synchronization is missing.

This issue is reproducible on Nvidia GPUs with the Ampere architecture (however, it hasn't been tested on Turing and Volta). Specifically, for me, on the GeForce RTX 3080 Ti, in all my tests on Windows 11 build 22000.675, on the following driver versions (with the default 3D settings with no overrides, after a clean installation):

  • 512.95 (May 24, 2022 — Game Ready Driver)
  • 512.96 (May 23, 2022 — Studio Driver)
  • 472.12 (September 20, 2021 — Game Ready Driver)

@ZolaKluke has also confirmed this on the GeForce RTX 3090.

If frames are captured in PIX or RenderDoc, the bug can be seen in the final screenshot, but when the capture is analyzed, it's not visible in the Present input or anywhere before (at least I wasn't able to reproduce it there by switching between commands). Also, it doesn't appear in RenderDoc if replay looping is launched.

PIX warning analysis and GPU-based validation don't report any issues related to this ~~(aside from missing UNORDERED_ACCESS state for the shared memory buffer in draw commands, but this is related to dynamic switching between the SRV + index buffer and the UAV if memexport is used in the draws, because of which both SRV and UAV are always bound for the shared memory buffer, not something related to render target resolves — however, it's still something to investigate, maybe certain optimizations in the driver rely on actual binding information, we should try binding null descriptors instead of the unused ones)~~. I've checked the resource state transitions happening near resolves and texture loads, they all seem to be correct.

On the code side, I've also tried inserting a UAV barrier before every transition from the UAV state, as well as submitting each barrier in its own ResourceBarrier command instead of merging several into one. Neither has helped.
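For reference, the two experiments above can be sketched roughly as follows. This is only an illustrative model, not the actual Xenia code: `State`, `Barrier`, `Transition`, `FakeCommandList`, `BuildBarriers` and `Submit` are hypothetical stand-ins so that the sketch compiles without `d3d12.h`. Real code would fill `D3D12_RESOURCE_BARRIER` structures and pass them to `ID3D12GraphicsCommandList::ResourceBarrier`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the D3D12 resource states and barrier types.
enum class State : std::uint32_t {
  kRenderTarget,
  kCopySource,
  kShaderResource,
  kUnorderedAccess,
};
enum class BarrierType { kTransition, kUav };

struct Barrier {
  BarrierType type;
  int resource;  // Hypothetical resource handle.
  State before;  // Only meaningful for transitions.
  State after;   // Only meaningful for transitions.
};

struct Transition {
  int resource;
  State before;
  State after;
};

// Experiment 1: put a UAV barrier in front of every transition that leaves
// the UAV state, on the same resource.
std::vector<Barrier> BuildBarriers(const std::vector<Transition>& transitions) {
  std::vector<Barrier> out;
  for (const Transition& t : transitions) {
    if (t.before == State::kUnorderedAccess) {
      out.push_back({BarrierType::kUav, t.resource, t.before, t.before});
    }
    out.push_back({BarrierType::kTransition, t.resource, t.before, t.after});
  }
  return out;
}

// Stand-in for a command list; it only records how many barriers each
// ResourceBarrier call received.
struct FakeCommandList {
  std::vector<std::size_t> batch_sizes;
  void ResourceBarrier(std::size_t count, const Barrier* barriers) {
    (void)barriers;
    batch_sizes.push_back(count);
  }
};

// Experiment 2: with merge == false, each barrier goes into its own
// ResourceBarrier call instead of one merged call.
void Submit(FakeCommandList& list, const std::vector<Barrier>& barriers,
            bool merge) {
  if (barriers.empty()) {
    return;
  }
  if (merge) {
    list.ResourceBarrier(barriers.size(), barriers.data());
    return;
  }
  for (const Barrier& b : barriers) {
    list.ResourceBarrier(1, &b);
  }
}
```

With `merge` set to `false`, a list of three barriers results in three separate calls rather than one, which is the unmerged variant described above.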

If we don't find the cause of the issue on our side, we need to build a quick way for driver developers to reproduce it (most likely frame trace stream replay with output to a window, plus starting/stopping tracing), and then send Nvidia a frame trace and the replay application, along with the source code commit hash and the build instructions.

Describe what should happen

Screen-space effects should be rendered correctly and in a way that's stable between frames, just like on other hardware this has been tested on (Nvidia GeForce GTX 1070 — Pascal architecture, Intel UHD Graphics 630, AMD Radeon RX Vega 7).

If applicable, provide a callstack here, especially for crashes

No response

If applicable, upload a logfile and link it here

No response

Triang3l, Jun 06 '22 13:06

This also happens with the rasterizer-ordered view render backend implementation, with slightly different-looking flashes, though less frequently, probably because draws take longer.

Triang3l, Jun 07 '22 17:06

Separating the shared memory bindings into an SRV-only table and a UAV-only table unfortunately did not help (even though it eliminated the remaining correctness issues reported by PIX). This looks like purely a driver bug.

Triang3l, Oct 23 '22 15:10

Not experiencing this on 531.18 on my local build (with some memexport changes, but they're probably not related).

Triang3l, May 03 '23 09:05