dxvk icon indicating copy to clipboard operation
dxvk copied to clipboard

[d3d11] Don't perform extraneous queue flushes on NVIDIA

Open liam-middlebrook opened this issue 4 years ago • 8 comments

An API Capture of Assassins Creed 3 shows a significant performance drop of 34% (averaged across tested GPUs in farm) after commit 34fd16b8f22af023abe23cb68daafc23d6f23bbe

As an alternative to this change (which effectively reverts the aforementioned commit for NVIDIA), I tested the following other changes.

  • Tweaking MinFlushIntervalUs to values of 1500us and 3000us
  • Tweaking IncFlushIntervalUs to values of 500us and 1000us
  • Tweaking MaxPendingSubmits to values of 1, 2, and 4
  • Changing the StrongHint argument for these flushes to be FALSE.

The only combination of changes that seemed to produce favorable results were to have these flushes not be strongly hinted, and decrease the maximum number of weak pending submissions to 2 (or less).

Perf testing - RTX 2060 SUPER - Driver 510.54 Before: 170 FPS After: 274 FPS

liam-middlebrook avatar Mar 17 '22 18:03 liam-middlebrook

If this commit causes such a large performance hit, that's probably not exclusive to Nvidia. As far as I understand it, the commit was basically just made speculative, so it's probably a better idea to just revert it entirely.

K0bin avatar Mar 17 '22 19:03 K0bin

Queue flushing like this is not really a vendor specific action, so ye, should see the impact on radv too

misyltoad avatar Mar 17 '22 22:03 misyltoad

I don't have any hardware that's supported by RADV, so I can't confirm there; but I'm happy to change the commit to just be a revert of https://github.com/doitsujin/dxvk/commit/34fd16b8f22af023abe23cb68daafc23d6f23bbe if that's the direction people want.

liam-middlebrook avatar Mar 18 '22 00:03 liam-middlebrook

How many queue submissions per frame are we talking about here?

Reverting that commit in question seems like a rather bad idea since it can lead unstable performance in games that are limited by GPU->CPU readbacks. I guess we can implement the stall detection logic we have in place for queries though to reduce flushes.

doitsujin avatar Mar 18 '22 13:03 doitsujin

The real game seems to be sitting at a handful of submissions per frame even in CPU-bound scenarios, which seems reasonable: Bildschirmfoto

Does the capture replay thing you're using insert a lot of staging buffer copies or something?

doitsujin avatar Mar 19 '22 19:03 doitsujin

I tried to launch the full game, but kept hitting UPlay login issues.

Anyways, the HUD stats from my run show similar submission count, my change knocks out a single submission.

Before: 2022-03-21-185206_3840x2160_scrot

After: 2022-03-21-185221_3840x2160_scrot

I guess we can implement the stall detection logic we have in place for queries though to reduce flushes.

That seems reasonable; out of curiosity how often are you seeing games have a usage-pattern that would be impacted by that?

liam-middlebrook avatar Mar 22 '22 05:03 liam-middlebrook

Do you know where those 2 queue syncs are coming from?

K0bin avatar Mar 22 '22 08:03 K0bin

Hm, this makes me wonder if either the code in FlushImplicit itself is causing extra overhead, or if the submission timing is just unfortunate, especially since there seem to be readbacks of some kind going on.

I pushed a branch here, does this improve things again?

out of curiosity how often are you seeing games have a usage-pattern that would be impacted by that?

Everything that stalls on resource readback is potentially affected. Not a huge issue most of the time but I've seen a bunch of games do something like this:

/* depth pass, some compute stuff */
ctx->CopyResource(stagingBuffer, gpuResource);
/* g-buffer pass, lighting etc */
ctx->Map(stagingBuffer, 0, D3D11_MAP_READ, ...);

Flushing on CopyResource in situations like this is beneficial since we won't randomly have to wait for an additional render pass to complete on the GPU that way.

doitsujin avatar Mar 22 '22 10:03 doitsujin

master completely reworked queue submission logic, so this is no longer useful.

doitsujin avatar Jan 17 '23 14:01 doitsujin