
wip: Memory defragmentation

Open doitsujin opened this issue 1 year ago • 7 comments

Builds on all the reworks from the past couple of weeks to implement memory defragmentation.

Things still to do

  • [ ] Get rid of the format conversion context in the D3D9 front-end. Some D3D9 games may break before this is solved.
  • [x] Put a limit on the number of resources to relocate at once. Processing thousands in one go isn't free and might lead to frame time spikes.
  • [x] Only consider chunk allocations rather than the entire heap when deciding whether to defrag at all.
  • [ ] Test this a whole bunch
  • [x] Fix some useless error messages when we can't relocate a resource
  • [x] Work around some Nvidia driver bug which causes all apps to hang as soon as defragmentation happens (#4380)
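
The two allocator-related items above (a cap on relocations per pass, and a defrag decision based on chunk allocations only) can be sketched as follows. This is a toy model, not DXVK code: `should_defrag`, `pick_batch`, the 25% waste threshold and the limit of 64 relocations are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

CHUNK_SIZE = 256 << 20  # 256 MiB, the chunk size named in this PR


@dataclass
class Chunk:
    used: int  # bytes currently suballocated from this chunk


def should_defrag(chunks, min_waste_ratio=0.25):
    """Decide using chunk memory only.

    Heap-wide statistics (which would include dedicated allocations)
    are deliberately not an input. The threshold is hypothetical.
    """
    allocated = len(chunks) * CHUNK_SIZE
    if allocated == 0:
        return False
    used = sum(c.used for c in chunks)
    return (allocated - used) / allocated >= min_waste_ratio


def pick_batch(candidates, max_per_pass=64):
    """Cap the work done in one pass to avoid frame time spikes.

    The limit of 64 is a made-up number for this sketch.
    """
    return candidates[:max_per_pass]
```

With one full chunk and one chunk holding only 16 MiB, almost half of the chunk memory is unused, so the sketch would trigger a pass; with two full chunks it would not.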

What this does

As described in #4280, our memory allocator works on chunks of 256MB that are allocated from the system. The goal here is to make more efficient use of these chunks and actually return memory to the system if we have a lot of unused memory sitting around. This is especially important under memory pressure (even more so on Nvidia due to the need for dedicated allocations for e.g. render targets), or if the app in question isn't actually a game but rather a launcher that temporarily eats over 1GB of VRAM but frees most of it when getting minimized.

As an example, Metaphor ReFantazio (the demo version in this case) allocates over 4GB of memory right at the start, just to free most of it right away. Because some small allocations are scattered all across the memory chunks, we still need to keep the full 4.6GB of memory around:

[Screenshot: Bildschirmfoto-696]
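
To make the arithmetic concrete, here is a small sketch with made-up numbers (not measurements from the game): a chunk stays allocated as long as anything lives in it, so a few scattered megabytes can pin a large amount of memory. The helper names are hypothetical.

```python
import math

CHUNK_MIB = 256  # chunk size from this PR, in MiB


def chunks_held(live_mib_per_chunk):
    """A chunk cannot be freed while any allocation still lives in it."""
    return sum(1 for live in live_mib_per_chunk if live > 0)


def chunks_needed(live_mib_per_chunk):
    """Lower bound on chunks if all live data were packed together."""
    return math.ceil(sum(live_mib_per_chunk) / CHUNK_MIB)


# Hypothetical layout: four chunks, each pinned by 8 MiB of live data.
layout = [8, 8, 8, 8]
held_mib = chunks_held(layout) * CHUNK_MIB      # memory kept from the system
ideal_mib = chunks_needed(layout) * CHUNK_MIB   # memory needed if packed
```

In this toy layout 32 MiB of live data holds on to 1024 MiB, while a packed layout would need a single 256 MiB chunk.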

With defragmentation, we get a lot closer to what the game is actively using:

[Screenshot: Bildschirmfoto-697]

This only affects VRAM allocations that are not mapped into CPU address space. For CPU-accessible memory we require pointer stability, so moving those around dynamically is not feasible - that said, most mapped allocations are short-lived anyway so the problem usually solves itself.
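
The eligibility rule in the paragraph above can be written as a simple predicate. This is a sketch with hypothetical field names, not the actual DXVK resource type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    device_local: bool  # True if the resource lives in VRAM
    cpu_mapped: bool    # True if the app or driver holds a CPU pointer to it


def is_relocatable(res):
    """Only unmapped VRAM resources may move.

    Mapped memory needs a stable pointer, so it is never relocated.
    """
    return res.device_local and not res.cpu_mapped
```

For example, a render target in VRAM without a mapping would qualify, while a mapped staging buffer would not.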

The algorithm used here is very simple: we periodically look at the chunk that has the lowest amount of memory used and try to move its resources to other existing chunks, while preventing the allocator from reusing that chunk until the memory is actually needed. This way, we essentially produce empty chunks which can subsequently be freed. While not optimal in any way, this generally seems to work well in practice.
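
A minimal sketch of one such pass, under simplifying assumptions: sizes are plain MiB counts, a "move" is a dictionary update rather than a GPU copy, and the `blocked` flag stands in for whatever mechanism keeps the allocator away from the chunk being drained. All names and the move limit are hypothetical; lifting the block under memory pressure is not modelled.

```python
from dataclasses import dataclass, field

CHUNK_MIB = 256  # chunk size from this PR, in MiB


@dataclass
class Chunk:
    resources: dict = field(default_factory=dict)  # name -> size in MiB
    blocked: bool = False  # when True, new allocations skip this chunk

    @property
    def used(self):
        return sum(self.resources.values())


def defrag_step(chunks, max_moves=64):
    """Drain the least-used chunk into the others; return moves done."""
    candidates = [c for c in chunks if c.resources]
    if len(candidates) < 2:
        return 0  # nothing to consolidate
    victim = min(candidates, key=lambda c: c.used)
    victim.blocked = True  # keep the allocator from refilling it
    moves = 0
    # sorted() copies the items, so deleting from the dict below is safe.
    for name, size in sorted(victim.resources.items(), key=lambda kv: -kv[1]):
        if moves >= max_moves:
            break
        target = next(
            (c for c in chunks
             if c is not victim and not c.blocked and c.resources
             and c.used + size <= CHUNK_MIB),
            None,
        )
        if target is None:
            continue  # no room elsewhere; leave this resource in place
        target.resources[name] = size
        del victim.resources[name]
        moves += 1
    return moves


def release_empty(chunks):
    """Return the chunks that still hold data; the rest can be freed."""
    return [c for c in chunks if c.resources]
```

Given chunks holding 200, 15 and 100 MiB, one pass would move the two resources of the 15 MiB chunk into the first chunk, after which the emptied chunk can be released.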

What this does not (yet) do

We don't migrate any resources between VRAM and system memory yet. This is planned as a future PR and will likely be necessary in order to make Unity Engine games work better on cards with less than 12GB of VRAM (e.g. #4118).

This also means that if we do currently allocate a resource in system memory, we will not move it back to VRAM even if enough space is available, so performance in those games is still going to be an issue.

doitsujin avatar Oct 18 '24 13:10 doitsujin

Is it the same thing that affects Final Fantasy XVI on Steam Deck (it eats 11 GB RAM and 7 GB VRAM)?

Will this solve the problem on AMD cards with less VRAM?

Mupli avatar Oct 19 '24 05:10 Mupli

@doitsujin Hi, I was wondering if you could answer my previous questions. Thx.

Mupli avatar Oct 20 '24 11:10 Mupli

Note that it is the weekend and he is currently away.

This feature mostly "only" helps in situations where a game temporarily spikes in VRAM usage, or holds on to allocations that were previously in use but no longer are, e.g. when going between zones or menus in a game. That might also happen during the initial loading screen.

Current master, which this builds on, should already be better with regard to used versus allocated VRAM. The benefit differs per game, obviously.

Blisto91 avatar Oct 20 '24 11:10 Blisto91

@Mupli it will change nothing for FFXVI because that uses D3D12.

mbriar avatar Oct 20 '24 12:10 mbriar

@Blisto91 - There is no weekend for passionate devs :D @mbriar - got it. Thx. On AMD, the game tries to fit 12GB of VRAM usage (PS5 port) into 4-8GB cards. I felt like this could help. But yeah, different tech.

Mupli avatar Oct 20 '24 12:10 Mupli

> @Blisto91 - There is no weekend for passionate devs :D @mbriar - got it. Thx. On AMD, the game tries to fit 12GB of VRAM usage (PS5 port) into 4-8GB cards. I felt like this could help. But yeah, different tech.

Have you given SteamOS 3.6 and Proton Experimental a shot? I'm interested to know!

jams3223 avatar Oct 21 '24 11:10 jams3223

Please keep comments unrelated to this PR to a minimum.

Blisto91 avatar Oct 21 '24 11:10 Blisto91

> our memory allocator works on chunks of 256MB that are allocated from the system.

Mr. Philip, would changing to 192MB bring an improvement, or is the difference too small to be worth it?

ViNi-Arco avatar Oct 23 '24 13:10 ViNi-Arco

I don't see why reducing the chunk size would help with anything. We still want to avoid having to put large resources into their own Vulkan allocations, and small chunks increase overall fragmentation.
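
As a rough illustration of the first point, assuming for simplicity that a resource needs its own allocation exactly when it does not fit into a chunk (the real cutoff may well be lower; `needs_dedicated` is a hypothetical helper):

```python
def needs_dedicated(resource_mib, chunk_mib):
    """Simplified rule: anything larger than a chunk gets its own allocation."""
    return resource_mib > chunk_mib


# A hypothetical 200 MiB resource under the two chunk sizes discussed here.
fits_in_256 = not needs_dedicated(200, 256)
fits_in_192 = not needs_dedicated(200, 192)
```

Under this rule the 200 MiB resource can be suballocated from a 256MB chunk but would fall back to a dedicated allocation with 192MB chunks.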

256MB works well in practice, and for apps that don't use a lot of VRAM we now even start with much smaller chunks to reduce memory waste. This was a change in 2.4.1, primarily intended to improve the whole game launcher situation.
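
One way such a "start small" policy could look is sketched below. This is purely an assumption for illustration: the 16 MiB starting size and the doubling rule are invented, and only the 256 MiB upper bound comes from this thread.

```python
MAX_CHUNK_MIB = 256  # upper bound, from this thread
MIN_CHUNK_MIB = 16   # hypothetical starting size


def next_chunk_size(allocated_mib):
    """Pick a chunk size that grows with what the app already uses.

    Hypothetical rule: double the size while the app has allocated at
    least four times the current candidate size, capped at 256 MiB.
    """
    size = MIN_CHUNK_MIB
    while size < MAX_CHUNK_MIB and size * 4 <= allocated_mib:
        size *= 2
    return size
```

With this rule a launcher that has allocated next to nothing gets 16 MiB chunks, while an app already holding several gigabytes gets the full 256 MiB.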

doitsujin avatar Oct 23 '24 14:10 doitsujin