gamemode
Allow setting CPU affinity
Is your feature request related to a problem? Please describe. When running on Ryzen CPUs, certain games benefit from setting the CPU affinity. One such example is Middle-earth: Shadow of Mordor (GamingOnLinux.com article, Reddit post on my getting from 32 to 40 FPS min frame rate).
Describe the solution you'd like Launching a game that is known to benefit from setting the CPU affinity on Ryzen platforms should automatically do so.
Describe alternatives you've considered Manually editing the launch arguments is possible, but
- it's tedious to do for every game
- it's not easily discoverable - I don't necessarily know I should enable it for certain games
- it's not "one size fits all" - different core counts require different arguments (i.e. taskset 0-5 for R5 {1,2}600{,X}, taskset 0-7 for R7 {1,2}700{,X}, etc.)
Additional context No more context, but thanks for working on this and making it open source :-)
This sort of looks like trying to disable threading within a core on Ryzen, or limiting the game to just one CPU package in the dual package design of Ryzen. I was thinking about adding a patch but I don't like the idea of hard coding CPU types to affinity values. There must be some algorithmic approach...
This sort of looks like trying to disable threading within a core on Ryzen, or limiting the game to just one CPU package in the dual package design of Ryzen
Exactly
I don't like the idea of hard coding CPU types to affinity values. There must be some algorithmic approach...
Would a dump of /proc/cpuinfo help? I've attached it to the issue.
From the dump, I don't see how we could detect the number of CPU groups in the package. I think you would need to look somewhere under /sys instead.
EDIT: Please show tree -L 3 /sys/devices/system/cpu/.
Hmm, doesn't look too promising but I'm no expert on this structure... Could you inspect the files in the directories named "topology" and see if there's something promising?
Also, are there Ryzen processors out there without the dual-package design? It could help to compare those...
I fear we need an expert on this or more input.
EDIT: Maybe grep ^ /sys/devices/system/cpu/*/topology/*
Maybe grep ^ /sys/devices/system/cpu/*/topology/*
Also, are there Ryzen processors out there without the dual-package design?
AFAIR the 2200g and 2400g have an integrated GPU so only one "core" should be active.
Meanwhile, you could see if #56 helps (only if you're running a MuQSS patched kernel like CK kernel). It could help the game dominating single CPU cores.
Thanks for the note on #56. I don't have a patched kernel so I can't access it. While we're on the subject of things that might help, might this be achieved by running scripts? I was thinking of something like:
- for all 'start' scripts, pass the PID of the game and the executable name ( as cli args or environment variables )
- the script can then decide whether to call taskset -p ... for that new process based on the executable name
More a bandaid than a proper solution, but can still get the job done.
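A minimal sketch of what such a start script could look like, assuming the PID and executable name arrive as the two command-line arguments (at this point in the thread GameMode passes neither, so this is hypothetical). The table contents, the executable name "ShadowOfMordor", and the helper names are illustrative assumptions, not anything GameMode ships.

```python
#!/usr/bin/env python3
"""Hypothetical start script: pin known games to a fixed CPU set."""
import os
import sys

# Assumed table: executable name -> logical CPUs to allow.
# The entry mirrors the "taskset 0-5" example for an R5; adjust per machine.
AFFINITY_BY_EXECUTABLE = {
    "ShadowOfMordor": {0, 1, 2, 3, 4, 5},
}


def cpus_for(executable, table=AFFINITY_BY_EXECUTABLE):
    """Return the CPU set configured for this executable, or None."""
    return table.get(os.path.basename(executable))


def apply_affinity(pid, executable):
    """Pin pid if its executable is listed; return the applied set or None."""
    cpus = cpus_for(executable)
    if cpus is None:
        return None
    # Only request CPUs the process is currently allowed to use.
    allowed = cpus & os.sched_getaffinity(pid)
    if not allowed:
        return None
    os.sched_setaffinity(pid, allowed)
    return allowed


# Assumed invocation: script.py <pid> <executable>
if __name__ == "__main__" and len(sys.argv) == 3 and sys.argv[1].isdigit():
    apply_affinity(int(sys.argv[1]), sys.argv[2])
```

Like taskset -p without -a, os.sched_setaffinity changes only the task with that ID; threads created afterwards inherit the mask, but threads that already exist keep theirs.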
So according to your initial taskset example, we are looking at changes between cpu5 and cpu6:
/sys/devices/system/cpu/cpu5/topology/core_id:2
/sys/devices/system/cpu/cpu5/topology/core_siblings:0fff
/sys/devices/system/cpu/cpu5/topology/core_siblings_list:0-11
/sys/devices/system/cpu/cpu5/topology/physical_package_id:0
/sys/devices/system/cpu/cpu5/topology/thread_siblings:0030
/sys/devices/system/cpu/cpu5/topology/thread_siblings_list:4-5
/sys/devices/system/cpu/cpu6/topology/core_id:4
/sys/devices/system/cpu/cpu6/topology/core_siblings:0fff
/sys/devices/system/cpu/cpu6/topology/core_siblings_list:0-11
/sys/devices/system/cpu/cpu6/topology/physical_package_id:0
/sys/devices/system/cpu/cpu6/topology/thread_siblings:00c0
/sys/devices/system/cpu/cpu6/topology/thread_siblings_list:6-7
What looks strange (interesting?) is that core_id=3 seems to be missing. But this may only mean that AMD disabled cores that are actually on the die.
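To compare entries like these across all CPUs without reading them by eye, the grep output can be folded into a per-core map. The sketch below assumes input lines in exactly the path:value form shown above; the function names are made up for illustration.

```python
from collections import defaultdict


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.update(range(int(low), int(high or low) + 1))
    return cpus


def group_by_core(dump_lines):
    """Map (physical_package_id, core_id) -> set of logical CPU numbers."""
    fields = defaultdict(dict)
    for line in dump_lines:
        path, _, value = line.strip().partition(":")
        parts = path.split("/")
        # Expect .../cpu/cpuN/topology/<field>
        if len(parts) < 3 or parts[-2] != "topology":
            continue
        cpu = int(parts[-3][len("cpu"):])
        fields[cpu][parts[-1]] = value
    cores = defaultdict(set)
    for cpu, entry in fields.items():
        if "physical_package_id" in entry and "core_id" in entry:
            key = (int(entry["physical_package_id"]), int(entry["core_id"]))
            cores[key].add(cpu)
    return dict(cores)
```

Gaps in the resulting core_id keys (such as the missing 3) then show up directly in the map's keys.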
might this be achieved by running scripts?
Probably yes, but I didn't look into the scripting support of GameMode yet. It's not very well documented apparently. If you can work it out, feel free to submit example documentation. :-)
What looks strange (interesting?) is that core_id=3 seems to be missing. But this may only mean that AMD disabled cores that are actually on the die.
Since the Ryzen design (guessing) is 4 cores per die for all CPUs, I assume that an R5 gets 3 cores per die, with cores 3 and 7 being disabled, so we get cores 0,1,2 for die 1 and 4,5,6 for die 2.
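If that guess holds, the group a CPU belongs to could be derived from its core_id by integer division. This is only a sketch of the guess: the kernel does not promise that core_id encodes the layout this way, and the core_id table below is extrapolated from the two entries in the dump (cpu5 has core_id 2, cpu6 has core_id 4), not read from real hardware.

```python
def group_cpus_by_core_slot(core_id_by_cpu, slots_per_group=4):
    """Group logical CPUs by core_id // slots_per_group (assumed 4 slots per die)."""
    groups = {}
    for cpu in sorted(core_id_by_cpu):
        group = core_id_by_cpu[cpu] // slots_per_group
        groups.setdefault(group, []).append(cpu)
    return groups


# Assumed core_id per logical CPU for a 6-core/12-thread R5 (core_id 3 and 7 unused).
R5_CORE_IDS = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2,
               6: 4, 7: 4, 8: 5, 9: 5, 10: 6, 11: 6}

groups = group_cpus_by_core_slot(R5_CORE_IDS)
```

Under these assumptions the first group is CPUs 0 to 5, which matches the taskset 0-5 value from the original report.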
Probably yes, but I didn't look into the scripting support of GameMode yet. It's not very well documented apparently. If you can work it out, feel free to submit example documentation. :-)
Good point :-) But looking at the code there are no parameters being passed in. I'll look into it, maybe over the weekend.
Maybe open a new issue, point out the code and suggest ways of passing info of your desire to the script.
Since the Ryzen design (guessing) is 4 cores per die for all CPUs, I assume that an R5 gets 3 cores per die, with cores 3 and 7 being disabled, so we get cores 0,1,2 for die 1 and 4,5,6 for die 2.
AFAIR I've read about it which kind of confirms your guessing. According to this, R5 is the "bad CPU" where some dies didn't pass the R7 tests, will be disabled, and then maybe pass the R5 tests. Maybe there are even hacks to re-enable such cores - at your own risk.
Maybe open a new issue, point out the code and suggest ways of passing info of your desire to the script.
Just added #57 as a feature request.
I'm on the way to closing https://github.com/FeralInteractive/gamemode/issues/57 with some script improvements. But I still think this should stay open because it would be a nice addition to the features if GameMode could somehow auto-detect scenarios in which a game would benefit from CPU affinity. One heuristic could be to detect the package set of the CPU, though I don't know how to do that. I lack the hardware to properly test that (my CPU is hyper-threaded, it does not have interlinked dual-package CPU cores).
Games that know that they benefit from sticking certain threads to a single CPU core should do that by themselves. While working on scheduler optimizations in wine, I discovered that many Windows games use Windows API infrastructure to actually do that. So, the general idea is already there. We should not try to handle that case.
In this regard, the problem with such a heuristic approach would be that games locked to one package could no longer benefit from organizing themselves onto different cores if the game knows it could benefit from that. In the end, the game usually knows better.
So a heuristic should also include a default list of games known to benefit from that. And only when such a game is running AND a dual-package CPU was detected, inject the optimization into the running process. The list of such games would probably be finite anyways because modern games should be designed around such modern CPUs.
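The two conditions could be combined roughly as follows. This is a sketch of the rule as described, not GameMode code: the game list, its single entry, and the assumption that CPU groups have already been detected elsewhere (as a list of sets of logical CPUs) are all hypothetical.

```python
# Hypothetical default list of executables known to benefit from pinning.
KNOWN_GAMES = frozenset({"ShadowOfMordor"})


def affinity_for(executable_name, cpu_groups, known_games=KNOWN_GAMES):
    """Return the CPU set to pin to, or None to leave the game alone.

    Pinning happens only when the game is on the list AND more than one
    CPU group was detected; otherwise the game keeps managing its threads.
    """
    if executable_name not in known_games or len(cpu_groups) <= 1:
        return None
    # Pick the largest group; on ties max() returns the first one.
    return set(max(cpu_groups, key=len))
```

Returning None for everything off the list keeps the default behaviour in line with the point above that the game usually knows better.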
Has there been any progress on implementing taskset support? My needs aren't quite the same as OP but this seems relevant.
My experience from playing games in wine is that windows developers are very keen on setting thread affinity. I don't know much about Windows so maybe it's a big win there, but it almost always causes some pretty bad lag spikes for me until I reset the affinity of the process's threads via taskset.
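For reference, resetting the affinity of every thread of a process can also be scripted. The sketch below walks /proc/<pid>/task on Linux and widens each thread to the CPUs the calling script itself may use; the function names and the proc_root parameter are illustrative assumptions.

```python
import os


def thread_ids(pid, proc_root="/proc"):
    """List the thread IDs of a process from <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir))


def reset_affinity(pid, cpus=None):
    """Let every thread of pid run on cpus; return the thread IDs changed."""
    if cpus is None:
        # Default: whatever CPUs the calling script is allowed to use.
        cpus = os.sched_getaffinity(0)
    changed = []
    for tid in thread_ids(pid):
        try:
            os.sched_setaffinity(tid, cpus)
            changed.append(tid)
        except (ProcessLookupError, PermissionError):
            # Thread exited in the meantime, or it is not ours to change.
            continue
    return changed
```

A game that sets its thread affinity again later would undo this, so the reset may need repeating.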
I have been working on some Wine patches to add the ability for games to set affinity by themselves. I'm not sure if the implementation is correct or could help your case. The patches probably need to be rebased (it's planned but I cannot currently do that). If you're interested, you can follow my repo:
https://github.com/kakra/wine-proton
Part of that is this commit: https://github.com/kakra/wine-proton/commit/0784251793ece840629cab11878f110dfe9f00fb
Changelog summaries of kernel 5.3 say that some info from /sys may now actually show the topology of CPUs. Does anyone see a difference from previous results?
^ I've attached the results of running grep ^ /sys/devices/system/cpu/*/topology/* with kernel 5.3.6. I'm running a 3700X.
A kind of hacky way to find which cores are on the same CCX and which are not is to measure core-to-core latency, for example with core-latency. Cores on separate CCXs have significantly higher latency. This might be helpful if resorting to /sys is proving to be difficult.
The latency thing might actually be a quite good heuristic because it would work universally...
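A rough sketch of how such a heuristic could turn measurements into groups, assuming a symmetric core-to-core latency matrix has already been measured by some other tool and that a threshold separating "near" from "far" is known. The matrix values and the threshold below are synthetic, chosen only to illustrate the grouping.

```python
def cluster_by_latency(latency, threshold):
    """Group cores so that pairs with latency <= threshold share a group.

    Grouping is transitive (union-find): if 0 is near 1 and 1 is near 2,
    all three land in the same group.
    """
    count = len(latency)
    parent = list(range(count))

    def find(node):
        # Follow parent links to the group's root, shortening the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for i in range(count):
        for j in range(i + 1, count):
            if latency[i][j] <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for core in range(count):
        groups.setdefault(find(core), []).append(core)
    return sorted(groups.values())


# Synthetic example: cores 0-1 and 2-3 are close, cross pairs are slow.
LATENCY_NS = [
    [0, 40, 110, 110],
    [40, 0, 110, 110],
    [110, 110, 0, 40],
    [110, 110, 40, 0],
]
clusters = cluster_by_latency(LATENCY_NS, threshold=75)
```

On real hardware the threshold would have to be derived from the measurements themselves, for example from the gap between the two latency populations.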