X3D CPU performance optimization

Open Baughn opened this issue 2 years ago • 6 comments

Feature Request

I confirm:

  • [X] that I haven't found another request for this feature.
  • [X] that I have checked whether there are updates for my system available that contain this feature already.

Description

On an X3D CPU, WINE_CPU_TOPOLOGY should be set (unless it is already set) to limit the visible CPU cores to only those with V-cache available.

Here's an example Python script that would do it:

import subprocess
import xml.etree.ElementTree as ET
from collections import defaultdict

def run_lstopo():
    try:
        lstopo_output = subprocess.check_output(['lstopo', '--of', 'xml'], text=True)
        return ET.fromstring(lstopo_output)
    except Exception as e:
        print(f"An error occurred while running lstopo: {e}")
        return None

def parse_lstopo_xml_to_dict(root):
    core_to_cache = defaultdict(int)
    
    for l3cache in root.findall(".//object[@type='L3Cache']"):
        cache_size = int(l3cache.get("cache_size")) // (1024 * 1024)  # Converting to MB
        for pu in l3cache.findall(".//object[@type='PU']"):
            core_id = int(pu.get("os_index"))
            core_to_cache[core_id] = cache_size
            
    return core_to_cache

def filter_cores_by_max_cache(core_to_cache):
    max_cache = max(core_to_cache.values())
    return [core for core, cache in core_to_cache.items() if cache == max_cache]

if __name__ == "__main__":
    root = run_lstopo()
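    # Compare against None explicitly: truth-testing an Element checks whether it has children and is deprecated
    if root is not None: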
        core_to_cache_dict = parse_lstopo_xml_to_dict(root)
        cores_with_max_cache = filter_cores_by_max_cache(core_to_cache_dict)
        
        # Sorting the core IDs
        sorted_cores_with_max_cache = sorted(cores_with_max_cache)
        
        # Generating the WINE_CPU_TOPOLOGY setting
        num_cores = len(sorted_cores_with_max_cache)
        core_ids_str = ",".join(map(str, sorted_cores_with_max_cache))
        wine_cpu_topology = f"WINE_CPU_TOPOLOGY=\"{num_cores}:{core_ids_str}\""
        
        print(f"Core to Cache mapping: {core_to_cache_dict}")
        print(f"Logical cores with max cache: {cores_with_max_cache}")
        print(f"Generated WINE_CPU_TOPOLOGY setting: {wine_cpu_topology}")
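
The script above depends on hwloc's lstopo being installed. As a rough alternative sketch, the same per-core L3 sizes can probably be read straight from sysfs. This assumes a Linux kernel that exposes /sys/devices/system/cpu/cpuN/cache/indexM/level and .../size, with sizes written like "98304K". The function names read_l3_sizes and cores_with_largest_l3 are made up for illustration and are not part of Proton or Wine.

```python
import re
from pathlib import Path


def read_l3_sizes(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Map logical CPU id -> L3 cache size in KiB, read from sysfs."""
    sizes = {}
    # "cpu[0-9]*" matches cpu0, cpu12, ... but not cpufreq or cpuidle
    for cpu_dir in Path(sysfs_cpu_root).glob("cpu[0-9]*"):
        cpu_id = int(cpu_dir.name[3:])
        for index_dir in cpu_dir.glob("cache/index*"):
            try:
                level = int((index_dir / "level").read_text().strip())
                raw_size = (index_dir / "size").read_text().strip()
            except (OSError, ValueError):
                continue  # Skip cache entries that are missing or unreadable
            match = re.fullmatch(r"(\d+)([KM])", raw_size)
            if level == 3 and match:
                factor = 1024 if match.group(2) == "M" else 1
                sizes[cpu_id] = int(match.group(1)) * factor
    return sizes


def cores_with_largest_l3(sizes):
    """Return the sorted logical CPU ids that share the largest L3 size."""
    if not sizes:
        return []
    largest = max(sizes.values())
    return sorted(cpu for cpu, size in sizes.items() if size == largest)
```

On a CPU where every core reports the same L3 size, this returns all cores, so the resulting setting would change nothing.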

Justification [optional]

On an X3D CPU, for the vast majority of games, forcing the game to run only on the V-cache cores improves performance significantly. In some cases (e.g., Stationeers) this can be a 100% FPS improvement.

On Windows, AMD's game mode driver ensures this by shutting off (!) half the CPU when a Steam game is run on a CPU with V-cache available on some but not all cores (e.g., the 7950X3D).

On Linux this can be done through taskset or by taking the cores offline through /sys/devices/system/cpu, but I got the best results by using WINE_CPU_TOPOLOGY to limit the apparent hardware thread count and core affinity to only the V-cache cores. This lets other processes keep running, and lets worker thread tuning match the resources that are actually available.
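
To illustrate why affinity alone is not enough, here is a small sketch using Python's own process as a stand-in for a game. It pins the process the way taskset would, then compares the number of CPUs it may actually run on with the number that os.cpu_count() reports. This is only an analogy, assuming Linux; it does not show how Wine or any particular game counts cores, and compare_affinity_and_count is a name invented for this example.

```python
import os


def compare_affinity_and_count(allowed_cpus):
    """Pin this process to allowed_cpus and return (usable, reported) CPU counts.

    usable   = CPUs the scheduler will actually run this process on
    reported = CPUs the OS reports as online, which ignores affinity
    """
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, allowed_cpus)
        usable = len(os.sched_getaffinity(0))
        reported = os.cpu_count()
    finally:
        # Restore the original affinity so the caller is unaffected
        os.sched_setaffinity(0, original)
    return usable, reported
```

A program that sizes its worker pool from the reported count would oversubscribe the pinned cores, which is the mismatch that WINE_CPU_TOPOLOGY avoids by changing the apparent count as well.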

Risks [optional]

There's probably one or two games somewhere in the library that do better with all 16 cores available.

References [optional]

Appendix

FPS, measured in a mature Stationeers base under maximally CPU-hungry conditions.

Using taskset --cpu-list:

  • Default affinity (0-31): 31 FPS. Well, actually 35 for the first two seconds; thermal throttling is a big issue with this CPU.
  • 0-7 (first CCD only): 41 FPS
  • 0-7,16-23: 39 FPS (! disabling hyperthread siblings helps, but leaving them available might scale better as base grows)
  • 0-15: 30 FPS, with intermittent drops down to 23 (once per second or so) (This disables hyperthreading but uses both CCDs)
  • 15-23: 38 FPS, unsurprisingly
  • 24-31: 34 FPS? Interesting. This is the non-vcache CCD. Must be lots of cache coherency traffic.
  • 8-15: 35 FPS. Fair enough.

Using WINE_CPU_TOPOLOGY:

  • 0-7,16-23: 50 FPS minimum, spiking up to 60.
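
The results above use taskset-style ranges such as 0-7,16-23, while the script in the description writes WINE_CPU_TOPOLOGY as a count followed by every core ID. A minimal sketch of the conversion, assuming only plain IDs and simple ranges (taskset's stride syntax is not handled), with hypothetical helper names:

```python
def expand_cpu_list(cpu_list):
    """Expand a list like "0-7,16-23" into a sorted list of CPU ids."""
    cpus = set()
    for part in cpu_list.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))  # Ranges are inclusive
        else:
            cpus.add(int(part))
    return sorted(cpus)


def to_wine_cpu_topology(cpu_list):
    """Format a CPU list as "<count>:<id>,<id>,...", matching the script's output."""
    cpus = expand_cpu_list(cpu_list)
    return f"{len(cpus)}:{','.join(map(str, cpus))}"
```

For example, to_wine_cpu_topology("0-7,16-23") yields "16:0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23".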

Baughn, Oct 06 '23 17:10

Question, since you seem very knowledgeable about this and I plan to purchase one of these CPUs myself: how does cutting off access to the other cores improve things? Shouldn't it use the V-cache cores plus the rest by default? Or is this just preventing it from randomly choosing the non-cached cores?

Bitwolfies, Oct 06 '23 21:10

It's preventing it from randomly choosing the non-vcache cores, but also telling Proton that it should only use the vcache cores.

You could do the first half of that with taskset, but then Proton (and by extension the game) would still believe that all 16 cores are available, even though they're not. Using WINE_CPU_TOPOLOGY kills two geese with the same carrot.

Baughn, Oct 06 '23 22:10

Gotcha, sounds like a great change if implemented, much better than the Windows implementation of just killing half the cores outright through magical game detection. How does one go about using your script with Proton? Or is this an actual patch to the WINE_CPU_TOPOLOGY variable?

Bitwolfies, Oct 06 '23 23:10

I don't know enough about Proton to say; that's why this is a feature request without a PR.

You can run the script as-is, and it'll output a WINE_CPU_TOPOLOGY line. You can then get that into the environment of Steam, by whatever means, and it'll affect every game you launch. I'm on NixOS, so I just set it in configuration.nix.

Baughn, Oct 07 '23 03:10

Ah, I get the purpose of the script now. Nicely written; I'll keep it around for when I get my CPU. Thank you.

Bitwolfies, Oct 07 '23 03:10

Hi, I switched to Linux (CachyOS) a week ago. I'm light-years away from writing code like this myself ;). I'm still sitting here asking myself how to apply that code safely and correctly.

Sila-Secla, Jun 21 '24 21:06

All you need to do is run the provided script, then get the printed value into Steam's environment somehow. The best way to do so will depend on your distribution, and I've never used yours.

Baughn, Jul 23 '24 23:07
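
One distribution-independent way to get the value into a game's environment might be a small wrapper that sets the variable only when it is absent and then starts the real command. The sketch below is an assumption-laden illustration, not a tested Proton integration: with_default_topology and launch are hypothetical names, and the topology string is just the 0-7,16-23 example from the appendix written out in the script's format.

```python
import os

# Example value only: the 0-7,16-23 cores from the appendix, in the script's format.
# Replace it with the line printed by the script for your own CPU.
EXAMPLE_TOPOLOGY = "16:0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23"


def with_default_topology(env, topology):
    """Return a copy of env with WINE_CPU_TOPOLOGY set only if it is not already set."""
    merged = dict(env)
    merged.setdefault("WINE_CPU_TOPOLOGY", topology)
    return merged


def launch(command, topology=EXAMPLE_TOPOLOGY):
    """Replace the current process with command, run under the merged environment."""
    env = with_default_topology(os.environ, topology)
    os.execvpe(command[0], command, env)
```

A wrapper script calling launch(sys.argv[1:]) could then be placed in front of %command% in a game's Steam launch options. As far as I know, the launch options also accept the variable directly in the form WINE_CPU_TOPOLOGY="..." %command%, which avoids the wrapper entirely.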