LACT

Vega20 and newer GPUs support

Open WyekS opened this issue 4 years ago • 25 comments

System:

  • Manjaro Linux
  • Kernel 5.10.7 (Mirrorlist from Arch)
  • Asus Radeon ROG Strix RX5700 XT

Steps:

  • Install this package from the AUR
  • Enable and start the service
  • Run the application

Error trace:

Initializing gtk
Activating
Getting elements
thread 'main' panicked at 'index out of bounds: the len is 2 but the index is 2', daemon/src/gpu_controller.rs:604:28
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

WyekS avatar Jan 13 '21 19:01 WyekS

This should be fixed now; you can update by rebuilding the AUR package.

Can you post the output of `cat /sys/class/drm/card0/device/pp_od_clk_voltage`? It seems like the device file is missing some voltage specifications that are usually there.

ilya-zlobintsev avatar Jan 13 '21 20:01 ilya-zlobintsev

Hi! The output:

OD_SCLK:
0: 800Mhz
1: 2100Mhz
OD_MCLK:
1: 875MHz
OD_VDDC_CURVE:
0: 800MHz 711mV
1: 1450MHz 801mV
2: 2100MHz 1191mV
OD_RANGE:
SCLK:     800Mhz       2150Mhz
MCLK:     625Mhz        950Mhz
VDDC_CURVE_SCLK[0]:     800Mhz       2150Mhz
VDDC_CURVE_VOLT[0]:     750mV        1200mV
VDDC_CURVE_SCLK[1]:     800Mhz       2150Mhz
VDDC_CURVE_VOLT[1]:     750mV        1200mV
VDDC_CURVE_SCLK[2]:     800Mhz       2150Mhz
VDDC_CURVE_VOLT[2]:     750mV        1200mV

WyekS avatar Jan 14 '21 17:01 WyekS
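For readers following along, here is a minimal Python sketch of reading a table like the one posted above while treating the voltage column as optional. It is not LACT's parser (LACT is written in Rust), and it only assumes that the earlier panic came from indexing a voltage field that this format omits on the `OD_SCLK` and `OD_MCLK` rows. The names `parse_od_table`, `ROW`, and `SAMPLE` are hypothetical, and the `OD_RANGE` limits are skipped for brevity.

```python
import re
from typing import Dict, List, Optional, Tuple

# One table row such as "0: 800Mhz" or "2: 2100MHz 1191mV"; the voltage is optional.
ROW = re.compile(r"^(\d+):\s*(\d+)mhz(?:\s+(\d+)mv)?$", re.IGNORECASE)

Row = Tuple[int, int, Optional[int]]


def parse_od_table(text: str) -> Dict[str, List[Row]]:
    """Group rows by section header as (index, MHz, mV or None) tuples."""
    sections: Dict[str, List[Row]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Section headers look like "OD_SCLK:" and never start with a digit.
        if line.endswith(":") and not line[0].isdigit():
            current = line[:-1]
            sections[current] = []
            continue
        match = ROW.match(line)
        if match and current is not None:
            index, mhz, mv = match.groups()
            sections[current].append((int(index), int(mhz), int(mv) if mv else None))
        # Any other line (for example the OD_RANGE limits) is ignored in this sketch.
    return sections


# Excerpt of the output posted in this thread.
SAMPLE = """OD_SCLK:
0: 800Mhz
1: 2100Mhz
OD_MCLK:
1: 875MHz
OD_VDDC_CURVE:
0: 800MHz 711mV
1: 1450MHz 801mV
2: 2100MHz 1191mV
OD_RANGE:
SCLK:     800Mhz       2150Mhz
"""

print(parse_od_table(SAMPLE)["OD_SCLK"])  # [(0, 800, None), (1, 2100, None)]
```

Returning `None` for a missing voltage, instead of indexing a fixed column, lets the same code handle rows with and without a voltage.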

After cloning the repo and installing with deploy.sh, the application runs. So the AUR package is outdated.

WyekS avatar Jan 14 '21 17:01 WyekS

The AUR package builds from git, so it can't be outdated. You may need to tell your AUR helper to rebuild the package, since it doesn't pick up the update automatically; whenever you do build it, it builds the latest version.

ilya-zlobintsev avatar Jan 14 '21 18:01 ilya-zlobintsev

Ah, OK! So the problem was on my side :D (me and my AUR helper). Thank you very much for your time!

WyekS avatar Jan 14 '21 22:01 WyekS

Apparently Vega20 and newer GPUs use a different format for the clocks/voltage file, so you won't be able to change those for now. I'll see what I can do to support it (I don't have such a GPU, so I can't test that it works for sure).

ilya-zlobintsev avatar Jan 15 '21 07:01 ilya-zlobintsev

If you need any information (files, logs...) or need me to try something else, just ask.

WyekS avatar Jan 15 '21 12:01 WyekS

@WyekS is this still relevant? I've pushed what should be working basic support for newer GPUs.

ilya-zlobintsev avatar Feb 27 '21 06:02 ilya-zlobintsev

Yes, it still is. I'm going to test it, and if it works correctly I'll report back.

WyekS avatar Mar 01 '21 11:03 WyekS

It looks like you're running an old version. You should update.

ilya-zlobintsev avatar Mar 04 '21 17:03 ilya-zlobintsev

My fault again. GPU voltage doesn't appear to be modifiable: [screenshot]

WyekS avatar Mar 04 '21 17:03 WyekS

That's currently not supported. I just wanted to know if the clock speeds are getting applied properly.

ilya-zlobintsev avatar Mar 04 '21 19:03 ilya-zlobintsev

It seems to set them correctly. But I have had several crashes when applying frequencies with a game running. The lactd service stopped and I had to relaunch it. I am now launching lact-gui from the terminal to capture more information, but so far it has worked correctly.

WyekS avatar Mar 06 '21 11:03 WyekS

Please post the output of `journalctl -u lactd -e` when the daemon crashes.

ilya-zlobintsev avatar Mar 06 '21 12:03 ilya-zlobintsev

The daemon crashes after a while. Output of `journalctl -u lactd -e`:

Mär 06 21:10:41 nalis-linux systemd[1]: Started AMDGPU Control Daemon.
Mär 06 21:10:42 nalis-linux lact-daemon[21065]: WARNING: radv is not a conformant vulkan implementation, testing use only.
Mär 06 21:20:44 nalis-linux lact-daemon[21065]: thread 'main' panicked at 'Accept failed: Sys(EMFILE)', daemon/src/lib.rs:181:66
Mär 06 21:20:44 nalis-linux lact-daemon[21065]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Mär 06 21:20:44 nalis-linux systemd[1]: lactd.service: Main process exited, code=exited, status=101/n/a
Mär 06 21:20:44 nalis-linux systemd[1]: lactd.service: Failed with result 'exit-code'.

Also, this happens after about 10 minutes if I start the GUI from the terminal and leave it open:

lact-gui
Connection to daemon established
thread '' panicked at 'Socket failed: Sys(EMFILE)', daemon/src/daemon_connection.rs:72:10
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

(lact-gui:21073): Gtk-WARNING **: 20:19:53.297: Could not load a pixbuf from icon theme. This may indicate that pixbuf loaders or the mime database could not be found.
Gtk:ERROR:../gtk/gtk/gtkiconhelper.c:494:ensure_surface_for_gicon: assertion failed (error == NULL): Failed to load /home/nalis/.local/share/icons/WhiteSur-dark/actions/16/image-missing.svg: Error opening file /home/nalis/.local/share/icons/WhiteSur-dark/actions/16/image-missing.svg: Too many open files (g-io-error-quark, 31)
Bail out! Gtk:ERROR:../gtk/gtk/gtkiconhelper.c:494:ensure_surface_for_gicon: assertion failed (error == NULL): Failed to load /home/nalis/.local/share/icons/WhiteSur-dark/actions/16/image-missing.svg: Error opening file /home/nalis/.local/share/icons/WhiteSur-dark/actions/16/image-missing.svg: Too many open files (g-io-error-quark, 31)

NalianNalis avatar Mar 06 '21 20:03 NalianNalis

There was a really nasty bug causing socket connections to not be cleaned up properly. The crashes should be gone now with the latest commit @Llorrin @WyekS

ilya-zlobintsev avatar Mar 07 '21 05:03 ilya-zlobintsev

@ilyazzz

The daemon still crashes, but there are no more GTK errors or GUI crashes. I did the following steps:

Update Package AUR

[nalis@nalis-linux ~]$ sudo systemctl disable --now lactd

Reboot

[nalis@nalis-linux ~]$ sudo systemctl enable --now lactd
Created symlink /etc/systemd/system/graphical.target.wants/lactd.service → /usr/lib/systemd/system/lactd.service.
[nalis@nalis-linux ~]$ sudo systemctl start lactd
[nalis@nalis-linux ~]$ lact-gui
Connection to daemon established

about 10 min later

thread '' panicked at 'connect failed: Sys(ECONNREFUSED)', daemon/src/daemon_connection.rs:80:50
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

[nalis@nalis-linux ~]$ journalctl -u lactd -e
--Boot
Mär 07 07:47:31 nalis-linux systemd[1]: Started AMDGPU Control Daemon.
Mär 07 07:47:31 nalis-linux lact-daemon[2530]: WARNING: radv is not a conformant vulkan implementation, testing use only.
Mär 07 07:57:33 nalis-linux lact-daemon[2530]: thread 'main' panicked at 'Accept failed: Sys(EMFILE)', daemon/src/lib.rs:181:66
Mär 07 07:57:33 nalis-linux lact-daemon[2530]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Mär 07 07:57:33 nalis-linux systemd[1]: lactd.service: Main process exited, code=exited, status=101/n/a
Mär 07 07:57:33 nalis-linux systemd[1]: lactd.service: Failed with result 'exit-code'.

[nalis@nalis-linux ~]$ sudo RUST_LOG=trace lact-daemon
[2021-03-07T07:25:41Z INFO daemon] Loaded config from /etc/lact.json
[2021-03-07T07:25:41Z INFO daemon] Using config Config { gpu_configs: {4097127804: (GpuIdentifier { pci_id: "0000:0c:00.0", card_model: None, gpu_model: Some("Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]"), path: "/sys/class/drm/card0/device" }, GpuConfig { fan_control_enabled: false, fan_curve: {20: 0.0, 40: 0.0, 60: 50.0, 80: 80.0, 100: 100.0}, power_cap: -1, power_profile: Auto, gpu_max_clock: 0, gpu_max_voltage: None, vram_max_clock: 0 }), 3663728266: (GpuIdentifier { pci_id: "0000:0c:00.0", card_model: Some("AMD RX 6800 16GB"), gpu_model: Some("Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]"), path: "/sys/class/drm/card0/device" }, GpuConfig { fan_control_enabled: false, fan_curve: {20: 0.0, 40: 0.0, 60: 50.0, 80: 80.0, 100: 100.0}, power_cap: -1, power_profile: Auto, gpu_max_clock: 0, gpu_max_voltage: None, vram_max_clock: 0 })}, allow_online_update: Some(false), config_path: "/etc/lact.json", group: "wheel" }
[2021-03-07T07:25:41Z INFO daemon] Initializing "/sys/class/drm/card0"
WARNING: radv is not a conformant vulkan implementation, testing use only.
[2021-03-07T07:25:41Z TRACE pciid_parser] Parsing pci.ids
[2021-03-07T07:25:41Z TRACE pciid_parser] Seacrhing vendor 1002
[2021-03-07T07:25:41Z TRACE pciid_parser] Found vendor Advanced Micro Devices, Inc. [AMD/ATI]
[2021-03-07T07:25:41Z TRACE pciid_parser] Searching device 73bf
[2021-03-07T07:25:41Z TRACE pciid_parser] Found device Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
[2021-03-07T07:25:41Z TRACE pciid_parser] Searching subdevice 1002 0e3a
[2021-03-07T07:25:41Z TRACE pciid_parser] Found subvendor Advanced Micro Devices, Inc. [AMD/ATI]
[2021-03-07T07:25:41Z INFO daemon::gpu_controller] Vendor data: VendorData { gpu_vendor: Some("Advanced Micro Devices, Inc. [AMD/ATI]"), gpu_model: Some("Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]"), card_vendor: Some("Advanced Micro Devices, Inc. [AMD/ATI]"), card_model: None }
[2021-03-07T07:25:41Z TRACE daemon::hw_mon] setting power cap to -1000000
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:25:41Z INFO daemon] Searching the config for GPU with identifier GpuIdentifier { pci_id: "0000:0c:00.0", card_model: None, gpu_model: Some("Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]"), path: "/sys/class/drm/card0/device" }
[2021-03-07T07:25:41Z INFO daemon] 2
[2021-03-07T07:25:41Z INFO daemon] Comparing with GpuIdentifier { pci_id: "0000:0c:00.0", card_model: None, gpu_model: Some("Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]"), path: "/sys/class/drm/card0/device" }
[2021-03-07T07:25:41Z TRACE daemon::hw_mon] setting power cap to -1000000
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:25:41Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:25:41Z INFO daemon] already known
[2021-03-07T07:25:41Z INFO daemon::config] saving { "gpu_configs": { "4097127804": [ { "pci_id": "0000:0c:00.0", "card_model": null, "gpu_model": "Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]", "path": "/sys/class/drm/card0/device" }, { "fan_control_enabled": false, "fan_curve": { "20": 0.0, "40": 0.0, "60": 50.0, "80": 80.0, "100": 100.0 }, "power_cap": -1, "power_profile": "Auto", "gpu_max_clock": 0, "gpu_max_voltage": null, "vram_max_clock": 0 } ], "3663728266": [ { "pci_id": "0000:0c:00.0", "card_model": "AMD RX 6800 16GB", "gpu_model": "Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]", "path": "/sys/class/drm/card0/device" }, { "fan_control_enabled": false, "fan_curve": { "20": 0.0, "40": 0.0, "60": 50.0, "80": 80.0, "100": 100.0 }, "power_cap": -1, "power_profile": "Auto", "gpu_max_clock": 0, "gpu_max_voltage": null, "vram_max_clock": 0 } ] }, "allow_online_update": false, "config_path": "/etc/lact.json", "group": "wheel" }
[2021-03-07T07:26:01Z TRACE daemon] Reading buffer
[2021-03-07T07:26:01Z TRACE daemon] Attempting to deserialize [0, 0, 0, 0]
[2021-03-07T07:26:01Z TRACE daemon] Executing action CheckAlive
[2021-03-07T07:26:01Z TRACE daemon] Responding, buffer length 8
[2021-03-07T07:26:01Z TRACE daemon] Finished responding
[2021-03-07T07:26:01Z TRACE daemon] Reading buffer
[2021-03-07T07:26:01Z TRACE daemon] Attempting to deserialize [1, 0, 0, 0]
[2021-03-07T07:26:01Z TRACE daemon] Executing action GetConfig
[2021-03-07T07:26:01Z TRACE daemon] Responding, buffer length 535
[2021-03-07T07:26:01Z TRACE daemon] Finished responding
[2021-03-07T07:26:01Z TRACE daemon] Reading buffer
[2021-03-07T07:26:01Z TRACE daemon] Attempting to deserialize [3, 0, 0, 0]
[2021-03-07T07:26:01Z TRACE daemon] Executing action GetGpus
[2021-03-07T07:26:01Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:26:01Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:26:01Z TRACE daemon] Responding, buffer length 71
[2021-03-07T07:26:01Z TRACE daemon] Finished responding
[2021-03-07T07:26:01Z TRACE daemon] Reading buffer
[2021-03-07T07:26:01Z TRACE daemon] Attempting to deserialize [4, 0, 0, 0, 124, 53, 53, 244]
[2021-03-07T07:26:01Z TRACE daemon] Executing action GetInfo(4097127804)
[2021-03-07T07:26:01Z TRACE daemon::gpu_controller] Reading clocks table
[2021-03-07T07:26:01Z TRACE daemon::gpu_controller] Parsing line
[2021-03-07T07:26:01Z TRACE daemon] Responding, buffer length 2645
[2021-03-07T07:26:01Z TRACE daemon] Finished responding

. . . .

[2021-03-07T07:34:28Z TRACE daemon] Finished responding
[2021-03-07T07:34:28Z TRACE daemon] Reading buffer
[2021-03-07T07:34:28Z TRACE daemon] Attempting to deserialize [5, 0, 0, 0, 124, 53, 53, 244]
[2021-03-07T07:34:28Z TRACE daemon] Executing action GetStats(4097127804)
[2021-03-07T07:34:28Z TRACE daemon] Responding, buffer length 20
[2021-03-07T07:34:28Z TRACE daemon] Finished responding
thread 'main' panicked at 'Accept failed: Sys(EMFILE)', daemon/src/lib.rs:181:66
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
[nalis@nalis-linux ~]$

NalianNalis avatar Mar 07 '21 07:03 NalianNalis
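The request buffers in the trace log above can be decoded by hand. As a hedged reading (inferred only from the logged bytes, not from LACT's source), each request looks like a little-endian 32-bit action tag, optionally followed by a little-endian 32-bit GPU id. The Python sketch below illustrates that reading; `ACTIONS` and `decode_request` are hypothetical names, and the tag table lists only values that actually appear in the log.

```python
import struct
from typing import Optional, Tuple

# Tag values observed in the trace log; tag 2 never appears there, so it is omitted.
ACTIONS = {0: "CheckAlive", 1: "GetConfig", 3: "GetGpus", 4: "GetInfo", 5: "GetStats"}


def decode_request(buffer: bytes) -> Tuple[str, Optional[int]]:
    """Decode a logged request: u32 LE action tag, then an optional u32 LE GPU id."""
    (tag,) = struct.unpack_from("<I", buffer, 0)
    name = ACTIONS.get(tag, f"Unknown({tag})")
    gpu_id = None
    if len(buffer) >= 8:
        (gpu_id,) = struct.unpack_from("<I", buffer, 4)
    return name, gpu_id


print(decode_request(bytes([4, 0, 0, 0, 124, 53, 53, 244])))  # ('GetInfo', 4097127804)
print(decode_request(bytes([0, 0, 0, 0])))  # ('CheckAlive', None)
```

The decoded id 4097127804 matches the `GetInfo(4097127804)` line in the log and the first key under `gpu_configs`, which is what makes this reading plausible.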

@Llorrin my bad, apparently the issue was still present in the daemon. Hopefully it's fixed for good now. Also, you don't need to reboot after updating; restarting the service is enough.

ilya-zlobintsev avatar Mar 07 '21 07:03 ilya-zlobintsev

The reason why it crashes after some time: Each time the app gets something from the daemon (e.g. updating displayed stats) it connects to the daemon. The issue was that the old connections weren't getting removed properly, and it took about 10 minutes to fill the system's file descriptor limit. Now the connections (hopefully) get cleaned up properly, so this won't happen.

ilya-zlobintsev avatar Mar 07 '21 08:03 ilya-zlobintsev
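The failure mode described above can be illustrated with a toy model in Python. This is only a sketch of the idea, not the daemon's code: `ToyDaemon` and `requests_until_failure` are hypothetical names, and the limit of 1024 is an assumed value, since the thread states neither the real descriptor limit nor the GUI's polling rate.

```python
import errno


class ToyDaemon:
    """Toy model of a daemon with a fixed file descriptor budget."""

    def __init__(self, fd_limit: int) -> None:
        self.fd_limit = fd_limit
        self.open_connections = 0

    def accept(self) -> None:
        # Like accept(2): once the budget is used up, the call fails with EMFILE.
        if self.open_connections >= self.fd_limit:
            raise OSError(errno.EMFILE, "Too many open files")
        self.open_connections += 1

    def close(self) -> None:
        self.open_connections -= 1


def requests_until_failure(daemon: ToyDaemon, requests: int, close_after_reply: bool) -> int:
    """Serve up to `requests` one-shot connections and return how many succeeded."""
    served = 0
    for _ in range(requests):
        try:
            daemon.accept()
        except OSError as err:
            if err.errno == errno.EMFILE:
                break
            raise
        served += 1
        if close_after_reply:
            daemon.close()  # the fix: release the descriptor once the reply is sent
    return served


# Leaky variant: every request keeps its descriptor, so the budget runs out.
print(requests_until_failure(ToyDaemon(fd_limit=1024), 5000, close_after_reply=False))  # 1024
# Fixed variant: descriptors are released, so all requests are served.
print(requests_until_failure(ToyDaemon(fd_limit=1024), 5000, close_after_reply=True))  # 5000
```

With one new connection per stats refresh, the time to failure is roughly the descriptor limit divided by the refresh rate, which fits a crash that shows up only after several minutes.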

@ilyazzz so far it looks good. I started it from the terminal and it is still running with no issues, 20 minutes so far; the old version was crashing after about 10 minutes. I also have a request: is it possible to read the junction temperature?

NalianNalis avatar Mar 07 '21 08:03 NalianNalis

@Llorrin the thermals page should now show all of the temperature sensors.

ilya-zlobintsev avatar Mar 07 '21 12:03 ilya-zlobintsev
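For context, the Linux hwmon interface exposes each temperature sensor as a `tempN_input` file (in millidegrees Celsius) with an optional `tempN_label` file; amdgpu uses labels such as edge, junction, and mem. The Python sketch below lists every sensor in a hwmon directory. It is an illustration under those assumptions rather than LACT's implementation, and `read_temperatures` is a hypothetical name.

```python
from pathlib import Path
from typing import List, Tuple


def read_temperatures(hwmon_dir: Path) -> List[Tuple[str, float]]:
    """Return (label, degrees Celsius) for every temp*_input file, ordered by sensor number."""
    readings = []
    for input_file in hwmon_dir.glob("temp*_input"):
        prefix = input_file.name[: -len("_input")]  # e.g. "temp2"
        number = int(prefix[len("temp"):])
        label_file = hwmon_dir / f"{prefix}_label"
        # Fall back to the file prefix when the driver provides no label.
        label = label_file.read_text().strip() if label_file.exists() else prefix
        millidegrees = int(input_file.read_text().strip())
        readings.append((number, label, millidegrees / 1000.0))
    readings.sort()  # sort by sensor number so the display order stays stable
    return [(label, celsius) for _, label, celsius in readings]


# Example call; the hwmon index below is an assumption and varies between systems:
# read_temperatures(Path("/sys/class/drm/card0/device/hwmon/hwmon0"))
```

Sorting by sensor number keeps the list in the same order on every refresh.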

@ilyazzz there is a small "cosmetic bug": the order switches constantly, displaying first the junction and then the edge temperature. It would be better if they were displayed one below the other instead of next to each other.

Also, for some odd reason I now have a current fan speed reading. I didn't have that before; it was showing 0.

[screenshot: Screenshot_20210307_140710]

NalianNalis avatar Mar 07 '21 13:03 NalianNalis

@Llorrin fixed.

ilya-zlobintsev avatar Mar 07 '21 17:03 ilya-zlobintsev

@ilyazzz

Is this normal behavior?

After a reboot the service is not running, and I need to do the following steps. Is there a conflict?

[nalis@nalis-linux ~]$ lact-gui
Connection to daemon established
thread 'main' panicked at 'called Option::unwrap() on a None value', gui/src/app/header.rs:55:60
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
[nalis@nalis-linux ~]$ sudo systemctl restart lactd
[sudo] password for nalis:
[nalis@nalis-linux ~]$ journalctl -u lactd -e

-- Boot 668ba34d10af4a2cb3b7d8f7c95baedb --
Mär 08 19:23:30 nalis-linux systemd[1]: Started AMDGPU Control Daemon.
Mär 08 19:24:33 nalis-linux systemd[1]: Stopping AMDGPU Control Daemon...
Mär 08 19:24:33 nalis-linux systemd[1]: lactd.service: Succeeded.
Mär 08 19:24:33 nalis-linux systemd[1]: Stopped AMDGPU Control Daemon.
Mär 08 19:24:33 nalis-linux systemd[1]: Started AMDGPU Control Daemon.
Mär 08 19:24:33 nalis-linux lact-daemon[2272]: WARNING: radv is not a conformant vulkan implementation, testing use only.

[nalis@nalis-linux ~]$ RUST_BACKTRACE=1 lact-gui
Connection to daemon established
thread 'main' panicked at 'called Option::unwrap() on a None value', gui/src/app/header.rs:55:60
stack backtrace:
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
[nalis@nalis-linux ~]$ RUST_BACKTRACE=full lact-gui
Connection to daemon established
thread 'main' panicked at 'called Option::unwrap() on a None value', gui/src/app/header.rs:55:60
stack backtrace:
0: 0x561d8f89f7c0 -
1: 0x561d8f8be25c -
2: 0x561d8f899595 -
3: 0x561d8f8a1995 -
4: 0x561d8f8a14ea -
5: 0x561d8f8a2131 -
6: 0x561d8f8a1c47 -
7: 0x561d8f89fc7c -
8: 0x561d8f8a1bd9 -
9: 0x561d8f8bc8e1 -
10: 0x561d8f8bc82d -
11: 0x561d8f492bf0 -
12: 0x7f20aa3e49ca - g_signal_emit_valist
13: 0x7f20aa3e4b40 - g_signal_emit
14: 0x7f20aa968627 -
15: 0x7f20aa9689e2 - gtk_combo_box_set_active
16: 0x561d8f49402b -
17: 0x561d8f49d67b -
18: 0x561d8f499a28 -
19: 0x561d8f4984e3 -
20: 0x561d8f498549 -
21: 0x561d8f8a2647 -
22: 0x561d8f499bf2 -
23: 0x7f20a9f46b25 - __libc_start_main
24: 0x561d8f49105e -
25: 0x0 -
[nalis@nalis-linux ~]$

Also, the fan reading is gone again.

[screenshot: Screenshot_20210308_192147]

NalianNalis avatar Mar 08 '21 18:03 NalianNalis

Hello. After the previous fix, Lact worked correctly, as I said. Arch is not my main system for playing games, so I only tested it in a couple of short sessions, with no errors using the configuration I set up in Lact. The latest fix also shows the temperatures correctly.

My main goal was to set up a passive configuration on the graphics card, and I have succeeded. The issue is solved for me. Thanks for this application, @ilyazzz.

PS: I miss being able to set the voltage, and a visible version number (or tags) in Lact.

WyekS avatar Mar 22 '21 17:03 WyekS

The latest version now supports editing the voltage.

ilya-zlobintsev avatar Feb 25 '23 13:02 ilya-zlobintsev