amdgpu-clocks
card0 isn't a valid path, sometimes GPU is mounted as card1
Expected behavior:
GPU clocks are applied to card0 using the settings in /etc/default/amdgpu-custom-states.card0
Actual behavior:
Nov 01 14:34:27 WS.local amdgpu-clocks[13908]: ls: cannot access '/sys/class/drm/card0/device/hwmon': No such file or directory
Nov 01 14:34:27 WS.local amdgpu-clocks[13902]: WARNING: /sys/class/drm/card0/device/pp_od_clk_voltage does not exist, skipping!
The solution could be to use the actual PCI path instead of a dynamic path.
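To illustrate the idea of not relying on the cardX number, here is a minimal sketch (not how amdgpu-clocks itself works) that finds the AMD card by its PCI vendor ID instead. It assumes the usual sysfs layout where /sys/class/drm/cardN/device points at the PCI device and contains a vendor file, and that AMD's vendor ID is 0x1002. The function name find_amd_cards and its optional directory argument are hypothetical, added only so it can be tried against a fake tree.

```shell
#!/bin/sh
# Print the cardN name of every DRM device whose PCI vendor ID is AMD (0x1002).
# $1 is an optional (hypothetical) override of the sysfs DRM directory.
find_amd_cards() {
    drm_root="${1:-/sys/class/drm}"
    for vendor_file in "$drm_root"/card*/device/vendor; do
        # Skips connector entries such as card0-DP-1, and the unexpanded
        # glob pattern when nothing matches at all.
        [ -r "$vendor_file" ] || continue
        if [ "$(cat "$vendor_file")" = "0x1002" ]; then
            card_dir="${vendor_file%/device/vendor}"
            basename "$card_dir"
        fi
    done
}

find_amd_cards
```

On a real system, readlink -f /sys/class/drm/cardN/device should resolve to the PCI device directory, which stays the same across reboots even when the card number changes.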
Hi @Vixtron,
Thanks for reporting, but this particular issue has nothing to do with this project: amdgpu-clocks does not assign cardX numbers by itself, it is the Linux kernel and its driver modules that do that. Speaking of which, what cards do you have in your system, and which kernel and drivers are you using for those cards? Do you perhaps use an external Thunderbolt GPU enclosure, or some kind of notebook with a combination of iGPU & dGPU or similar?
As a potential workaround: verify which of your cards you are changing clocks for, double check that it is the same card that is consistently toggling between card0 and card1, and then just symlink /etc/default/amdgpu-custom-state.card0 to /etc/default/amdgpu-custom-state.card1. That would ensure that the same settings are applied to your card, regardless of whether the kernel sees it as card0 or card1.
I only have 1 dedicated card, an RX580, and I'm using the amdgpu driver. Since I updated to kernel 6.0.5, I noticed after rebooting that my card was mounted as card1 and my clocks were not being applied.
I only have 1 dedicated card - RX580
Your screenshot suggests otherwise. What does ls -alh /sys/class/drm say? And lspci?
and I'm using the amdgpu driver
Yes, of course, but which amdgpu driver? Mainline kernel, distro specific, pro, something else? What does modinfo amdgpu say?
Have you tried the workaround?
lspci output:
I'm running the open source kernel amdgpu driver of course.
/lib/modules/6.0.5-200.fc36.x86_64/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz
Now you can see my GPU is mounted as card0 after I rebooted the PC; next time I reboot it will be card1 for some reason. No, I haven't tried a workaround. Someone told me to try symlinking the card1 to card0 in case it mounts itself wrong, but I don't see that as a good idea. As for the PCI path, I would assume it would be the same issue.
next time I reboot it will be card1 for some reason
When that happens, what do ls -alh /sys/class/drm and lspci say?
someone told me to try symlinking the card1 to card0 in case it mounts itself wrong but I don't see that as a good idea
Someone told you what exactly? What about the potential workaround I told you about? So far it is the only idea that can help your case, please try that.
Today I rebooted and it shows this
I don't think your symlink solution will work, because the symlink will be overridden by the card0 or card1 each time the PC reboots. Maybe if I could symlink the directories card1 -> card0 and card0 -> card1 it would work, but I don't know if that is possible.
I don't think your symlink solution will work, because the symlink will be overridden by the card0 or card1 each time the PC reboots, maybe if I could symlink directories card1 -> card0 and card0 -> card1 it would work and I don't know if that is possible.
If you bother to read it properly, you'll come to understand that I ain't suggesting symlinking any /sys/class/drm directories at all, that wouldn't make any sense...
What I am suggesting is to make a symlink (or just a plain good old copy) of an amdgpu-custom-state file, so that amdgpu-clocks would try to apply identical custom settings to both card0 and card1 every time it runs. Obviously, that would work for just one card, depending on which identifier is currently assigned to the card by the driver (it would just throw an error about the other, missing, card identifier), but it should give you the result you want.
Using a symlink works. I have an AMD card plus the Intel iGPU. The card numbers seem to be random each boot. I went into /etc/default and did ln -s amdgpu-custom-states.card0 amdgpu-custom-states.card1 and the settings get applied no matter which card number gets assigned. The downside is that it will try to apply the settings to the Intel card, fail, and then apply them to the AMD card.
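For anyone who wants to script the workaround, here is a minimal sketch of the same ln -s step with a couple of safety checks. It uses the amdgpu-custom-states.cardN file names from the comment above (adjust them if your install uses a different name). The function name link_custom_states and its directory argument are hypothetical, added so it can be tried on a scratch directory before touching /etc/default.

```shell
#!/bin/sh
# Make the card1 states file a symlink to the card0 one, so the same settings
# are tried for whichever identifier the kernel assigns on this boot.
# $1 is an optional (hypothetical) directory override; defaults to /etc/default.
link_custom_states() {
    states_dir="${1:-/etc/default}"
    source_name="amdgpu-custom-states.card0"
    link_path="$states_dir/amdgpu-custom-states.card1"

    if [ ! -e "$states_dir/$source_name" ]; then
        echo "missing $states_dir/$source_name" >&2
        return 1
    fi
    # Leave an existing card1 file or link alone rather than overwrite it.
    if [ -e "$link_path" ] || [ -L "$link_path" ]; then
        echo "$link_path already exists, leaving it alone" >&2
        return 0
    fi
    # Relative target, same as running ln -s from inside the directory.
    ln -s "$source_name" "$link_path"
}
```

Running link_custom_states as root with no argument would act on /etc/default. As noted above, the settings will still be attempted on the non-AMD card and fail there.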