dxorg.cpp:getValueFromSysFsFile() truncates file content containing \0 chars – leading to segfault in dXorg::parseOcTable() with vega20

Open emerge-e-world opened this issue 3 years ago • 6 comments

After a recent system upgrade radeon-profile segfaults on startup. I don't know if the output of amdgpu in sysfs changed or if this is caused by a change of behavior of QString after a Qt5 upgrade.

I think I tracked down the cause of the issue though:

getValueFromSysFsFile() in dxorg.cpp truncates sysfs file content after the first \0 character. This happens when the QByteArray returned by QFile::readAll() is converted to a QString. As it so happens, my pp_od_clk_voltage separates the different sections of the table with a bunch of null characters. In my case (with a Radeon VII / vega20) this leads to dXorg::parseOcTable() only reading the first section (OD_SCLK). getValueFromSysFsFile() returns:

OD_SCLK:
0:        808Mhz
1:       1801Mhz

while the actual content of /sys/class/drm/card0/device/pp_od_clk_voltage is:

OD_SCLK:
0:        808Mhz
1:       1801Mhz
OD_MCLK:
1:       1000Mhz
OD_VDDC_CURVE:
0:        808Mhz        727mV
1:       1304Mhz        798mV
2:       1801Mhz       1076mV
OD_RANGE:
SCLK:     808Mhz       2200Mhz
MCLK:     800Mhz       1200Mhz
VDDC_CURVE_SCLK[0]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[0]:     738mV        1218mV
VDDC_CURVE_SCLK[1]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[1]:     738mV        1218mV
VDDC_CURVE_SCLK[2]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[2]:     738mV        1218mV

This in turn causes the bool variable vega20Mode in dXorg::parseOcTable() to be set to false, as no OD_VDDC_CURVE section is present. Inside the main for loop of that function the wrong branch is then taken, resulting in an out-of-range access when trying to read the third column of the table (while OD_SCLK only contains two) in line 831: fvt.insert(state[0].toUInt(), FreqVoltPair(state[1].toUInt(), state[2].toUInt())); Reading state[2] is what triggers the segfault.
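The failure mode described above can be guarded against independently of the null-character fix. The following is only a minimal sketch in plain C++ (no Qt), not code from radeon-profile: the names FreqVolt, splitColumns and parseFreqVoltRow are hypothetical and stand in for the row handling inside dXorg::parseOcTable(). It checks the column count before touching the third column, so a two-column OD_SCLK row is rejected instead of being read out of range.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the frequency/voltage pair stored per state.
struct FreqVolt {
    unsigned freqMhz;
    unsigned voltMv;
};

// Split one table row on whitespace, e.g. "0:   808Mhz   727mV".
std::vector<std::string> splitColumns(const std::string &line) {
    std::istringstream in(line);
    std::vector<std::string> cols;
    for (std::string tok; in >> tok;)
        cols.push_back(tok);
    return cols;
}

// Returns false (leaving 'out' untouched) when the row has fewer than
// three columns or a column does not start with digits, instead of
// indexing past the end of the column list.
bool parseFreqVoltRow(const std::string &line, FreqVolt &out) {
    const std::vector<std::string> cols = splitColumns(line);
    if (cols.size() < 3)
        return false;
    try {
        // std::stoul parses the leading digits and ignores the unit suffix.
        const unsigned freq = static_cast<unsigned>(std::stoul(cols[1]));
        const unsigned volt = static_cast<unsigned>(std::stoul(cols[2]));
        out = FreqVolt{freq, volt};
        return true;
    } catch (const std::exception &) {
        return false;
    }
}
```

With such a check, a row like "0:        808Mhz" would simply be skipped, while "0:        808Mhz        727mV" yields 808 and 727. Whether skipping or reporting an error is the better reaction in the real parser is a design decision this sketch does not settle.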

The fix for me was to handle null characters in sysfs files in getValueFromSysFsFile() by removing all '\0' chars before converting the QByteArray returned from QFile::readAll() to a QString. Here is a simple patch:

--- a/radeon-profile/dxorg.cpp
+++ b/radeon-profile/dxorg.cpp
@@ -61,7 +61,7 @@ QString getValueFromSysFsFile(QString fileName) {
     QString value("-1");
 
     if (f.open(QIODevice::ReadOnly))
-        value = QString(f.readAll()).trimmed();
+        value = QString(f.readAll().replace('\0',"")).trimmed();
 
     f.close();
     return value;
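To illustrate why the one-line change helps, here is a small sketch that uses std::string as a stand-in for QByteArray and QString, so it runs without Qt. This substitution is an assumption made for illustration only: truncateAtFirstNul and stripNuls are hypothetical helper names, and constructing a std::string from a C string is used to mimic a conversion that stops at the first '\0'.

```cpp
#include <algorithm>
#include <string>

// Mimics a conversion that treats the buffer as a C string:
// everything after the first '\0' byte is silently dropped.
std::string truncateAtFirstNul(const std::string &raw) {
    return std::string(raw.c_str());
}

// Mimics the patched behaviour: remove every '\0' byte and keep
// all remaining content in its original order.
std::string stripNuls(const std::string &raw) {
    std::string out = raw;
    out.erase(std::remove(out.begin(), out.end(), '\0'), out.end());
    return out;
}
```

For a buffer such as "OD_SCLK:\n" followed by a few null bytes and then "OD_MCLK:\n", the first helper returns only the OD_SCLK part, while the second returns both sections back to back, which matches the behaviour reported before and after the patch.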

emerge-e-world avatar Nov 24 '21 20:11 emerge-e-world

I was having an identical issue and this fixed it completely! Absolutely brilliant! I know the devs aren't really maintaining this much anymore but maybe try making a PR with this?

whit-colm avatar Nov 27 '21 02:11 whit-colm

Glad that it was of help to you as well! I created a pull request for the fix.

emerge-e-world avatar Dec 02 '21 17:12 emerge-e-world

OK, how do we fix this on our systems? How can we edit dxorg.cpp? I can't find it; where is it located? I understand the fix, but can't find dxorg.cpp.

I found it: I git cloned the repository, edited the dxorg.cpp file, and used qmake and make to build it. I built the daemon as well. And I have the same problem as before: Radeon-Profile is not reading my manual clocks...

dadaas avatar Dec 13 '21 08:12 dadaas

Glad you found everything. You can now also clone directly from my forked repository (git clone https://github.com/emerge-e-world/radeon-profile), so you don't have to patch the file manually. I am working on a couple more bug fixes / improvements in there. What exactly do you mean by not reading your manual clocks? Is it still crashing on startup while trying? Or are the clock speeds just wrong?

emerge-e-world avatar Dec 13 '21 22:12 emerge-e-world

THANK YOU !!! Works with my 6800 XT

However, the overclock tab seems to have issues: running as a normal user I have states, but running as the root user it does not collect any states.

And when trying to change the memory speed, it adds an insane voltage and does not accept any voltage inputs.

[Screenshots from 2022-09-28: 09-35-00, 09-33-58, 09-33-34]

MasterCATZ avatar Sep 27 '22 23:09 MasterCATZ

I merged the patches from emerge-e-world in my own fork and I'm building toward a new release from there. If you want to test Radeon Profile and Radeon Profile daemon (you need both up to date), have a go over here:

Radeon Profile: https://github.com/Oxalin/radeon-profile
Radeon Profile daemon: https://github.com/Oxalin/radeon-profile-daemon

That being said, your problem with your 6800XT with the overclock tab is probably not fixed yet and I'd like to work with you to fix it, since I don't have access to your card (it seems related to NAVI 2X and above).

Oxalin avatar Jan 27 '23 23:01 Oxalin