Felix Kuehling

90 comments by Felix Kuehling

DKMS does exactly what you need. It compiles the kernel module from source on your system against your kernel headers. You can modify the source code in /usr/src/amdgpu/... and rebuild...

Hi Andreas, it looks like you're trying to make our DKMS/release branch work with the latest upstream kernels. That's not really the purpose of that branch. The purpose is to...

I would recommend a reboot. Removing the module on a running system is tricky. You'd need to kill any Xserver or compositor running on your system and disconnect the framebuffer...
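To see whether anything still holds the module before attempting to unload it, one option is to read the module's use count from sysfs. The following is a minimal sketch, not part of the original comment: the helper name `module_usage` is hypothetical, and it assumes the standard Linux layout where `/sys/module/<name>/refcnt` holds the use count and `/sys/module/<name>/holders/` lists dependent modules. The `refcnt` file is absent when the module is not loaded, is built into the kernel, or the kernel was built without module unloading support.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def module_usage(name: str, sysfs_root: str = "/sys/module") -> Optional[Tuple[int, List[str]]]:
    """Return (use count, names of holder modules) for a loaded module.

    Returns None when no refcnt file exists for the module.
    """
    module_dir = Path(sysfs_root) / name
    refcnt_file = module_dir / "refcnt"
    if not refcnt_file.is_file():
        return None
    refcnt = int(refcnt_file.read_text().strip())
    holders_dir = module_dir / "holders"
    # Holder entries are other kernel modules that depend on this one.
    holders = sorted(p.name for p in holders_dir.iterdir()) if holders_dir.is_dir() else []
    return refcnt, holders


if __name__ == "__main__":
    usage = module_usage("amdgpu")
    if usage is None:
        print("amdgpu: no refcnt file (not loaded, built in, or unloading unsupported)")
    else:
        count, holders = usage
        print(f"amdgpu: use count {count}, holders: {holders or 'none'}")
```

A non-zero use count with no holder modules usually points at userspace or in-kernel users such as a display server or the framebuffer console, which is why a reboot is the simpler route.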

What you're asking for is not end-user documentation. More like architectural or developer on-boarding documentation. Unfortunately there isn't much public information of that kind, other than comments in the source...

This was also just found and fixed internally. I don't think the fix made it into ROCm 4.5, which was just released last night. It should make it into the...

The AER recover message looks like a general PCI device enumeration problem. Maybe your PSU or the power circuitry on your motherboard isn't able to supply three cards with enough...

BAR resizing can fail due to lack of PCI resources available to the bridge that the GPU is connected to. But AFAIK the driver should then continue with the original...

This shows that the VRAM and doorbell BARs don't have valid addresses on GPUs 25:00.0 and 2a:00.0. All their BARs are disabled. The next question is whether this is caused...
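A similar check can be scripted against the `resource` file that Linux exposes under `/sys/bus/pci/devices/<address>/`. Each line of that file holds three hexadecimal fields (start, end, flags), and the first six lines correspond to the standard BAR slots. The sketch below is an illustration, not taken from the thread: the helper names are hypothetical, and an all-zero line only means the slot is empty. An empty slot can be a BAR that was never assigned an address, a BAR the device does not implement, or the upper half of a 64-bit BAR, so the result should be compared against `lspci -v` output for the same device.

```python
import sys
from typing import List, NamedTuple


class PciResource(NamedTuple):
    index: int
    start: int
    end: int
    flags: int

    @property
    def empty(self) -> bool:
        # An all-zero entry carries no address range and no flags.
        return self.start == 0 and self.end == 0 and self.flags == 0


def parse_pci_resources(text: str) -> List[PciResource]:
    """Parse the contents of a sysfs PCI 'resource' file."""
    resources = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        start, end, flags = (int(field, 16) for field in fields)
        resources.append(PciResource(index, start, end, flags))
    return resources


def empty_bar_slots(text: str, bar_count: int = 6) -> List[int]:
    """Return the indices of standard BAR slots that have no address range."""
    return [r.index for r in parse_pci_resources(text) if r.index < bar_count and r.empty]


if __name__ == "__main__":
    # Usage (path is supplied by the caller): python check_bars.py /sys/bus/pci/devices/<address>/resource
    with open(sys.argv[1]) as handle:
        print("empty BAR slots:", empty_bar_slots(handle.read()))
```

If every one of the six slots comes back empty for a GPU, that matches the "all BARs disabled" situation described above.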

As far as I can gather from this thread, there is nothing ROCm-specific about the problem. The BARs are disabled even without the ROCm kernel driver loaded. I looked up...

Looks like powerplay is not enabled on your GPU. Can you post a full dmesg log?