drm-kmod

Update to linux 6.10

Open dumbbell opened this issue 9 months ago • 22 comments

This is the backport of the DRM drivers from Linux 6.10.

Progress:

Changes in Linux 6.10

You can read this Phoronix article to learn about the changes in the DRM drivers in Linux 6.10: https://www.phoronix.com/news/Linux-6.10-DRM-Graphics

Patches to linuxkpi

This update depends on the following patches to linuxkpi in FreeBSD.

These patches are maintained in the following repository and branch: https://github.com/dumbbell/freebsd-src/tree/drm-related-linuxkpi-changes

Patches were submitted for review:

  • [x] https://reviews.freebsd.org/D54487
  • [ ] https://reviews.freebsd.org/D54488
  • [x] https://reviews.freebsd.org/D54489
  • [x] https://reviews.freebsd.org/D54490
  • [x] https://reviews.freebsd.org/D54491
  • [x] https://reviews.freebsd.org/D54492
  • [x] https://reviews.freebsd.org/D54493
  • [x] https://reviews.freebsd.org/D54494
  • [x] https://reviews.freebsd.org/D54495
  • [x] https://reviews.freebsd.org/D54496
  • [x] https://reviews.freebsd.org/D54497
  • [x] https://reviews.freebsd.org/D54498
  • [x] https://reviews.freebsd.org/D54499
  • [x] https://reviews.freebsd.org/D54500
  • [x] https://reviews.freebsd.org/D54501
  • [x] https://reviews.freebsd.org/D54502
  • [ ] https://reviews.freebsd.org/D54503

Firmware updates

There is no associated firmware update for now (to be checked).

How to test

You need to run a recent FreeBSD 15-CURRENT to test it.

Here are some instructions:

  1. You need to check out the FreeBSD src branch I mentioned, drm-related-linuxkpi-changes, and compile a kernel from that branch:

    git clone -b drm-related-linuxkpi-changes https://github.com/dumbbell/freebsd-src.git
    cd freebsd-src
    make -j8 buildkernel DEBUG_FLAGS=-g
    
    # This installs the kernel under another name, `kernel.drm`. Thus, you keep the default kernel
    # in case of trouble.
    sudo make installkernel DEBUG_FLAGS=-g INSTKERNNAME=kernel.drm
    
  2. You need to check out the branch referenced in this pull request and compile it:

    git clone -b update-to-linux-6.10 https://github.com/dumbbell/drm-kmod.git
    cd drm-kmod
    make -j8 DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys
    sudo make install DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys KMODDIR=/boot/kernel.drm
    
  3. Load the relevant driver(s) as you usually do.
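
Before rebooting, it may help to confirm that step 2 actually placed the modules next to the test kernel. This is a minimal sketch, assuming the modules of interest are drm, amdgpu and i915kms (adjust the list to your hardware). `missing_kmods` is a hypothetical helper name, not part of drm-kmod:

```shell
#!/bin/sh
# missing_kmods DIR MODULE...
# Hypothetical helper: print each MODULE whose DIR/MODULE.ko is absent.
# The exit status is non-zero if anything is missing.
missing_kmods() {
    dir=$1
    shift
    rc=0
    for mod in "$@"; do
        if [ ! -f "$dir/$mod.ko" ]; then
            echo "$mod"
            rc=1
        fi
    done
    return $rc
}

# Example (not run here): check the directory passed as KMODDIR in step 2.
#   missing_kmods /boot/kernel.drm drm amdgpu i915kms
```

If nothing is reported missing, `nextboot -k kernel.drm` followed by a reboot should boot the test kernel once while leaving the default kernel in place.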

dumbbell avatar Aug 09 '25 14:08 dumbbell

The port is complete and all linuxkpi patches were pushed to my work freebsd-src branch. This is ready for testing :-)

It has been working fine with amdgpu for me for the past two days. The regression with vt(4) went away, so it was in the driver itself, not in our integration code.

I got a panic with i915 (memory used after free), but couldn’t reproduce it so far. I will use i915 for a few days at work to see if I can get it again.

dumbbell avatar Sep 17 '25 21:09 dumbbell

And I got a panic almost right away with i915 in the radix-tree code I modified for this update. I will look at it tonight hopefully.

dumbbell avatar Sep 18 '25 07:09 dumbbell

I just pushed a fix to one of the radix-tree changes that fixes a panic with the i915 driver. I couldn’t use it today because of this. Hopefully, I will run it the whole day tomorrow.

dumbbell avatar Sep 18 '25 16:09 dumbbell

Good work @dumbbell

One question: do 6.9 and 6.10 support Intel Arc A-Series dGPUs? Or will this take more time and testing?

yukiteruamano avatar Sep 19 '25 02:09 yukiteruamano

I don't know about Intel Arc, I never studied what they need. I heard people tried a few versions ago but it didn't work for them.

You could try again with the i915 driver if you have this card :-) We don't have a port of the xe driver at this point.

dumbbell avatar Sep 19 '25 07:09 dumbbell

I don't know about Intel Arc, I never studied what they need. I heard people tried a few versions ago but it didn't work for them.

i915 has supported this card on Linux since 6.1. 6.2 is the recommended kernel for daily use.

I tested the 6.1, 6.6 and 6.8 kmods on FreeBSD and they don't work.

You could try again with the i915 driver if you have this card :-) We don't have a port of the xe driver at this point.

Xe is not necessary for Intel Arc A or B series; this support is marked Experimental on Linux.

yukiteruamano avatar Sep 21 '25 20:09 yukiteruamano

I don't know about Intel Arc, I never studied what they need. I heard people tried a few versions ago but it didn't work for them.

You could try again with the i915 driver if you have this card :-) We don't have a port of the xe driver at this point.

According to @wulf7 in #315 i915 in drm-kmod does not support Arc discrete graphics because it requires porting of "MEI" and "PXP" drivers. I'm not sure what PXP is, but it certainly looks like there's plenty of PXP code in this repo. As for MEI, this seems to be Management Engine Interface. It seems strange to me that this should be necessary to support an Arc GPU. Because afaik Arc should work fine in Linux with an AMD chipset/CPU, no?

As for testing, I've tested at least up to 6.8 with my A770, with no success. OTOH, on Linux I believe experimental support was present in 5.8 and much better support in 6.4, so in theory, it should just work, but no one has been able to get it to work.

I'd love some more detailed information on what is missing, if @wulf7 or anyone else knows. I can't make any promises, but I'd like to at least try to push this through, but I don't know where to start; or even how to figure out where to start.

As for Xe, is there any interest in starting to port Xe at some point? Xe first appeared in Linux 6.8, and it'll probably have to be ported at some point, so the sooner we start, the less work it would be, I imagine. Is anyone aware of whether Xe has the same blockers that i915 does, for Arc GPUs?

mtlll avatar Oct 01 '25 08:10 mtlll

According to @wulf7 in #315 i915 in drm-kmod does not support Arc discrete graphics because it requires porting of "MEI" and "PXP" drivers.

This is for supporting GPU scheduling and offloading for video encode. These options can be disabled using enable_guc=0, and the GPU works with this config.

Because afaik Arc should work fine in Linux with an AMD chipset/CPU, no?

Exactly. Right now I'm using a Ryzen CPU with an Intel Arc, and it works without MEI.

I'm not sure what PXP is, but it certainly looks like there's plenty of PXP code in this repo.

The PXP driver is built into the i915 and Xe modules.

As for testing, I've tested at least up to 6.8 with my A770, with no success

Same here. 6.8 doesn't work: the firmware is loaded, but I get a kernel panic related to memory faults when loading the i915 DRM driver on FreeBSD-CURRENT.

Interestingly, I'm having a similar problem on OpenBSD. Intel Arc support is disabled there, but with a slight kernel modification, it loads and detects the card, but doesn't init the GPU and only works with the scfb driver.

Starting the i915 driver requires another modification to the kernel, and then the same problem appears: it recognizes the card, initializes it, and then hits a kernel panic related to memory management.

I highlight this because it is striking that both systems initializing the card have the same final behavior: kernel panics related to memory management problems.

yukiteruamano avatar Oct 01 '25 21:10 yukiteruamano

Same here. 6.8 doesn't work: the firmware is loaded, but I get a kernel panic related to memory faults when loading the i915 DRM driver on FreeBSD-CURRENT.

You may try applying both the https://github.com/freebsd/drm-kmod/issues/315#issuecomment-2480526639 and https://github.com/freebsd/drm-kmod/issues/315#issuecomment-2480947595 patches to fix the memory fault. That may allow it to go slightly further. In https://github.com/freebsd/drm-kmod/issues/315, the patches resulted in a GPU hang instead of a kernel crash.

wulf7 avatar Oct 05 '25 12:10 wulf7

I'm testing this PR on a Radeon RX 580 right now. Not sure if this has been reported yet, but calling poweroff does not actually fully turn off the system and keeps it running. The uptime message does show up as the last console log before the monitors turn off, ~but the system keeps running afterwards.~

There is a much simpler repro with just kldload amdgpu, then kldunload amdgpu and observe that no output shows on the monitor. I am not able to ping the system afterwards so I assume that it panics.
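
The load/unload repro above can be scripted so that a hang still leaves a trace on disk. This is a minimal sketch, assuming a writable /var/tmp and that syncing before each step is enough for the last log line to survive a hard hang (this is not guaranteed). `step` and `repro_unload_hang` are hypothetical helper names:

```shell
#!/bin/sh
# step CMD ARGS...
# Hypothetical helper: append the command to a log file, flush it to disk,
# then run the command. With DRY_RUN=1 the command is only logged.
step() {
    echo "$(date '+%H:%M:%S') $*" >> "${REPRO_LOG:-/var/tmp/amdgpu-repro.log}"
    sync
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Load amdgpu, wait a few seconds, then unload it again.
repro_unload_hang() {
    step kldload amdgpu &&
    step sleep 5 &&
    step kldunload amdgpu
}
```

Run `repro_unload_hang` as root. After a forced reboot, the last line of the log file shows which step had been reached.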

svmhdvn avatar Oct 18 '25 03:10 svmhdvn

Hello! What is the status on this, are there some blockers or regressions that need addressing? (I read the comments but I couldn't figure it out.)

alice-sowerby avatar Nov 25 '25 15:11 alice-sowerby

The 6.10 update is basically ready.

I don’t think the problems with i915 are related to this update. Some of them have existed for quite some time. I was able to make progress on the color corruption and the poweroff issue, but I haven’t identified the actual culprit yet.

As for the issue with amdgpu reported by svmhdvn, I haven’t reproduced it yet.

I’m struggling with non-FreeBSD stuff, which prevents me from making fast progress currently.

dumbbell avatar Nov 26 '25 19:11 dumbbell

@svmhdvn is your issue a regression (or this is the first time you've tested)?

emaste avatar Dec 05 '25 16:12 emaste

That was the first time I tested it. I'll test again as soon as I get physical access to the test target. Can anyone else reproduce it?

svmhdvn avatar Dec 06 '25 03:12 svmhdvn

https://github.com/freebsd/drm-kmod/pull/371#issuecomment-3417771306

… calling poweroff does not actually fully turn off the system …

From https://www.reddit.com/r/freebsd/comments/1p6rak8/comment/nqte36o/:

Try this, before the next poweroff:

sysctl hw.efi.poweroff=0

grahamperrin avatar Dec 07 '25 00:12 grahamperrin

I don’t think the problems with i915 are related to this update. Some of them have existed for quite some time. I was able to make progress on the color corruption and the poweroff issue, but I haven’t identified the actual culprit yet.

@dumbbell

It's not exactly the case. I have tested 6.6, 6.7, 6.8, 6.9 and 6.10 on two laptops:

  1. Intel Meteor Lake (so called Ultra Series 1) - PCI ID: 7D55.
  2. Intel Arrow Lake (so called Ultra Series 2) - PCI ID: 7D67.

For Meteor Lake, the results are (it was supported in 6.6, but behind the "force_probe" argument):

  • all drm kmods load the firmware.
  • 6.6 kldloads fine but screen is garbled in console, Mesa gives an error "MESA: cannot get intel_device_info", x11 can be started but no acceleration due to previous Mesa error.
  • 6.7, 6.8, 6.9 and 6.10 - consistently give me panic on kldload or shortly after or trying to start x11 or sway.

For Arrow Lake, the results are:

  • 6.7, 6.8, 6.9 and 6.10 - consistently give me panic on kldload or shortly after or trying to start x11 or sway.

So, at least for Meteor Lake, there was no panic in 6.6. But it panics consistently on 6.7, 6.8, 6.9 and 6.10.

I thought the table below from Intel could be useful to devs, since it sums up which GPUs were declared stable on which kernel version. It also shows which of them are out of support and no longer being worked on. The first column is handy: it is the PCI ID.

https://dgpu-docs.intel.com/devices/hardware-table.html
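
When reporting results for a specific GPU, the PCI ID can be read from pciconf(8) instead of being looked up by marketing name. This is a minimal sketch, assuming the `vendor=0x... device=0x...` field layout printed by `pciconf -l` on recent FreeBSD and that the GPU attaches as vgapci. `gpu_pci_ids` is a hypothetical helper name:

```shell
#!/bin/sh
# gpu_pci_ids
# Hypothetical helper: read `pciconf -l` output on stdin and print
# "vendor:device" (hex, without the 0x prefix) for every vgapci device.
gpu_pci_ids() {
    sed -n 's/^vgapci[0-9]*@.* vendor=0x\([0-9a-f]*\) device=0x\([0-9a-f]*\).*/\1:\2/p'
}

# Example (not run here):
#   pciconf -l | gpu_pci_ids
```

The device half of each printed pair can then be compared with the PCI ID column of the Intel table.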

as400l avatar Dec 11 '25 11:12 as400l

Been running this with amdgpu on a Framework 13 AMD without obvious problems so far.

christosmarg avatar Dec 11 '25 12:12 christosmarg

After a week of testing, I've only managed to hit this error, which killed Xorg, luckily no panic. Haven't managed to track this down yet though, and I'm not entirely sure if it's related to these changes here:

[drm ERROR :amdgpu_job_timedout] ring gfx_0.0.0 timeout, signaled seq=541767, emitted seq=541769
[drm ERROR :amdgpu_job_timedout] Process information: process  pid 117246 thread  pid 117246
drmn0: GPU reset begin!
drmn0: MES failed to respond to msg=REMOVE_QUEUE
[drm ERROR :amdgpu_mes_unmap_legacy_queue] failed to unmap legacy queue
[drm ERROR :gfx_v11_0_cp_gfx_enable] failed to halt cp gfx
drmn0: MODE2 reset
drmn0: GPU reset succeeded, trying to resume
[drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[drm] VRAM is lost due to GPU reset!
drmn0: SMU is resuming...
drmn0: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x08004800
[drm] kiq ring mec 3 pipe 1 q 0
drmn0: [drm] jpeg_v4_0_hw_initdrmn0: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
drmn0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
drmn0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
drmn0: ring comp_1.2.0 uses VM inv eng 6 on hub 0
drmn0: ring comp_1.3.0 uses VM inv eng 7 on hub 0
drmn0: ring comp_1.0.1 uses VM inv eng 8 on hub 0
drmn0: ring comp_1.1.1 uses VM inv eng 9 on hub 0
drmn0: ring comp_1.2.1 uses VM inv eng 10 on hub 0
drmn0: ring comp_1.3.1 uses VM inv eng 11 on hub 0
drmn0: ring sdma0 uses VM inv eng 12 on hub 0
drmn0: ring vcn_unified_0 uses VM inv eng 0 on hub 8
drmn0: ring jpeg_dec uses VM inv eng 1 on hub 8
drmn0: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
drmn0: recover vram bo from shadow start
drmn0: recover vram bo from shadow done
drmn0: GPU reset(2) succeeded!
[drm] *ERROR*
pid 4591 (Xorg), jid 0, uid 0: exited on signal 6 (no core dump - sugid process denied by kern.sugid_coredump)
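
The first line of the log carries the key numbers: the last fence sequence the ring signaled and the last one it emitted. This is a minimal sketch for pulling them out of saved dmesg output, assuming the message format shown above. `summarize_ring_timeouts` is a hypothetical helper, and reading the difference as the number of jobs still pending on the ring is an interpretation, not something the driver prints:

```shell
#!/bin/sh
# summarize_ring_timeouts
# Hypothetical helper: read kernel log text on stdin and, for each
# amdgpu_job_timedout "ring ... timeout" line, print the ring name, both
# sequence numbers, and their difference.
summarize_ring_timeouts() {
    sed -n 's/.*amdgpu_job_timedout] ring \([^ ]*\) timeout, signaled seq=\([0-9]*\), emitted seq=\([0-9]*\).*/\1 \2 \3/p' |
    while read -r ring signaled emitted; do
        echo "ring=$ring signaled=$signaled emitted=$emitted pending=$((emitted - signaled))"
    done
}

# Example (not run here):
#   dmesg | summarize_ring_timeouts
```

For the timeout line above, this reports ring gfx_0.0.0 with pending=2.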

christosmarg avatar Dec 18 '25 15:12 christosmarg

Hey folks, since there are no linuxkpi patches listed as being needed, can freebsd-src:main be the base used for testing this MR (instead of drm-related-linuxkpi-changes from https://github.com/dumbbell/freebsd-src.git)? I also suspect this MR needs to be rebased to pick up recent changes in drm-kmod:master.

benjsc avatar Jan 02 '26 23:01 benjsc

Impressive, thank you.

With 15-CURRENT at the outset here: will base stable/15 (e.g. 1500505 or greater) be in scope when 6.10 reaches the ports collection?

Or will 16.0-CURRENT be required?

(I can hide or delete this comment after an update in the Foundation area.)

grahamperrin avatar Jan 04 '26 16:01 grahamperrin

linuxkpi patches will be ported to stable/15 after they are merged into main. The goal is to support the latest version in FreeBSD 15.1-RELEASE.

dumbbell avatar Jan 04 '26 16:01 dumbbell

Hey folks, since there are no linuxkpi patches listed as being needed, can freebsd-src:main be the base used for testing this MR (instead of drm-related-linuxkpi-changes from https://github.com/dumbbell/freebsd-src.git)? I also suspect this MR needs to be rebased to pick up recent changes in drm-kmod:master.

I just submitted patches to linuxkpi for review. They are now listed in the pull request description.

For now, you have to use the drm-related-linuxkpi-changes branch.

dumbbell avatar Jan 04 '26 16:01 dumbbell

Unlike what GitHub says, this pull request was merged as is.

dumbbell avatar Jan 28 '26 22:01 dumbbell