
Use generic modesetting driver instead of i915/i965 as default

Open donob4n opened this issue 6 years ago • 22 comments

Qubes OS version:

R4.1

Affected component(s):

Xorg, fixes graphical artifacts with KDE (and other desktops)


Steps to reproduce the behavior:

From https://groups.google.com/d/msgid/qubes-users/5cc49553-b12c-4e4d-7601-f961330a14e6%40gmail.com

I had some graphical artifacts with the current stable kernel, but when testing the latest version they became more problematic (it cannot draw the main menu, the task bar does not refresh properly when switching virtual desktops, and some app windows stop redrawing after some time).

I noticed that the modesetting driver also fixed some ghost clicks passing from one window to another (especially if they are full screen).

Also, I get better system tray icons than with the other driver, but I use "border1" mode. I tested the default mode and it was pretty bad, although I am not sure whether it was worse or better than with the i915 driver.

General notes:

According to https://www.phoronix.com/scan.php?page=news_item&px=Fedora-Xorg-Intel-DDX-Switch this seems to be the current default driver in Fedora, so Qubes dom0 should probably adopt the same decision.


Related issues:

https://github.com/QubesOS/qubes-issues/issues/3267

donob4n avatar Jan 31 '19 14:01 donob4n

This should resolve itself automatically once we update dom0 in R4.1.

marmarek avatar Jan 31 '19 22:01 marmarek

On my X1 carbon 3rd generation, running R4.1, I was getting graphical artifacts in gnome-terminal and firefox when scrolling. Switching back to the "intel" xorg driver from the modesetting driver seems to fix this problem. Thank you to drpfef on IRC for suggesting this.

That is, I added the following to /etc/X11/xorg.conf.d/20-intel.conf:

Section "Device"
  Identifier "Intel Graphics"
  Driver "intel"
EndSection

I think that perhaps something like this could be added to https://github.com/Qubes-Community/Contents/blob/master/docs/troubleshooting/intel-igfx-troubleshooting.md for R4.1 users who are having trouble with the new modesetting driver? I didn't submit a pull request for the docs myself because so far this is just my one case.
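For anyone comparing the two drivers, here is a minimal sketch (not from this thread) of a way to confirm which driver Xorg actually ended up using after a change like the one above. It assumes the usual Xorg log convention in which driver messages are tagged `modeset(0):` for modesetting and `intel(0):` for the intel driver; the function name and the log location mentioned in the comments are illustrative assumptions.

```python
import re
from collections import Counter

# Assumption: Xorg tags per-screen driver messages as "<driver>(<screen>):",
# e.g. "(II) modeset(0): ..." for modesetting and "(II) intel(0): ..." for
# the intel DDX driver.
DRIVER_TAG = re.compile(r"\((?:II|WW|EE|==|--|\*\*)\)\s+(modeset|intel)\(\d+\):")


def active_ddx_driver(log_text):
    """Return 'modesetting', 'intel', or None, judged by Xorg log line tags."""
    counts = Counter(m.group(1) for m in DRIVER_TAG.finditer(log_text))
    if not counts:
        return None
    # The driver that produced the most tagged lines is taken as the active one.
    tag = counts.most_common(1)[0][0]
    return "modesetting" if tag == "modeset" else "intel"


# Example use (the log path is an assumption and may differ on your system):
#   with open("/var/log/Xorg.0.log", errors="replace") as fh:
#       print(active_ddx_driver(fh.read()))
```

This only inspects log text, so it can also be run on a log copied off the affected machine.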

dmoerner avatar Nov 22 '21 21:11 dmoerner

Just wanted to add that I also see this graphics corruption with the modesetting driver on a T460 (HD Graphics 520), although for some reason it's much rarer.

dmoerner avatar Jan 10 '22 21:01 dmoerner

Thanks @dmoerner! Switching from modesetting to intel fixed the problem for me. Without this, I was seeing artifacts on Qubes 4.1 on my T430i (Intel HD 4000):

[Image: animated preview of artifacts]

I also get glitches at the boot password prompt. I tried fiddling with i915.mitigations=off, but that makes no difference.

AlxHnr avatar Mar 10 '22 20:03 AlxHnr

As an additional data point - I also observe something very similar (streak-like artifacts on changed/highlighted portions of the screen, only in the GUI part - e.g. boot trace and Ctrl+Alt+F2 console work fine) on my Intel ADL-H-based laptop. Both the default and the explicit "modesetting" driver exhibit that, and only the "intel" one fixes the problem (none provide acceleration though). That's on kernel-latest from the stable repo (5.16.13 as of right now), but with the relatively old Mesa packages that dom0 provides.

alt3r-3go avatar Apr 10 '22 17:04 alt3r-3go

> As an additional data point - I also observe something very similar (streak-like artifacts on changed/highlighted portions of the screen, only in the GUI part - e.g. boot trace and Ctrl+Alt+F2 console work fine) on my Intel ADL-H-based laptop. Both the default and the explicit "modesetting" driver exhibit that, and only the "intel" one fixes the problem (none provide acceleration though). That's on kernel-latest from the stable repo (5.16.13 as of right now), but with the relatively old Mesa packages that dom0 provides.

@marmarek is there any chance we can provide a newer Mesa?

DemiMarie avatar Apr 11 '22 02:04 DemiMarie

> @marmarek is there any chance we can provide a newer Mesa?

If proven to solve the issue first, then maybe. But that's quite a few packages to repackage/rebuild, and I'm not going to do it "just in case". sys-gui-gpu may be helpful with testing various versions. Anyway, since the issue applies to relatively old hardware too (especially - way older than fc32 we have in dom0), it's unlikely the upgrade would help.

marmarek avatar Apr 11 '22 03:04 marmarek

Yeah, I'm not sure Mesa is the culprit here, TBH. One thing I forgot to mention is that booting the same dom0 directly (i.e. commenting out Xen and changing module directives to linux and initrd respectively in the Grub menu) yields no artifacts - so it looks like that added variable of Xen changes something, though I don't see anything significant in the log diff.

There's still no HW acceleration when booted directly though - and that's where newer Mesa would probably help, but that's orthogonal to the original issue reported in this thread and I just mentioned that for completeness.

alt3r-3go avatar Apr 17 '22 11:04 alt3r-3go

> Yeah, I'm not sure Mesa is the culprit here, TBH. One thing I forgot to mention is that booting the same dom0 directly (i.e. commenting out Xen and changing module directives to linux and initrd respectively in the Grub menu) yields no artifacts - so it looks like that added variable of Xen changes something, though I don't see anything significant in the log diff.

That is interesting. I wonder if disabling the IOMMU for the i915 integrated GPU would help. @marmarek is it safe to do this, on the assumption that the iGPU is trusted?

DemiMarie avatar Apr 17 '22 12:04 DemiMarie

> I also get glitches at the boot password prompt. I tried fiddling with i915.mitigations=off, but that makes no difference.

I have this on one device too. There, starting just Linux (without Xen) helps(*), but iommu=no-igfx does not.

(*) there are no glitches, but when the prompt appears, I need to press ESC twice to make it update after key presses - otherwise no dot appears when entering the passphrase. This could be an issue totally unrelated to the graphics driver.

marmarek avatar Apr 19 '22 10:04 marmarek

> I also get glitches at the boot password prompt.

In my instance, https://github.com/torvalds/linux/commit/bdd8b6c98239cad fixes the issue. Unfortunately, I've seen regressions elsewhere caused by this commit (https://github.com/QubesOS/qubes-issues/issues/7479).

marmarek avatar Jul 08 '22 13:07 marmarek

@alt3r-3go @AlxHnr can you test any of the 5.18.x kernel-latest packages? They include the commit mentioned above, and also a follow-up fix for it.

marmarek avatar Jul 08 '22 13:07 marmarek

> @alt3r-3go @AlxHnr can you test any of the 5.18.x kernel-latest packages? They include the commit mentioned above, and also a follow-up fix for it.

Does that follow-up fix fix #7479?

DemiMarie avatar Jul 09 '22 02:07 DemiMarie

> Does that follow-up fix fix #7479?

Yes. But when both are applied on top of 5.15.52, it brings back glitches on plymouth (on this specific hw). Which is kind of expected, as the follow-up fix undoes https://github.com/torvalds/linux/commit/bdd8b6c98239cad from the i915 driver's point of view... There is probably some other relevant commit somewhere there, but I'd like to know how it looks for others.

marmarek avatar Jul 09 '22 03:07 marmarek

Maybe there need to be some hardware-specific quirks?

DemiMarie avatar Jul 09 '22 03:07 DemiMarie

> @alt3r-3go @AlxHnr can you test any of the 5.18.x kernel-latest packages? They include the commit mentioned above, and also a follow-up fix for it.

No, I don't want to. I stopped using graphical plymouth some months ago.

AlxHnr avatar Jul 09 '22 12:07 AlxHnr

I'm installing those right now. FWIW I've been running 5.18.3 for a while and if that one includes the change in question, it didn't help, the artifacts were still there. I see the latest is 5.18.9 as of now, we'll see.

alt3r-3go avatar Jul 09 '22 13:07 alt3r-3go

Tested on 5.18.9 - the artifacts are still there. It's a bit different and seems slightly better (faster redraw after the artifacts make it unreadable), but only ever so slightly: the UI is hardly usable, especially the console, whose text gets swallowed by the artifacts and does not display for a good several seconds until it redraws.

alt3r-3go avatar Jul 09 '22 13:07 alt3r-3go

Can you try adding the nopat option to the dom0 Linux kernel command line?

marmarek avatar Jul 09 '22 13:07 marmarek

Oh, that does work! Apologies if you wanted me to test it with the option from the start :) There's still no acceleration reported by glxinfo, FWIW, but the artifacts are gone and for all intents and purposes it looks exactly like before, when that intel driver was enabled. And yes, I've checked it uses the modesetting driver now. I haven't looked in detail, so it's amusing such a small change in the way they detect options in the kernel triggers such an effect.

And to make it explicit - #7479 does not reproduce for me on this kernel (I've actually never seen that, but I skipped 5.17.x kernels).

alt3r-3go avatar Jul 09 '22 15:07 alt3r-3go

Ok, nopat does more or less what the commit mentioned above does. So we have (at least) two types of hardware:

  • where nopat (or equivalent commit) fixes glitches
  • where nopat (or equivalent commit) causes #7479

I don't think nopat is a real solution, I think it's rather a workaround that disables something that is broken. But at least we confirmed it is the same issue (or at least very closely related) that I can reproduce locally.
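When comparing results across machines, it helps to confirm that the option really reached the kernel. A minimal sketch, assuming that /proc/cmdline in dom0 reflects the dom0 Linux kernel's command line (not Xen's); the helper names are illustrative and not part of any Qubes tooling:

```python
def has_kernel_option(cmdline, option):
    """True if `option` appears as a bare flag or as `option=value`."""
    for token in cmdline.split():
        if token == option or token.startswith(option + "="):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (the path is an assumption)."""
    with open(path) as fh:
        return fh.read()


# Example use in dom0:
#   print(has_kernel_option(read_cmdline(), "nopat"))
```

Matching on whole tokens avoids false positives from options that merely start with the same letters.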

marmarek avatar Jul 09 '22 15:07 marmarek

@marmarek BTW, just to make sure (and this is probably going to be useful for others facing this) - is the nopat option good for a daily driver/production machine, or is using the intel Xorg driver a better choice?

I'm not familiar with that part of the kernel. Based on its description, it looks like disabling PAT should not impact anything (security and performance being my top priorities), but I'm not sure.

alt3r-3go avatar Jul 23 '22 14:07 alt3r-3go

Just for the record, this solved the issue on an Alder Lake notebook: https://github.com/QubesOS/qubes-issues/issues/7507#issuecomment-1153081021

aslfv avatar Aug 27 '22 21:08 aslfv

@marmarek, could you please comment on my above question about nopat vs the intel driver? I've been running with the former since then, but on the current 6.0.2 kernel (kernel-latest package from the "stable" repo) that option causes a reboot loop for me (roughly - after a message that VT-d is being activated for gfx). And the artifacts are still there if I use the default driver without the option. I therefore wonder whether I should go back to the intel driver or try to troubleshoot the boot loop.

alt3r-3go avatar Nov 10 '22 06:11 alt3r-3go

@DemiMarie, do you have any ideas w.r.t. the above, by any chance (looks like @marmarek is currently too busy or can't comment)?

alt3r-3go avatar Nov 21 '22 14:11 alt3r-3go

nopat might make things a bit slower (not sure if noticeable in practice). But it should not cause a reboot. Can you collect a few more details about the issue? Maybe add the noreboot option to Xen and see if you can see the actual crash message?
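For reference, a minimal sketch of adding such options programmatically. It assumes a GRUB-based boot where /etc/default/grub holds the dom0 kernel options in GRUB_CMDLINE_LINUX and the Xen options in GRUB_CMDLINE_XEN_DEFAULT; the function name is illustrative. The function only transforms text, so writing the file back and regenerating the GRUB configuration (with grub2-mkconfig; the output path differs between legacy and EFI boot) remain manual steps.

```python
import re


def add_grub_option(grub_text, variable, option):
    """Append `option` to the first double-quoted assignment of `variable`.

    Returns the modified text; the text is unchanged if the option is
    already present. Raises ValueError if the variable is not found.
    """
    pattern = re.compile(
        r'^(%s=")([^"]*)(")[ \t]*$' % re.escape(variable), re.MULTILINE
    )

    def repl(match):
        words = match.group(2).split()
        if option not in words:
            words.append(option)
        return match.group(1) + " ".join(words) + match.group(3)

    # count=1: the same variable may be assigned more than once in the file;
    # only the first assignment is touched.
    new_text, count = pattern.subn(repl, grub_text, count=1)
    if count == 0:
        raise ValueError("variable not found: " + variable)
    return new_text


# Example use (review the result before writing it back):
#   text = open("/etc/default/grub").read()
#   text = add_grub_option(text, "GRUB_CMDLINE_XEN_DEFAULT", "noreboot")
#   text = add_grub_option(text, "GRUB_CMDLINE_LINUX", "nopat")
```

Keeping the edit idempotent means it can be re-run safely while switching options on and off during testing.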

marmarek avatar Nov 21 '22 14:11 marmarek

Thanks and yes, let me dig into this.

In the meantime I've tried running with the intel driver, and that caused a GUI freeze (no reaction to mouse/kbd, no window/desktop refresh, but nothing in the logs and non-GUI processes seemingly running fine), so I reverted to the previously used 5.18 in dom0 (but 6.0.2 in VMs), due to lack of time for proper debugging.

Now that I have your comment, I will check it out (probably will check out the latest 6.0.x in the testing repo before that, unless you explicitly don't recommend that).

alt3r-3go avatar Nov 21 '22 15:11 alt3r-3go

Quick note - I've recently opened #7894 but it seems to be a duplicate of this issue. I've tested the nopat boot option workaround but still get glitches. Using xorg's intel driver as a workaround isn't stable for me, I get a few random hard reboots a day.

ghost avatar Nov 23 '22 15:11 ghost

Do you see the crashes with the intel driver + kernel 5.10.112?

dmoerner avatar Nov 23 '22 15:11 dmoerner

> Do you see the crashes with the intel driver + kernel 5.10.112?

I don't remember testing that specific kernel version - I'm running 5.15.76-1 now. If there's a need to test 5.10.112 I could do so, although, as mentioned, crashes with the intel driver seem totally random and don't happen often, so a day without a crash would not guarantee that a particular kernel version works...

ghost avatar Nov 23 '22 15:11 ghost