dkms
DKMS creates .thinlto-cache when compiling dkms modules, but it does not get removed
Hello,
When installing a kernel built with Clang ThinLTO, dkms creates a folder called .thinlto-cache at /usr/lib/modules/$kernel/build/.thinlto.
This folder should really be removed after the dkms modules have been compiled.
When a kernel gets upgraded or removed, the folder stays behind in /usr/lib/modules.
My entire /usr/lib/modules is full of folders from old kernels.
The .thinlto-cache folder does not get copied during packaging; I have checked this.
The folder is created by the kernel build, is it not? As such, it should probably be removed by the kernel itself when we issue the make clean command.
I don't think dkms should know details like LTO (thin or not) or be responsible for removing its artefacts.
From a very quick look this seems like a kernel bug: $(KBUILD_EXTMOD)/.thinlto-cache is used in the clean section, while $(extmod_prefix).thinlto-cache is what actually gets used for the cache directory.
cc @samitolvanen
extmod_prefix is just KBUILD_EXTMOD with a slash:
export extmod_prefix = $(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD)/)
However, it looks like it might not be defined yet when the --thinlto-cache-dir flag is set. Perhaps you can test moving that definition before the LTO flags are set?
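To make the definition above concrete, here is a minimal Python sketch of what `$(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD)/)` evaluates to. It only mirrors the make expression; the function names and the example path are made up for illustration and are not part of the kernel or dkms.

```python
def extmod_prefix(kbuild_extmod: str) -> str:
    """Mirror $(if $(KBUILD_EXTMOD),$(KBUILD_EXTMOD)/).

    make's $(if) treats a condition that expands to an empty string
    (after stripping surrounding whitespace) as false.
    """
    return f"{kbuild_extmod}/" if kbuild_extmod.strip() else ""


def thinlto_cache_dir(kbuild_extmod: str) -> str:
    """Path that $(extmod_prefix).thinlto-cache would expand to."""
    return f"{extmod_prefix(kbuild_extmod)}.thinlto-cache"


if __name__ == "__main__":
    # In-tree build: KBUILD_EXTMOD is empty, so the path is relative.
    print(thinlto_cache_dir(""))
    # External module build; the directory below is a hypothetical example.
    print(thinlto_cache_dir("/var/lib/dkms/example/1.0/build"))
```

With an empty KBUILD_EXTMOD the result is a bare relative `.thinlto-cache`, which is also what an undefined extmod_prefix would produce.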
@ptr1337 can you try the above suggestion?
@samitolvanen moving the export(s) further up makes sense to me. Considering you're the owner/maintainer of said code, can you send a patch upstream? Thanks o/
Could you maybe send me a patch that I should test?
How about this patchset, and then setting the cache dir to /tmp or something like that? I will test this in the coming days.
https://patchwork.kernel.org/project/linux-kbuild/patch/[email protected]/#24308209
Any news here? On CachyOS we provide an extra variant that is built with ThinLTO. When people use dkms modules, a massive build cache accumulates in /usr/lib/modules over time. dkms also throws a warning, because the kernel's module directory is still there.
IIRC the issue was that the kernel Makefiles are not setting the respective variables properly. The linked patch may sidestep that by storing the files outside of /usr/lib/modules.
Doubt I'll be working on this, but here are some general ideas:
- compare the file listings of at least /usr/lib/modules and /var across a) a clean system, b) after dkms add/build/install and c) after dkms remove
- tweak the make invocations: use make V=1 ... or make V=12 ... and check the log
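The first idea can be sketched roughly as follows: take a listing of a directory tree at each stage and diff the sets. This is only an illustration; the helper names are made up, and the demo runs against a temporary directory rather than the real /usr/lib/modules or /var.

```python
import os
import tempfile


def snapshot(root):
    """Return the set of all file and directory paths under root, relative to root."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries.add(os.path.relpath(os.path.join(dirpath, name), root))
    return entries


def leftovers(clean, after_remove):
    """Paths present after 'dkms remove' that were absent on the clean system."""
    return sorted(after_remove - clean)


if __name__ == "__main__":
    # Demo on a throwaway directory; on a real system you would snapshot
    # /usr/lib/modules and /var before and after the dkms commands.
    with tempfile.TemporaryDirectory() as root:
        clean = snapshot(root)
        os.makedirs(os.path.join(root, "build", ".thinlto-cache"))
        for path in leftovers(clean, snapshot(root)):
            print(path)
```

Taking one snapshot per stage (clean, after install, after remove) shows both what dkms created and what it failed to clean up.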
Let's get this issue fixed ASAP, Rocky. It was a blocker for me because an old dkms module on an old kernel version was not getting deleted. That caused issues installing a newer version of that program.
@LethalManBoob I admire your enthusiasm.
If you can reproduce the issue and can provide some feedback to my previous post https://github.com/dell/dkms/issues/292#issuecomment-1882841116 that would be great - @ptr1337 in case you've missed it.
Alternatively, a clear reproducer would be appreciated. One which includes:
- what's the OS used to trigger this
- the kernel version and source if not kernel.org
- version of clang and its origin (distro one, binaries from llvm, other)
- .config file and kernel build command/recipe
All in all, it sounds more like general kernel debugging/hacking ;-)
@evelikov
This can be reproduced with the following:
- Distro: Arch Linux (or Arch-based)
- Kernel: any 6.x kernel (I don't know if it was already present earlier)
- Clang version: Arch Linux clang (reproducible with 15, 16, 17 and 18)
- Config file: just the Arch Linux default config, but with the kernel compiled with ThinLTO
If you have a kernel installed with ThinLTO, simply install any dkms module (nvidia-dkms, for example), then either remove the kernel or upgrade it. This leaves leftovers in /usr/lib/modules/$kernelversion+name
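To check whether a system is affected, something like the following Python sketch could scan for leftover caches and report their size. It assumes the caches are directories literally named `.thinlto-cache` somewhere below /usr/lib/modules; the function names are made up for this example.

```python
import os


def dir_size(path):
    """Total size in bytes of all files below path (symlinks are not followed)."""
    total = 0
    for dirpath, _, filenames in os.walk(path):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def find_thinlto_caches(modules_root="/usr/lib/modules"):
    """Yield (path, size_in_bytes) for every .thinlto-cache directory below modules_root."""
    for dirpath, dirnames, _ in os.walk(modules_root):
        if ".thinlto-cache" in dirnames:
            # Prune the cache from the walk so we do not descend into it again.
            dirnames.remove(".thinlto-cache")
            cache = os.path.join(dirpath, ".thinlto-cache")
            yield cache, dir_size(cache)


if __name__ == "__main__":
    for cache, size in find_thinlto_caches():
        print(f"{size / 1e6:10.1f} MB  {cache}")
```

The script only reports; it deliberately does not delete anything.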
Regarding "but kernel compiled with ThinLTO": @ptr1337 can you provide a PKGBUILD that achieves this?
@evelikov
https://github.com/CachyOS/linux-cachyos/blob/master/linux-cachyos/PKGBUILD#L128
Change the value here to "thin" and then build the kernel. You can also fetch prebuilt kernels here:
https://mirror.cachyos.org/repo/x86_64_v3/cachyos-v3/linux-cachyos-lto-6.8.9-4-x86_64_v3.pkg.tar.zst
https://mirror.cachyos.org/repo/x86_64_v3/cachyos-v3/linux-cachyos-lto-headers-6.8.9-4-x86_64_v3.pkg.tar.zst
Be aware that the prebuilt ones only run on CPUs that support x86-64-v3, and you might want to pull in the keyring before installing the package.
This indeed is a kernel bug.
--thinlto-cache-dir was set to $(extmod_prefix).thinlto-cache at line 945:
https://github.com/torvalds/linux/blob/45db3ab70092637967967bfd8e6144017638563c/Makefile#L945
But extmod_prefix was only assigned at line 1095:
https://github.com/torvalds/linux/blob/45db3ab70092637967967bfd8e6144017638563c/Makefile#L1095
Therefore, the .thinlto-cache directory is always created in the working directory.
It also won't be deleted when executing make clean, because in the clean: target the path was written as $(KBUILD_EXTMOD)/.thinlto-cache:
https://github.com/torvalds/linux/blob/45db3ab70092637967967bfd8e6144017638563c/Makefile#L1785-L1786
By moving the assignment of extmod_prefix before line 945, this bug can be resolved.
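The ordering problem described above can be illustrated with a small Python model. It is a sketch, not the kernel Makefile: it assumes the flag is expanded immediately at the point where it is set (so a variable defined later expands to nothing), and the module directory is a hypothetical example.

```python
import re


def expand(text, variables):
    """Expand $(name) references right away; undefined names become empty, as in make."""
    return re.sub(r"\$\((\w+)\)", lambda m: variables.get(m.group(1), ""), text)


def simulate(define_prefix_first, extmod):
    """Return (cache flag, path removed by clean) for a given definition order."""
    variables = {"KBUILD_EXTMOD": extmod}

    def define_prefix():
        variables["extmod_prefix"] = f"{extmod}/" if extmod else ""

    if define_prefix_first:
        define_prefix()  # proposed fix: define before the LTO flags
    cache_flag = expand("--thinlto-cache-dir=$(extmod_prefix).thinlto-cache", variables)
    if not define_prefix_first:
        define_prefix()  # current order: defined only after the flag was set
    clean_path = expand("$(KBUILD_EXTMOD)/.thinlto-cache", variables)
    return cache_flag, clean_path


if __name__ == "__main__":
    extmod = "/var/lib/dkms/example/1.0/build"  # hypothetical module directory
    print(simulate(False, extmod))  # flag and clean path disagree
    print(simulate(True, extmod))   # flag and clean path agree
```

In the first case the cache lands in a relative `.thinlto-cache` (the working directory of the build), while clean looks inside the module directory, so the two never meet.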
I have opened a bug report https://bugzilla.kernel.org/show_bug.cgi?id=218825
The ThinLTO caching was removed in linux-next https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=aba091547ef6159d52471f42a3ef531b7b660ed8
Thanks, the issue has been fixed after cherry-picking the patch into the kernel. We can close this issue once it's merged upstream.
Let's hope that when/if thin-lto gets introduced the bug won't be reintroduced again :crossed_fingers:
Closing issue
Thanks for finding this issue, and I'm really glad that it has been fixed. As a distribution, deploying ThinLTO kernels was a really bad experience for many users.