[WIP] feat: add tenstorrent package
Add kernel module for tenstorrent hardware.
I'm a bit stuck on this because the make build isn't working. Still figuring out what I'm missing so feedback welcome.
I think I have something wrong with my build cache. I tried adding TARGET_ARGS='--no-cache' but I still get an error about missing /tmp/build directory.
=> ERROR tenstorrent:build-0
------
> tenstorrent:build-0:
0.063 make -C /lib/modules/6.13.6-200.fc41.x86_64/build M=/tmp/build modules
0.063 make[1]: Entering directory '/tmp/build'
0.063 make[1]: Leaving directory '/tmp/build'
0.063 make[1]: *** /lib/modules/6.13.6-200.fc41.x86_64/build: No such file or directory. Stop.
0.063 make: *** [Makefile:15: modules] Error 2
I think this fc41 is something badly hardcoded? Why is it Fedora Core?
I'm going to go out on a limb here: you are attempting to build on an F41 machine without having the corresponding kernel-devel package installed?
if you are on F41: rpm -qa | grep "^kernel-devel" and you'll see what you've got.
probably need to set KDIR https://github.com/tenstorrent/tt-kmd/blob/main/Makefile#L7
Is there a common KDIR we use for packages? I don't see it in other pkg configs
should be along /rootfs/usr/lib/modules/$(cat /src/include/config/kernel.release)
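To make the suggestion above concrete, here is a minimal sketch of deriving that path from the kernel release file. The function name `kdir_from_release` and its two arguments are hypothetical and are not part of the pkgs repo. Whether the tt-kmd Makefile expects a trailing `/build` on KDIR is not settled in this thread, so the sketch stops at the modules directory.

```shell
# Sketch: build a KDIR candidate from the kernel.release file of a built kernel tree.
# kdir_from_release is a hypothetical helper, not an existing pkgs function.
kdir_from_release() {
  local src_root="$1" modules_root="$2" release_file release
  release_file="${src_root}/include/config/kernel.release"
  if [ ! -r "${release_file}" ]; then
    echo "kernel.release not found at ${release_file}" >&2
    return 1
  fi
  release="$(cat "${release_file}")"
  # Prints <modules_root>/<kernel release>, e.g. .../modules/6.12.23-talos
  printf '%s/%s\n' "${modules_root}" "${release}"
}
```

With the paths mentioned in the thread, this would be called as `kdir_from_release /src /rootfs/usr/lib/modules` and the result passed to make as `KDIR=...`.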
The package builds properly now (I think) and it's pushed into my local registry. How do I build an ISO with the extension for testing on a physical machine?
From what I can tell in the docs I should be able to do something like
make kernel initramfs PKG_TENSTORRENT=127.0.0.1:5005/jgarr/tenstorrent:v1.7.0-alpha.0-243-g
make imager PUSH=true IMAGE_REGISTRY=127.0.0.1:5005 USERNAME=jgarr INSTALLER_ARCH=amd64 PLATFORM=linux/amd64
make installer PUSH=true IMAGE_REGISTRY=127.0.0.1:5005 USERNAME=jgarr
make iso IMAGE_REGISTRY=127.0.0.1:5005 USERNAME=jgarr
This fails on building the installer with the following error
make installer PUSH=true IMAGE_REGISTRY=127.0.0.1:5005 USERNAME=jgarr
make[1]: Entering directory '/var/home/jgarr/src/siderolabs/talos'
v1.10.0-alpha.3-47-g8cd3c8dc7: Pulling from jgarr/imager
30fed1bc580a: Pull complete
Digest: sha256:8c757d0dc1575931f7cb3e96a63a68739c52a4e07ea31169323231c7d8c282f8
Status: Downloaded newer image for 127.0.0.1:5005/jgarr/imager:v1.10.0-alpha.3-47-g8cd3c8dc7
127.0.0.1:5005/jgarr/imager:v1.10.0-alpha.3-47-g8cd3c8dc7
skipped pulling overlay (no overlay)
profile ready:
arch: amd64
platform: metal
secureboot: false
version: v1.10.0-alpha.3-47-g8cd3c8dc7
input:
  kernel:
    path: /usr/install/amd64/vmlinuz
  initramfs:
    path: /usr/install/amd64/initramfs.xz
  sdStub:
    path: /usr/install/amd64/systemd-stub.efi
  sdBoot:
    path: /usr/install/amd64/systemd-boot.efi
  baseInstaller:
    imageRef: 127.0.0.1:5005/jgarr/installer-base:v1.10.0-alpha.3-47-g8cd3c8dc7
output:
  kind: installer
  outFormat: raw
skipped initramfs rebuild (no system extensions)
kernel command line: talos.platform=metal console=tty0 init_on_alloc=1 slab_nomerge pti=on consoleblank=0 nvme_core.io_timeout=4294967295 printk.devkmsg=on ima_template=ima-ng ima_appraise=fix ima_hash=sha512 selinux=1
UKI ready
◲ error pulling image 127.0.0.1:5005/jgarr/installer-base:v1.10.0-alpha.3-47-g8cd3c8dc7: GET http://127.0.0.1:5005/v2/jgarr/installer-base/manifests/v1.10.0-alpha.3-47-g8cd3c8dc7: MANIFEST_UNKNOWN: manifest unknown; map[Tag:v1.10.0-alpha.3-47-g8cd3c8dc7]
Error: error pulling image 127.0.0.1:5005/jgarr/installer-base:v1.10.0-alpha.3-47-g8cd3c8dc7: GET http://127.0.0.1:5005/v2/jgarr/installer-base/manifests/v1.10.0-alpha.3-47-g8cd3c8dc7: MANIFEST_UNKNOWN: manifest unknown; map[Tag:v1.10.0-alpha.3-47-g8cd3c8dc7]
make[1]: *** [Makefile:454: image-installer] Error 1
make[1]: Leaving directory '/var/home/jgarr/src/siderolabs/talos'
make: *** [Makefile:475: installer] Error 2
Actually, looking at the content of the container image that I built it doesn't look like the kernel modules were added to the container so I'm definitely missing something in the build process.
Those seem to be the 1.9 docs. For 1.10, you need to run make installer-base before make installer.
Not sure what PKG_TENSTORRENT is supposed to mean in this context? (Do you have some changes you haven't shown us?)
How would a new module get into this build? It should be either packaged as a system extension, or Talos source should be modified to unconditionally include it.
Either way, a PKG_KERNEL should be in the mix to make Talos use your base Linux kernel/modules, which you want to mix with your custom extension.
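The ordering described above can be sketched as a dry-run helper that only prints the commands. `print_build_plan` is a hypothetical name, and passing `PKG_KERNEL` to every target is an assumption made for illustration; the variable names (`IMAGE_REGISTRY`, `USERNAME`, `PUSH`, `PKG_KERNEL`) are the ones already used in this thread.

```shell
# Sketch: print the 1.10 build order (installer-base, imager, installer)
# without running anything. print_build_plan is a hypothetical helper.
print_build_plan() {
  local registry="$1" username="$2" pkg_kernel="$3"
  local common="IMAGE_REGISTRY=${registry} USERNAME=${username} PUSH=true"
  local target
  for target in installer-base imager installer; do
    # Assumption: PKG_KERNEL is passed to each target so the custom kernel is used.
    echo "make ${target} ${common} PKG_KERNEL=${pkg_kernel}"
  done
}
```

Printing the plan first makes it easy to check that the kernel image reference is the one that was pushed to the local registry before starting the long builds.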
Does anyone know what modules.* files we actually need to include in the build? I'm not familiar with any of these files so I'm not sure which need to be included.
_out
├── etc
│   └── udev
│       └── rules.d
│           └── 50-tenstorrent.rules
└── usr
    └── lib
        └── modules
            └── 6.12.23-talos
                ├── extras
                │   └── tenstorrent.ko
                ├── modules.alias
                ├── modules.alias.bin
                ├── modules.builtin.alias.bin
                ├── modules.builtin.bin
                ├── modules.dep
                ├── modules.dep.bin
                ├── modules.devname
                ├── modules.softdep
                ├── modules.symbols
                ├── modules.symbols.bin
                └── modules.weakdep
All the other examples I found only included modules.order, modules.builtin, and modules.builtin.modinfo
It doesn't really matter, as the modules database will be rebuilt when the extension is included into the final image.
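Since the database is regenerated later, it can help to see which files in the output tree are index files rather than payload. The sketch below lists them; `list_generated_module_indexes` is a hypothetical helper, and the file names are simply the `modules.*` entries from the tree shown earlier in this thread, assumed here to all be depmod outputs.

```shell
# Sketch: list the modules.* index files present in a modules directory.
# The names are taken from the _out tree in this thread; treating all of them
# as regenerated depmod output is an assumption.
list_generated_module_indexes() {
  local modules_dir="$1" name
  for name in modules.alias modules.alias.bin modules.builtin.alias.bin \
              modules.builtin.bin modules.dep modules.dep.bin modules.devname \
              modules.softdep modules.symbols modules.symbols.bin modules.weakdep; do
    if [ -f "${modules_dir}/${name}" ]; then
      echo "${name}"
    fi
  done
}
```

For the tree above this would be run as `list_generated_module_indexes _out/usr/lib/modules/6.12.23-talos`; anything it does not print, such as `extras/tenstorrent.ko`, is the actual payload.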
I'm going to document my steps here before I forget them.
I have this pkg working and the kernel module loads, but the tenstorrent card is not detected (or at least it doesn't show up in /dev/tenstorrent/*). I'm not sure exactly why.
I used this branch and built my kernel and tenstorrent pkg
make kernel tenstorrent REGISTRY=127.0.0.1:5005 PUSH=true PLATFORM=linux/amd64
I wrote down the 2 images that were pushed to my local registry
Then I went to the extensions repo with https://github.com/siderolabs/extensions/pull/670 and built the extension with
make tenstorrent REGISTRY=127.0.0.1:5005 PUSH=true PLATFORM=linux/amd64 \
PKG_KERNEL=127.0.0.1:5005/jgarr/kernel:v1.7.0-alpha.0-250-g28491b7-dirty@sha256:039f24ff363517f0d49adef68b749ff2ccc43c19f587d881a7c7e65c9cfc9fb8
(the tenstorrent pkg image was added to pkg.yaml for the build)
Then I built an imager image
make imager PLATFORM=linux/amd64 INSTALLER_ARCH=amd64 PUSH=true REGISTRY=127.0.0.1:5005 \
PKG_KERNEL=127.0.0.1:5005/jgarr/kernel:v1.7.0-alpha.0-250-g28491b7-dirty@sha256:039f24ff363517f0d49adef68b749ff2ccc43c19f587d881a7c7e65c9cfc9fb8
Then I created a profile image for imager to build an installer
# profile.yaml
arch: amd64
platform: metal
secureboot: false
version: v1.10.0
input:
  kernel:
    path: /usr/install/amd64/vmlinuz
  initramfs:
    path: /usr/install/amd64/initramfs.xz
  baseInstaller:
    imageRef: ghcr.io/siderolabs/installer:v1.10.0
  systemExtensions:
    - tarballPath: /tenstorrent.tar
output:
  kind: installer
  outFormat: raw
And I built it with
cat profile.yaml | docker run --rm -i \
-v $PWD/_out:/out -v $PWD/tenstorrent.tar:/tenstorrent.tar \
127.0.0.1:5005/jgarr/imager:v1.10.0-alpha.3-99-gb3b20eff3@sha256:36c005ce37908245238eb6a604a6dc05a504336d88dcf83dd3bf934847572e4c -
This spit out a _out/installer-amd64.tar file which I then imported into docker and pushed to a registry
docker load -i ./_out/installer-amd64.tar
docker tag ghcr.io/siderolabs/installer:v1.10.0 rothgar/tt-installer:v1.10.0
docker push rothgar/tt-installer:v1.10.0
Then I booted talos from a generic ISO and generated the config with
talosctl gen config --install-disk /dev/nvme0n1 \
--install-image rothgar/tt-installer:v1.10.0 \
mini https://192.168.7.40:6443
And created a patch with
machine:
  kernel:
    modules:
      - name: tenstorrent
Then I applied the install
talosctl apply -f controlplane.yaml -i -p '@tenstorrent.yaml' -n 192.168.7.40
And I was able to see the kernel module is loaded
192.168.7.40: kern: warning: [2025-05-02T21:36:23.560178617Z]: tenstorrent: loading out-of-tree module taints kernel.
192.168.7.40: kern: info: [2025-05-02T21:36:23.566441617Z]: Loading Tenstorrent AI driver module v1.33.
This is my first time trying to build a package and extension. Please let me know if I did any of these steps wrong or there is a way to do this with fewer steps.
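As a quick check for the module load messages shown above, the driver version can be pulled out of the kernel log. This is a minimal sketch: `driver_version_from_log` is a hypothetical helper, and the pattern assumes the log line keeps the exact wording seen in this thread.

```shell
# Sketch: extract the driver version from kernel log lines read on stdin.
# Assumes the message format "Loading Tenstorrent AI driver module v<version>."
driver_version_from_log() {
  sed -n 's/.*Loading Tenstorrent AI driver module v\([0-9.]*[0-9]\)\.*$/\1/p' | head -n 1
}
```

It could be fed from the node's kernel log, for example `talosctl dmesg -n 192.168.7.40 | driver_version_from_log`; an empty result would suggest the module did not load.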
Then I created a profile image for imager to build an installer
From here, you just need to run make installer with PKG_KERNEL set; build installer-base and imager first.
I have this pkg working and the kernel module loads, but the tenstorrent card is not detected (or at least it doesn't show up in /dev/tenstorrent/*)
Probably needs some udev rules
I'm including the udev rules from their repo. https://github.com/tenstorrent/tt-kmd/blob/main/udev-50-tenstorrent.rules Have suggestions on other rules I should look at to include?
That should be the right one, where did you put that in the extension?
In the extension it’s in /rootfs/etc/udev/rules.d/
that seems correct
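A small sanity check for the rules location discussed above could look like the following. `has_udev_rules` is a hypothetical helper; it only confirms that at least one `.rules` file exists under `etc/udev/rules.d` in the given root, not that the rules are valid.

```shell
# Sketch: succeed if the given rootfs contains at least one udev rules file
# under etc/udev/rules.d. has_udev_rules is a hypothetical helper.
has_udev_rules() {
  local rootfs="$1" rules_dir file
  rules_dir="${rootfs}/etc/udev/rules.d"
  for file in "${rules_dir}"/*.rules; do
    # If the glob matches nothing it stays literal and the -f test fails.
    if [ -f "${file}" ]; then
      return 0
    fi
  done
  return 1
}
```

Run against the extension's staged root (for example `has_udev_rules /rootfs`) before packaging, so a missing rules file is caught before the image is built.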
Looks like my problem was power related. Bought a more powerful power supply and the device shows up now and looks to be working as intended.
talosctl get extensions
NODE           NAMESPACE   TYPE              ID            VERSION   NAME          VERSION
192.168.4.20   runtime     ExtensionStatus   0             1         tenstorrent   1.33
192.168.4.20   runtime     ExtensionStatus   modules.dep   1         modules.dep   6.12.25-talos
talosctl list /dev/tenstorrent
NODE           NAME
192.168.4.20   .
192.168.4.20   0
They had a newer release so I'm going to bump the version in the package and test with the latest version of Talos. Is there anything else I should update before we merge it?
Cool, mostly seems good; the rest we can fix up once it's out of draft.
/m
