(WIP) feat: add l4t package
Add support for the Jetson Orin SBC. These are the kernel modules and display drivers for Tegra chips.
Problem:
The host1x module is not being loaded from the extension: host1x is already built into the kernel, so it cannot be replaced by an extension module of the same name.
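A quick way to confirm this is to grep the kernel config shipped in the pkgs repo (the path and option name below are what I'd expect for the arm64 config; adjust if your checkout differs):
grep -i HOST1X kernel/build/config-arm64
## CONFIG_TEGRA_HOST1X=y means the driver is built in, so an extension cannot override it with its own host1x.ko
## CONFIG_TEGRA_HOST1X=m would let the module shipped by the extension be used instead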
How to run it
Building Talos
Create custom builder buildx
docker buildx create --driver docker-container --driver-opt network=host --name local1 --buildkitd-flags '--allow-insecure-entitlement security.insecure' --use
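Optionally verify that the builder exists and can bootstrap:
docker buildx ls
docker buildx inspect local1 --bootstrap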
Run a local docker registry + UI
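A minimal way to do this, assuming the stock registry:2 image listening on port 5005 (the UI container is optional):
docker run -d --restart=always --name registry -p 5005:5000 registry:2
## optionally run a web UI such as joxit/docker-registry-ui against http://127.0.0.1:5005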
Building the pkgs
cd siderolabs-pkgs
make base PLATFORM=linux/arm64,linux/amd64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/base:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/base:cache" IMAGE_TAG=1.9.5-beta
make kernel PLATFORM=linux/arm64,linux/amd64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/kernel:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/kernel:cache" IMAGE_TAG=1.9.5-beta
make nvidia-l4t-pkg PLATFORM=linux/arm64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-pkg:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-pkg:cache" IMAGE_TAG=1.9.5-beta
Building extensions
cd siderolabs-extensions
make nvidia-l4t PLATFORM=linux/arm64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-extension:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-extension:cache" PKGS_PREFIX=127.0.0.1:5005/siderolabs PKGS=1.9.5-beta IMAGE_TAG=1.9.5-beta
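Optional sanity check before feeding the extension to the imager: list the module files the freshly built image ships (same tag as used in the imager step below):
crane export 127.0.0.1:5005/siderolabs/nvidia-l4t:36.4.3-1.9.5-beta - | tar tv | grep -E 'host1x|nvgpu'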
Building talos image
cd talos
git fetch origin 88fc6bbebeff1c0db0e43fb0a83d2b03a973da8a
git checkout -b 1.9.5-beta 88fc6bbebeff1c0db0e43fb0a83d2b03a973da8a
## needs both arm64 and amd64 as the imager will be run locally
make imager INSTALLER_ARCH=all PLATFORM=linux/amd64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/imager:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/imager:cache"
make installer-base PLATFORM=linux/amd64,linux/arm64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/installer-base:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/installer-base:cache"
make installer PLATFORM=linux/amd64,linux/arm64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/installer:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/installer:cache"
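Optionally confirm the pushed tags are present in the local registry before moving on:
crane ls 127.0.0.1:5005/siderolabs/imager
crane ls 127.0.0.1:5005/siderolabs/installer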
Create profile.yaml file
# profile.yaml
arch: arm64
platform: metal
secureboot: false
version: v1.9.5-beta
input:
  kernel:
    path: /usr/install/arm64/vmlinuz
  initramfs:
    path: /usr/install/arm64/initramfs.xz
  baseInstaller:
    imageRef: 127.0.0.1:5005/siderolabs/installer:1.9.5-beta
  systemExtensions:
    - tarballPath: extension-nvidia-l4t
output:
  kind: installer
  outFormat: raw
Create installer image:
crane export 127.0.0.1:5005/siderolabs/nvidia-l4t:36.4.3-1.9.5-beta extension-nvidia-l4t
cat profile.yaml | docker run --network host --rm -i \
-v $PWD/_out:/out -v $PWD/extension-nvidia-l4t:/extension-nvidia-l4t \
127.0.0.1:5005/siderolabs/imager:1.9.5-beta -
docker login ghcr.io -u mmalyska --password-stdin
docker load < _out/installer-arm64.tar
docker tag 127.0.0.1:5005/siderolabs/installer:1.9.5-beta ghcr.io/mmalyska/talos-images:v1.9.5-beta2
docker push ghcr.io/mmalyska/talos-images:v1.9.5-beta2
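Optionally verify the pushed image before pointing any node at it:
crane manifest ghcr.io/mmalyska/talos-images:v1.9.5-beta2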
List of changes in the node config:
install:
  disk: /dev/nvme0n1
  extraKernelArgs:
    - console=tty0
    - console=ttyS0,115200
    - sysctl.kernel.kexec_load_disabled=1
    - talos.dashboard.disabled=1
  image: ghcr.io/mmalyska/talos-images:v1.9.5-beta
  wipe: false
kernel:
  modules: # list from modules.dep of the nvidia-l4t extension
    - name: host1x
    - name: nvmap
    - name: nvsciipc
    - name: mc-utils
    - name: nvgpu
    - name: nvhwpm
    - name: tegra-dce
    - name: tsecriscv
    - name: host1x-nvhost
    - name: nvidia-drm
    - name: nvidia-modeset
    - name: nvidia
Apply the changes and install the new image:
talosctl apply-config --nodes 192.168.48.5 --file provision/talos/clusterconfig/home-nv1.yaml
talosctl upgrade --nodes 192.168.48.5 --image ghcr.io/mmalyska/talos-images:v1.9.5-beta
talosctl --nodes 192.168.48.5 health --wait-timeout=10m --server=false
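To check whether the extension and its modules actually made it onto the node, these read-only checks are what I'd use (module names as in the list above; host1x is expected to still show up as built in):
talosctl --nodes 192.168.48.5 get extensions
talosctl --nodes 192.168.48.5 read /proc/modules | grep -E 'host1x|nvgpu|nvidia'
talosctl --nodes 192.168.48.5 dmesg | grep -iE 'host1x|nvgpu'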
Hey @mmalyska thanks for this PR. Is this something you're still working on? It looks useful for other users but we don't have hardware to test or verify functionality.
Hi @rothgar, I'm still working on it (not that intensively, as I don't have an arm machine to build the stuff efficiently; just building the kernel takes me ~8h on my Windows machine).
Right now I'm looking for a solution, as I'm stuck: the GPU module needs changes inside the kernel (and I really don't want to rewrite the GPU drivers to use the in-tree module from 6.12 as-is). Normally I would just replace the module with an extension, but host1x is built into the kernel, so I cannot replace it that way.
So I want to focus on writing up what I've done so far and posting it to the community, so maybe others will have a better idea of how to solve it.
The best solution would be for NVIDIA to release drivers compatible with Linux kernel 6.12 and newer.
@rothgar I just updated the PR description with the steps to reproduce and the error I'm facing. If you need more info, just reach out :)
This PR is stale because it has been open 45 days with no activity.
Sorry, I completely missed the problem in the description. I think your extension filesystem may be incorrect. I think a system extension would merge with the base filesystem and overlay your module file (replacing the existing one).
Maybe @frezbo can correct me on how that should work.
l4t needs to be a pkg first to ship the modules
This PR is trying to add an l4t pkg and https://github.com/siderolabs/extensions/pull/624 is adding the extension.
@mmalyska Is the reason you're trying to replace host1x.ko because it's already provided by the base OS image and you need an updated one?
Hi, I need to provide host1x (I hope this is the only one that needs replacement) built by the package, as it includes changes from a patch that are needed by the NVIDIA driver for the Jetson Orin board. Those changes are not upstream but are provided by NVIDIA developers.
This PR is stale because it has been open 45 days with no activity.