(WIP)feat: add l4t package

Open mmalyska opened this issue 10 months ago • 9 comments

Add support for the Jetson Orin SBC. The package provides kernel modules and display drivers for Tegra chips.

Problem:

The host1x module from the extension is not loaded. host1x is already built into the kernel image, so it cannot be replaced by an extension module with the same name.
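One way to confirm that a module is compiled into the kernel is to look for it in the `modules.builtin` file that the kernel build generates. The sketch below assumes you have that file available (for example from the kernel build output); `is_builtin` is a hypothetical helper name, not part of Talos or pkgs.

```shell
# Sketch: report whether a module is compiled into the kernel.
# modules.builtin lists one path per line, e.g.
#   kernel/drivers/gpu/host1x/host1x.ko
# Usage: is_builtin <module-name> <path-to-modules.builtin>
is_builtin() {
    # Module names treat '-' and '_' as equivalent, so normalise both sides.
    mod=$(printf '%s' "$1" | tr '-' '_')
    # Strip the directory and the .ko suffix, then look for an exact match.
    sed -e 's|.*/||' -e 's|\.ko$||' "$2" | tr '-' '_' | grep -qxF -- "$mod"
}

# Example (path is an assumption):
#   is_builtin host1x /lib/modules/"$(uname -r)"/modules.builtin && echo "host1x is built in"
```

A built-in module has no `.ko` file for the loader to pick up, which is consistent with the extension copy of host1x never being loaded.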

How to run it

Building Talos

Create a custom buildx builder

docker buildx create --driver docker-container  --driver-opt network=host --name local1 --buildkitd-flags '--allow-insecure-entitlement security.insecure' --use

Run a local Docker registry + UI

Building the pkgs

cd siderolabs-pkgs
make base PLATFORM=linux/arm64,linux/amd64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/base:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/base:cache" IMAGE_TAG=1.9.5-beta
make kernel PLATFORM=linux/arm64,linux/amd64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/kernel:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/kernel:cache" IMAGE_TAG=1.9.5-beta
make nvidia-l4t-pkg PLATFORM=linux/arm64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-pkg:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-pkg:cache" IMAGE_TAG=1.9.5-beta
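The `make` invocations above differ only in target, platform and cache reference. As a sketch, the repeated `CI_ARGS` value could be generated by a small helper; `cache_args` is a hypothetical name introduced here for illustration, and the registry address is the one used in this description.

```shell
# Hypothetical helper: print the buildx cache flags for one image name.
REGISTRY=127.0.0.1:5005

cache_args() {
    ref="$REGISTRY/siderolabs/$1:cache"
    printf '%s %s' \
        "--cache-to=mode=max,type=registry,ref=$ref" \
        "--cache-from=type=registry,ref=$ref"
}

# Usage (same values as the commands above):
#   make base PLATFORM=linux/arm64,linux/amd64 REGISTRY="$REGISTRY" PUSH=true \
#       CI_ARGS="$(cache_args base)" IMAGE_TAG=1.9.5-beta
```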

Building extensions

cd siderolabs-extensions
make nvidia-l4t PLATFORM=linux/arm64 REGISTRY=127.0.0.1:5005 PUSH=true CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-extension:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/nvidia-l4t-extension:cache" PKGS_PREFIX=127.0.0.1:5005/siderolabs PKGS=1.9.5-beta IMAGE_TAG=1.9.5-beta

Building talos image

cd talos
git fetch origin 88fc6bbebeff1c0db0e43fb0a83d2b03a973da8a
git checkout -b 1.9.5-beta 88fc6bbebeff1c0db0e43fb0a83d2b03a973da8a
# needs both arm64 and amd64, as it will be run locally
make imager INSTALLER_ARCH=all PLATFORM=linux/amd64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/imager:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/imager:cache"
make installer-base PLATFORM=linux/amd64,linux/arm64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/installer-base:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/installer-base:cache"
make installer PLATFORM=linux/amd64,linux/arm64 PUSH=true REGISTRY=127.0.0.1:5005 IMAGE_TAG=1.9.5-beta PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:1.9.5-beta CI_ARGS="--cache-to=mode=max,type=registry,ref=127.0.0.1:5005/siderolabs/installer:cache --cache-from=type=registry,ref=127.0.0.1:5005/siderolabs/installer:cache"

Create profile.yaml file

# profile.yaml
arch: arm64
platform: metal
secureboot: false
version: v1.9.5-beta
input:
  kernel:
    path: /usr/install/arm64/vmlinuz
  initramfs:
    path: /usr/install/arm64/initramfs.xz
  baseInstaller:
    imageRef: 127.0.0.1:5005/siderolabs/installer:1.9.5-beta
  systemExtensions:
    - tarballPath: extension-nvidia-l4t
output:
  kind: installer
  outFormat: raw

Create installer image:

crane export 127.0.0.1:5005/siderolabs/nvidia-l4t:36.4.3-1.9.5-beta extension-nvidia-l4t
cat profile.yaml | docker run --network host --rm -i \
-v $PWD/_out:/out -v $PWD/extension-nvidia-l4t:/extension-nvidia-l4t \
127.0.0.1:5005/siderolabs/imager:1.9.5-beta -
docker login ghcr.io -u mmalyska --password-stdin
docker load < _out/installer-arm64.tar
docker tag 127.0.0.1:5005/siderolabs/installer:1.9.5-beta ghcr.io/mmalyska/talos-images:v1.9.5-beta2
docker push ghcr.io/mmalyska/talos-images:v1.9.5-beta2

List of changes in the node config:

  install:
    disk: /dev/nvme0n1
    extraKernelArgs:
      - console=tty0
      - console=ttyS0,115200
      - sysctl.kernel.kexec_load_disabled=1
      - talos.dashboard.disabled=1
    image: ghcr.io/mmalyska/talos-images:v1.9.5-beta
    wipe: false
  kernel:
    modules: # list from modules.dep of extension nvidia-l4t
      - name: host1x
      - name: nvmap
      - name: nvsciipc
      - name: mc-utils
      - name: nvgpu
      - name: nvhwpm
      - name: tegra-dce
      - name: tsecriscv
      - name: host1x-nvhost
      - name: nvidia-drm
      - name: nvidia-modeset
      - name: nvidia
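The module list above was taken from the extension's `modules.dep`. A minimal sketch of how such a list could be generated rather than typed by hand is shown below; `modules_yaml` is a hypothetical helper, and the location of `modules.dep` inside the extension is left as an argument because it depends on how the extension is unpacked.

```shell
# Sketch: turn modules.dep into "- name:" entries for the Talos machine config.
# Each modules.dep line looks like "<path>/<module>.ko: <dependency paths...>".
# Usage: modules_yaml <path-to-modules.dep>
modules_yaml() {
    # Drop the dependency list, the directory, and the .ko (or .ko.xz etc.)
    # suffix, then indent the name to match the config snippet above.
    sed -e 's|:.*||' -e 's|.*/||' -e 's|\.ko.*$||' -e 's|^|      - name: |' "$1"
}
```

The output follows the order of `modules.dep`, which is not necessarily a valid load order, so the result may still need reordering by hand.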

Apply changes and install new image:

talosctl apply-config --nodes 192.168.48.5 --file provision/talos/clusterconfig/home-nv1.yaml
talosctl upgrade --nodes 192.168.48.5 --image ghcr.io/mmalyska/talos-images:v1.9.5-beta
talosctl --nodes 192.168.48.5 health --wait-timeout=10m --server=false
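After the upgrade, the loaded modules can be compared against the list in the node config. The filter below is a sketch that reads `/proc/modules` content on stdin and prints the requested modules that are absent; `missing_modules` is a hypothetical helper name.

```shell
# Sketch: print every requested module that does not appear in /proc/modules.
# Usage: talosctl --nodes 192.168.48.5 read /proc/modules | missing_modules nvgpu nvidia-drm
missing_modules() {
    # The first column of /proc/modules is the module name.
    loaded=$(cut -d' ' -f1)
    for mod in "$@"; do
        # /proc/modules reports names with '_' even when the .ko file uses '-'.
        norm=$(printf '%s' "$mod" | tr '-' '_')
        printf '%s\n' "$loaded" | grep -qxF -- "$norm" || printf '%s\n' "$mod"
    done
}
```

Modules compiled into the kernel never show up in `/proc/modules`, so a built-in host1x is reported as missing by this check even though its code is present in the kernel.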

mmalyska, Feb 23 '25 22:02

Hey @mmalyska thanks for this PR. Is this something you're still working on? It looks useful for other users but we don't have hardware to test or verify functionality.

rothgar, Apr 14 '25 23:04

Hi @rothgar, I'm still working on it (not that intensively, as I don't have an ARM machine to build things efficiently; just building the kernel takes me ~8h on my Windows machine).

Right now I'm looking for a solution, because I'm stuck: the GPU module needs changes inside the kernel (I really don't want to rewrite the GPU drivers to use the kernel module as it is in 6.12). Normally I would just replace the module with one from an extension, but host1x is built into the kernel, so I cannot replace it that way. So I want to focus on writing up what I've done so far and posting it to the community, so that others may have a better idea of how to solve it. The best solution would be for NVIDIA to release drivers compatible with Linux kernels 6.12 and up.

mmalyska, Apr 15 '25 07:04

@rothgar I just updated the PR description with steps to reproduce and the error I'm facing. If you need more info, just reach out :)

mmalyska, Apr 15 '25 08:04

This PR is stale because it has been open 45 days with no activity.

github-actions[bot], Jul 16 '25 02:07

Sorry, I completely missed the problem in the description. I think your extension filesystem may be incorrect. I think a system extension would merge with the base filesystem and overlay your module file (replacing the existing one).

Maybe @frezbo can correct me on how that should work.

rothgar, Jul 16 '25 21:07

l4t needs to be a pkg first to ship the modules

frezbo, Jul 17 '25 02:07

This PR is trying to add an l4t pkg and https://github.com/siderolabs/extensions/pull/624 is adding the extension.

@mmalyska Is the reason you're trying to replace host1x.ko because it's already provided by the base OS image and you need an updated one?

rothgar, Jul 17 '25 18:07

Hi, I need to provide the host1x built by this package (I hope it is the only module that needs replacing), because it includes changes from a patch that the NVIDIA driver for the Jetson Orin board needs. Those changes are not upstream; they are provided by NVIDIA developers.

mmalyska, Jul 17 '25 20:07

This PR is stale because it has been open 45 days with no activity.

github-actions[bot], Sep 01 '25 02:09