
CUDA on NVidia Jetsons

Open SomeoneSerge opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

I'd like to have an out-of-the-box, just-`nix run` experience on NVidia Jetsons. The Jetson situation is as follows:

  • They come with a fairly old kernel, and their libcuda.so doesn't support any recent cudatoolkit releases.
  • Cudatoolkit includes (for chosen platforms) a special "compatibility driver", cudaPackages.cuda_compat in Nixpkgs. It is effectively the user-space driver from a newer nvidia driver/kernel module release (or something of the sort) that works with the old kernel but also supports an up-to-date cudatoolkit.
  • This means that loading the host system's libcuda.so (/usr/lib/aarch64-linux/libcuda.so) is, generally speaking, the wrong thing to do on a Jetson.
  • The compatibility driver has a number of dependencies we don't know how to satisfy other than by taking them from the host system: the libnvrm_{mem,gpu}.so, which in turn depend on half a dozen other libraries. Note that some of these link to libstdc++. We don't know anything about the compatibility guarantees of these libraries, e.g. whether they'd work if we (ignoring the legal issues) packaged them in Nixpkgs and linked them directly, irrespective of the kernel module we'd be interacting with at runtime.
  • This is not an issue for Jetpack-NixOS users because they can just make their NixOS deploy the compat driver: https://github.com/anduril/jetpack-nixos/pull/160.
  • It is, however, an issue for the consumers of the vendor-supplied Jetpack Ubuntu.

Describe the solution you'd like

Somehow make nixglhost:

  • [ ] always prefer the compat driver,
    • [ ] including the one directly linked into nixpkgs' packages through their DT_RUNPATHs;
  • [ ] and handle cuda_compat's impure dependencies;
  • [ ] but make sure we don't load the vendored FHS libraries when they've got counterparts in Nixpkgs (libc, libstdc++).

I think the solution would be for nix-gl-host not to expose libcuda in LD_LIBRARY_PATH, and to expose only its dependencies (libnvrm*).
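A rough sketch of what that filtering could look like, in Python since that's what nix-gl-host is written in. This is not nix-gl-host's actual code: the names `ALLOW_PATTERNS`, `DENY_PATTERNS`, `select_host_libs` and `build_shim_dir` are made up for illustration, and the allow list would presumably have to grow to cover the transitive dependencies of libnvrm*.

```python
import fnmatch
import os
import tempfile
from pathlib import Path

# Hypothetical allow list: host libraries cuda_compat needs at runtime.
ALLOW_PATTERNS = ["libnvrm*.so*"]
# Hypothetical deny list: the host libcuda, plus libraries Nixpkgs already provides.
DENY_PATTERNS = ["libcuda.so*", "libc.so*", "libstdc++.so*"]


def select_host_libs(host_dir, allow=ALLOW_PATTERNS, deny=DENY_PATTERNS):
    """Return the sorted file names in host_dir matching allow and no deny pattern."""
    selected = []
    for name in sorted(os.listdir(host_dir)):
        if any(fnmatch.fnmatch(name, p) for p in deny):
            continue
        if any(fnmatch.fnmatch(name, p) for p in allow):
            selected.append(name)
    return selected


def build_shim_dir(host_dir, shim_dir):
    """Symlink the selected host libraries into shim_dir and return their names."""
    shim = Path(shim_dir)
    shim.mkdir(parents=True, exist_ok=True)
    names = select_host_libs(host_dir)
    for name in names:
        link = shim / name
        # Replace stale links so the function can be re-run safely.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(Path(host_dir) / name)
    return names
```

The resulting directory, rather than the host's library directory, is what would end up in LD_LIBRARY_PATH, so the host libcuda.so is never visible to the loader while libnvrm* still is.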

Describe alternatives you've considered

Merge the LD_FALLBACK_PATH PR (https://github.com/NixOS/nixpkgs/pull/248547) into Nixpkgs, and make nix-gl-host use that instead of LD_LIBRARY_PATH.
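On the nix-gl-host side the change might be as small as picking a different variable when building the child environment. A minimal sketch, assuming the dynamic loader honours LD_FALLBACK_PATH the way the PR proposes (a stock glibc does not); `with_search_path` is a hypothetical helper, not an existing nix-gl-host function.

```python
def with_search_path(env, lib_dir, use_fallback=False):
    """Return a copy of env with lib_dir prepended to the chosen loader variable."""
    # LD_FALLBACK_PATH only has an effect with the patched loader from the PR.
    var = "LD_FALLBACK_PATH" if use_fallback else "LD_LIBRARY_PATH"
    new_env = dict(env)
    existing = new_env.get(var)
    new_env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return new_env
```

For example, `with_search_path(os.environ, shim_dir, use_fallback=True)` would leave LD_LIBRARY_PATH untouched and only set the fallback variable.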

Additional context

SomeoneSerge, Dec 19 '23 18:12