Foldingathome ROCM GPU Support
Describe the bug
This is essentially the bug described at https://foldingforum.org/viewtopic.php?p=358294. Work units download endlessly and can never be completed because they crash.
Steps To Reproduce
Steps to reproduce the behavior:
- gpu with rocm support
- enable foldingathome and rocmPackages.clr.icd
- run foldingathome
Expected behavior
Work units are executed and completed on the GPU.
Additional context
Log: https://r2.seanbehan.ca/a083b097-4abb-482e-909f-204df82a6d81
Notify maintainers
@sergv
Metadata
[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
- system: `"x86_64-linux"`
- host os: `Linux 6.8.6, NixOS, 24.05 (Uakari), 24.05.20240417.edd8117`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.18.2`
- channels(root): `"nixos"`
- nixpkgs: `/nix/store/18p0lvi8gzlcj0nwnm6rhaqza5kg3g1g-source`
@codebam Could you please try experimenting with the extraPkgs argument of the Folding@home Nix package (defined at https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/science/misc/foldingathome/client.nix#L13C3-L13C12) to see whether explicitly adding libstdc++ there solves the problem? If it does, it could be added to the FHS packages. Sadly, I don't have an AMD GPU with OpenCL support, so I cannot test the suggestion myself.
Notifying real maintainer in the meantime @zimbatm.
Sorry, I'm not sure how to do that. What package would provide libstdc++? I asked on Matrix and was told it should already be part of stdenv.
I don't really know which package provides the C++ standard library. One possibility seems to be LLVM's llvmPackages_17.libcxx, with the default being libcxx.
If it's already in stdenv then it seems like it could be accessed via pkgs.stdenv.cc.cc.lib according to https://discourse.nixos.org/t/how-to-solve-libstdc-not-found-in-shell-nix/25458/15.
Overall, I cannot suggest a way that is known to work because I cannot test my suggestions, so you'll need to experiment.
Oh, I just realized extraPkgs isn't available. Thank you anyway for trying to help.
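For anyone who wants to try the extraPkgs idea, here is a minimal, untested sketch of what such an override might look like. It assumes the package attribute is pkgs.fahclient, that its extraPkgs argument takes a list of packages, and that the NixOS module exposes a services.foldingathome.package option. None of these is confirmed in this thread, so check them against your nixpkgs revision first.

```nix
{ pkgs, ... }:
{
  services.foldingathome = {
    enable = true;
    # Assumption: the module has a `package` option and the client package
    # is `pkgs.fahclient`, which accepts an `extraPkgs` list.
    package = pkgs.fahclient.override {
      # stdenv.cc.cc.lib is the output that ships libstdc++.so
      # (per the Discourse thread linked above).
      extraPkgs = [ pkgs.stdenv.cc.cc.lib ];
    };
  };
}
```

If the module in your nixpkgs revision has no package option, the same override could be applied through an overlay instead.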
I am encountering the same error as codebam.
Could you please try experimenting with the extraPkgs argument of the Folding@home Nix package (defined at https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/science/misc/foldingathome/client.nix#L13C3-L13C12) to see whether explicitly adding libstdc++ there solves the problem? If it does, it could be added to the FHS packages. Sadly, I don't have an AMD GPU with OpenCL support, so I cannot test the suggestion myself.
I have been unable to get this to work. I tried just adding to the package's extraPkgs. I also tried explicitly adding the library directly to each of the following inputs in pkgs/applications/science/misc/foldingathome/client.nix:
- fah-client's nativeBuildInputs
- fah-client's runtimeInputs
- fah-client's buildInputs
- main package's targetPkgs
Regardless of whether I used plain libcxx or rocmPackages.llvm.libcxx, I still got the original error:
opencl-device was set but OpenCL platform could not be found. ERROR:126: Neither CUDA nor OpenCL is available.
In case it helps, here are the relevant parts of my NixOS config:
{
  boot.initrd.kernelModules = ["amdgpu"];
  boot.kernelModules = ["kvm-amd"];
  environment.systemPackages = with pkgs; [
    radeontop
  ];
  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
    extraPackages = with pkgs; [
      amdvlk
      clinfo
      rocmPackages.clr.icd
      rocmPackages.rocminfo
      rocmPackages.rocm-runtime
    ];
    extraPackages32 = with pkgs.driversi686Linux; [
      amdvlk
    ];
    setLdLibraryPath = true;
  };
  services.foldingathome.enable = true;
  # Heterogeneous-computing Interface for Portability (HIP)
  # https://rocm.docs.amd.com/projects/HIP/en/latest/index.html
  systemd.tmpfiles.rules = let
    rocmEnv = pkgs.symlinkJoin {
      name = "rocm-combined";
      paths = with pkgs.rocmPackages; [
        clr
        hipblas
        rocblas
      ];
    };
  in [
    "L+ /opt/rcom - - - - ${rocmEnv}"
  ];
}
Can you share one of the binaries that doesn't execute? Most likely the binary ELF headers are looking for those libraries in traditional paths.
If that's the case then we have a problem. Binaries are getting downloaded by the folding client and executed directly. There isn't a good way to patchelf those (unless somebody wants to work on a source patch).
One workaround for that is to also set programs.nix-ld.enable, and then put the missing libraries in programs.nix-ld.libraries
Thanks for the tip! Unfortunately, I haven't been able to get past the error even with these changes in place. I tried a couple of combinations of packages based on my very limited understanding of ROCm, then tried putting every potentially relevant package into nix-ld.libraries. Here's my relevant config with the "just include every potential package" approach:
{
  pkgs,
  ...
}: {
  boot.initrd.kernelModules = ["amdgpu"];
  boot.kernelModules = ["kvm-amd"];
  environment.systemPackages = with pkgs; [
    radeontop
  ];
  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
    extraPackages = with pkgs; [
      amdvlk
      clinfo
      libcxx
      rocmPackages.clr.icd
      rocmPackages.hipblas
      rocmPackages.rocblas
      rocmPackages.rocm-runtime
      rocmPackages.rocminfo
      stdenv.cc.cc
    ];
    extraPackages32 = with pkgs.driversi686Linux; [
      amdvlk
    ];
    setLdLibraryPath = true;
  };
  programs.nix-ld.enable = true;
  programs.nix-ld.libraries = with pkgs; [
    amdvlk
    clinfo
    libcxx
    rocmPackages.clr
    rocmPackages.clr.icd
    rocmPackages.hipblas
    rocmPackages.rocblas
    rocmPackages.rocm-runtime
    rocmPackages.rocminfo
    stdenv.cc.cc
  ];
  services.foldingathome.enable = true;
  # Heterogeneous-computing Interface for Portability (HIP)
  # https://rocm.docs.amd.com/projects/HIP/en/latest/index.html
  systemd.tmpfiles.rules = let
    rocmEnv = pkgs.symlinkJoin {
      name = "rocm-combined";
      paths = with pkgs.rocmPackages; [
        clr
        hipblas
        rocblas
        rocm-runtime
      ];
    };
  in [
    "L+ /opt/rcom - - - - ${rocmEnv}"
  ];
}
Can you share one of the binaries that doesn't execute? Most likely the binary ELF headers are looking for those libraries in traditional paths.
I'll try to grab one sometime this weekend.
@lafrenierejm any luck with this? I see you currently have a good chunk of what you posted above in your public config.
@Joseph-DiGiovanni Unfortunately, no. I also haven't spent any time investigating this recently, so it's possible my configuration and/or the info I provided in this thread is out of date.
Is there any update on this? I've also been trying...
I can make this work with the following (taken from here), but only when I run fah-client directly. The systemd service seems to be broken, but it is also broken on a different device with an Nvidia GPU, so I think that problem is unrelated.
hardware.amdgpu.opencl.enable = true;
systemd.tmpfiles.rules = ["L+ /opt/rocm/hip - - - - ${pkgs.rocmPackages.clr}"];
environment.variables.OCL_ICD_VENDORS = "${pkgs.rocmPackages.clr.icd}/etc/OpenCL/vendors/";
Could/should this be included in hardware.amdgpu.opencl.enable?
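As a rough sketch of what that could look like (not a tested patch), the two extra settings could be applied whenever hardware.amdgpu.opencl.enable is set. This assumes the option exists in the NixOS release in use. The /opt/rocm/hip path and the OCL_ICD_VENDORS value are copied unchanged from the snippet above.

```nix
{ config, lib, pkgs, ... }:
{
  # Only apply the extra settings when AMD OpenCL support is enabled.
  config = lib.mkIf config.hardware.amdgpu.opencl.enable {
    # Expose the ROCm common language runtime at the conventional path.
    systemd.tmpfiles.rules = [
      "L+ /opt/rocm/hip - - - - ${pkgs.rocmPackages.clr}"
    ];

    # Point the OpenCL ICD loader at the ROCm vendor file.
    environment.variables.OCL_ICD_VENDORS =
      "${pkgs.rocmPackages.clr.icd}/etc/OpenCL/vendors/";
  };
}
```

Whether this belongs in the upstream module or in a separate opt-in option is a question for the maintainers, since setting OCL_ICD_VENDORS globally may affect other OpenCL implementations on the same system.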