alexanderwerning

Results 5 comments of alexanderwerning

I had the same issue: `bare_metal_version` is only defined in `setup.py` if the `CUDA_HOME` environment variable is set, but the 11.8 version check is outside of that if block:...
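
To illustrate the failure mode described above, here is a minimal sketch. It is not the actual `setup.py`: the function names, the `env` argument, and the hard-coded version tuple are all hypothetical stand-ins, chosen only to show why a check placed outside the `if` block fails when `CUDA_HOME` is unset.

```python
def check_cuda_buggy(env):
    """Hypothetical reproduction of the bug pattern, not the real setup.py code."""
    cuda_home = env.get("CUDA_HOME")
    if cuda_home is not None:
        # Only assigned when CUDA_HOME is set.
        # Placeholder value; a real script would query the installed toolkit.
        bare_metal_version = (11, 8)
    # The check sits outside the if block, so the name is unbound
    # whenever CUDA_HOME is missing and Python raises a NameError.
    return bare_metal_version >= (11, 8)


def check_cuda_guarded(env):
    """One possible fix: bail out early so the check only runs when the name exists."""
    cuda_home = env.get("CUDA_HOME")
    if cuda_home is None:
        return False
    bare_metal_version = (11, 8)  # placeholder value, as above
    return bare_metal_version >= (11, 8)
```

Calling `check_cuda_buggy({})` raises a `NameError` (an `UnboundLocalError` inside a function), while `check_cuda_guarded({})` simply returns `False`. Setting `CUDA_HOME` sidesteps the error without touching the script.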

It depends on the location and version of your CUDA installation; for me it was this: `export CUDA_HOME=/usr/local/cuda-11.8`
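
If you are unsure which path applies on your machine, a small helper like the following can list candidates. This is only a sketch under assumptions: the helper name `find_cuda_homes` is made up, and it assumes toolkits live in directories named `cuda*` under a common root such as `/usr/local` and contain `bin/nvcc`. Your installation may be laid out differently.

```python
from pathlib import Path


def find_cuda_homes(root="/usr/local"):
    """Return directories under root named cuda* that contain bin/nvcc.

    Hypothetical helper; the directory layout is an assumption.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        str(p) for p in root_path.glob("cuda*") if (p / "bin" / "nvcc").is_file()
    )


if __name__ == "__main__":
    for candidate in find_cuda_homes():
        print(f"export CUDA_HOME={candidate}")
```

Any printed line is a candidate value for `CUDA_HOME`; pick the one whose version matches what the package expects.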

I encountered a similar issue with `--ntasks-per-node`. For `devices=4`, pytorch-lightning complains that there are not enough GPUs (only one is visible per task); for `devices=1`, the mismatch between `devices` and...

Yes, I solved it by adding `export CUDA_VISIBLE_DEVICES=0,1,2,3` to the Slurm script, explicitly making all GPUs visible to every process. This is necessary since Slurm automatically sets unique `CUDA_VISIBLE_DEVICES` per...
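
The fix above lives in the Slurm script. As an alternative, the same variable can be set from Python at the very top of the training script. This is a different technique from the `export` line, and I have not verified it against pytorch-lightning on a cluster. The helper name `expose_all_gpus` is hypothetical.

```python
import os


def expose_all_gpus(num_gpus, env=None):
    """Set CUDA_VISIBLE_DEVICES to '0,1,...,num_gpus-1' and return the value.

    Hypothetical helper. env defaults to os.environ; pass a dict to try it
    without touching the real environment.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    env = os.environ if env is None else env
    value = ",".join(str(i) for i in range(num_gpus))
    # Overwrites any per-task value that the scheduler may have set.
    env["CUDA_VISIBLE_DEVICES"] = value
    return value
```

For this to have any effect, `expose_all_gpus(4)` would need to run before anything initializes CUDA, so in practice before importing or using the GPU framework. The `export` in the Slurm script avoids that ordering concern.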