
ImportError: torch_extensions/fused/fused.so: undefined symbol:

Open drscotthawley opened this issue 1 year ago • 1 comments

Dear Allen, I know it's been a few years so your code is not being actively maintained. Nevertheless, I still want to train the model on my own data!

After processing my dataset, I try to run the training script, and get the error:

ImportError: /home/myusername/.cache/torch_extensions/fused/fused.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv

This seems to be a StyleGAN2 issue that other users have run into as well, but I'm not sure how to resolve it. At first the conda environment wasn't setting CUDA_HOME properly, so I installed cudatoolkit-dev, but that is only available for CUDA 10.1, and you were using CUDA 10.2. So I tried downgrading pytorch to a CUDA 10.1 build as well, but that did not get rid of the error.
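One thing that may be worth ruling out after switching pytorch builds is a stale compiled extension left in the cache directory named in the error. The sketch below only assumes that the JIT-compiled `.so` files live under `~/.cache/torch_extensions` (or under the directory named by the `TORCH_EXTENSIONS_DIR` environment variable, if set). The helper names `find_cached_extensions` and `clear_cached_extensions` are hypothetical and not part of pytorch or this repository. Clearing the cache forces a rebuild against the currently installed pytorch; it is not guaranteed to fix the undefined symbol.

```python
import os
import shutil
from pathlib import Path
from typing import List


def find_cached_extensions(cache_root: Path) -> List[Path]:
    """Return every directory under cache_root that holds a compiled .so file."""
    if not cache_root.is_dir():
        return []
    return sorted({so_file.parent for so_file in cache_root.rglob("*.so")})


def clear_cached_extensions(cache_root: Path, dry_run: bool = True) -> List[Path]:
    """List the cached build directories and delete them unless dry_run is True."""
    found = find_cached_extensions(cache_root)
    if not dry_run:
        for build_dir in found:
            # ignore_errors covers nested build dirs that were already removed
            shutil.rmtree(build_dir, ignore_errors=True)
    return found


if __name__ == "__main__":
    # Assumed default location, taken from the path in the error message.
    default_root = Path.home() / ".cache" / "torch_extensions"
    root = Path(os.environ.get("TORCH_EXTENSIONS_DIR", str(default_root)))
    # Dry run only: prints what would be removed without deleting anything.
    for build_dir in clear_cached_extensions(root, dry_run=True):
        print("cached build:", build_dir)
```

Calling `clear_cached_extensions(root, dry_run=False)` removes the listed directories, after which the next training run should recompile `fused.so` from scratch.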

In case you have any interest in seeing if your code still runs, please let me know if you have any suggestions! Thanks.

drscotthawley avatar Dec 07 '23 03:12 drscotthawley

Ahh, apparently, unlike the newer pytorch installations via pip, the older conda installations of pytorch did not install all of the CUDA components you needed; they expected you to already have a (system-wide) CUDA installation.
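To check whether a system-wide CUDA toolkit is visible at all, a small stdlib-only probe can help. This is a minimal sketch and not pytorch's own lookup code: the function name `guess_cuda_home` is hypothetical, and the search order (the `CUDA_HOME` or `CUDA_PATH` environment variables, then the location of `nvcc` on `PATH`, then `/usr/local/cuda`) is an assumption about where a toolkit is conventionally found.

```python
import os
import shutil
from typing import Optional


def guess_cuda_home(default: str = "/usr/local/cuda") -> Optional[str]:
    """Best-effort guess at the CUDA toolkit root, or None if nothing is found."""
    # 1. Explicit environment variables take priority.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = os.environ.get(var)
        if value:
            return value
    # 2. Otherwise infer the root from nvcc, which lives in <root>/bin/nvcc.
    nvcc = shutil.which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
    # 3. Fall back to the conventional install prefix if it exists.
    return default if os.path.isdir(default) else None


if __name__ == "__main__":
    print("CUDA toolkit root:", guess_cuda_home())
```

If this prints `None`, no toolkit (and so no `nvcc`) was found in those places, which would be consistent with the conda environment not providing the compiler needed to build the extension.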

Unfortunately, CUDA 10.2 is not available for my OS anymore: I am running Ubuntu 22.04, and CUDA 10.2 support only goes up to Ubuntu 18.04.

I'm wary of trying this with a newer version of pytorch or with a CUDA installation intended for an older OS, since a lot has probably changed in either case. I'd welcome any suggestions!

drscotthawley avatar Dec 07 '23 03:12 drscotthawley