NVTX missing using SE3nv.yml -- Pytorch 2.0 solution
Device
OS: CentOS Linux 7
GPU: GTX 1080
Issue
Hi! I get the following error running any of the example scripts:
```
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?
```
When using the current SE3nv.yml I get the following versions:
```
pytorch      1.9.1   cpu_py39hc5866cc_3  conda-forge
torchaudio   0.9.1   py39                pytorch
torchvision  0.14.1  cpu_py39h39206e8_1  conda-forge
```
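The `cpu_py39` build string is the giveaway: the NVTX bindings only ship in CUDA builds of PyTorch. A quick way to confirm that this is the cause (my own check, not part of the original report):

```python
import torch

print(torch.__version__)          # the conda CPU build carries no CUDA tag
print(torch.version.cuda)         # None on a CPU-only build
print(torch.cuda.is_available())  # False without CUDA support

# On a CPU-only build, any NVTX call raises the same error RFdiffusion hits:
# "RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?"
torch.cuda.nvtx.range_push("smoke-test")
torch.cuda.nvtx.range_pop()
```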
Solution
I did a clean install, running:
```
pip3 install --force-reinstall torch torchvision torchaudio
```
which now gives:
```
torch        2.0.0   pypi_0  pypi
torchaudio   2.0.1   pypi_0  pypi
torchvision  0.15.1  pypi_0  pypi
```
That seems to run every example without an issue. I've run into issues before with conda installs of PyTorch when not using the most recent version. Is there a known issue keeping RFdiffusion from moving to PyTorch 2.0?
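For anyone following along: after the force-reinstall, it's worth sanity-checking that pip actually pulled a CUDA wheel before rerunning the examples. A minimal check, assuming the default PyPI wheel (which for torch 2.0.0 bundles CUDA 11.7):

```python
import torch

print(torch.__version__)   # e.g. "2.0.0+cu117" for the default PyPI wheel
print(torch.version.cuda)  # e.g. "11.7"
assert torch.cuda.is_available(), "still a CPU-only build"
```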
I also ran across this issue, and your solution seems to work (at least on the examples I've tried). Thanks!
I got it too, and for me this worked (with either conda or mamba):
```
conda update --all -c pytorch
```
OS: Fedora 36
GPU: GTX 1080 Ti
I have CUDA 11.8, but your solution worked after I modified the SE3nv.yml to have:
```yaml
- cudatoolkit=11.7
- dgl-cuda11.7
```
Note, I had to go one version lower than my installed toolkit because there is currently no dgl-cuda11.8 package.
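For anyone else making the same edit, the dependencies section of SE3nv.yml ends up looking roughly like this. The entries around the two changed lines are assumptions based on my recollection of the stock file, so double-check against your copy:

```yaml
channels:
  - pytorch
  - dgl
  - conda-forge
dependencies:
  - python=3.9
  - cudatoolkit=11.7   # changed from the stock pin
  - dgl-cuda11.7       # changed; there is no dgl-cuda11.8 yet
```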
The solution worked for me too, on an RTX 4090 under Ubuntu 22.04:
```
$ pip3 install --force-reinstall torch torchvision torchaudio
```
I was getting a slightly different error, though:
```
Traceback (most recent call last):
  File "/big18TB/apps/RF/RFdiffusion/./scripts/run_inference.py", line 94, in main
    px0, x_t, seq_t, plddt = sampler.sample_step(
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/inference/model_runners.py", line 664, in sample_step
    msa_prev, pair_prev, px0, state_prev, alpha, logits, plddt = self.model(msa_masked,
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/RoseTTAFoldModel.py", line 102, in forward
    msa, pair, R, T, alpha_s, state = self.simulator(seq, msa_latent, msa_full, pair, xyz[:,:,:3],
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/Track_module.py", line 420, in forward
    msa_full, pair, R_in, T_in, state, alpha = self.extra_block[i_m](msa_full,
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/Track_module.py", line 332, in forward
    R, T, state, alpha = self.str2str(msa, pair, R_in, T_in, xyz, state, idx, motif_mask=motif_mask, top_k=0)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/cuda/amp/autocast_mode.py", line 141, in decorate_autocast
    return func(*args, **kwargs)
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/Track_module.py", line 266, in forward
    shift = self.se3(G, node.reshape(BL, -1, 1), l1_feats, edge_feats)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/big18TB/apps/RF/RFdiffusion/rfdiffusion/SE3_network.py", line 83, in forward
    return self.se3(G, node_features, edge_features)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/transformer.py", line 140, in forward
    basis = basis or get_basis(graph.edata['rel_pos'], max_degree=self.max_degree, compute_gradients=False,
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 167, in get_basis
    spherical_harmonics = get_spherical_harmonics(relative_pos, max_degree)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 58, in get_spherical_harmonics
    sh = o3.spherical_harmonics(all_degrees, relative_pos, normalize=True)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 180, in spherical_harmonics
    return sh(x)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bulat/anaconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 82, in forward
    sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)

nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)

template<typename T>
__device__ T maximum(T a, T b) {
  return isnan(a) ? a : (a > b ? a : b);
}

template<typename T>
__device__ T minimum(T a, T b) {
  return isnan(a) ? a : (a < b ? a : b);
}

extern "C" __global__
void fused_pow_pow_pow_su_9196483836509741110(float* tz_1, float* ty_1, float* tx_1, float* aten_mul, float* aten_mul_1, float* aten_mul_2, float* aten_sub, float* aten_add, float* aten_mul_3, float* aten_pow) {
  if (512 * blockIdx.x + threadIdx.x<22350 ? 1 : 0) {
    float ty_1_1 = __ldg(ty_1 + 3 * (512 * blockIdx.x + threadIdx.x));
    aten_pow[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1;
    float tz_1_1 = __ldg(tz_1 + 3 * (512 * blockIdx.x + threadIdx.x));
    float tx_1_1 = __ldg(tx_1 + 3 * (512 * blockIdx.x + threadIdx.x));
    aten_mul_3[512 * blockIdx.x + threadIdx.x] = (float)((double)(tz_1_1 * tz_1_1 - tx_1_1 * tx_1_1) * 0.8660254037844386);
    aten_add[512 * blockIdx.x + threadIdx.x] = tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1;
    aten_sub[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1 - (float)((double)(tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1) * 0.5);
    aten_mul_2[512 * blockIdx.x + threadIdx.x] = (float)((double)(ty_1_1) * 1.732050807568877) * tz_1_1;
    aten_mul_1[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * ty_1_1;
    aten_mul[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * tz_1_1;
  }
}
```
I also came across this problem. In my case it was because the installed PyTorch was the CPU-only build, so running the following solved it:
```
conda install -c pytorch pytorch
```
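A hedged note for other conda users: the solver sometimes resolves to the CPU variant unless the CUDA toolkit is pinned explicitly. A command along these lines (the version pins are illustrative; check pytorch.org for the combination matching your driver) forces a CUDA build:

```
conda install -c pytorch -c conda-forge pytorch=1.9.1 cudatoolkit=11.1
```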