RFdiffusion
[WSL2] nvrtc compilation failed
I'm trying to use the tool in WSL2 with my RTX 4090. The Windows version doesn't work (see issue #13).
Everything loads fine, but then I get this error:
(SE3nv) pavel@Gigabyte-PC:~/RFdiffusion$ python scripts/run_inference.py inference.model_directory_path=/mnt/d/Models/RFdiffusion 'contigmap.contigs=[150-150]' inference.output_prefix=test_outputs/test inference.num_designs=10
Reading models from /mnt/d/Models/RFdiffusion
[2023-04-06 15:59:06,421][rfdiffusion.inference.model_runners][INFO] - Reading checkpoint from /mnt/d/Models/RFdiffusion/Base_ckpt.pt
This is inf_conf.ckpt_path
/mnt/d/Models/RFdiffusion/Base_ckpt.pt
Assembling -model, -diffuser and -preprocess configs from checkpoint
USING MODEL CONFIG: self._conf[model][n_extra_block] = 4
USING MODEL CONFIG: self._conf[model][n_main_block] = 32
USING MODEL CONFIG: self._conf[model][n_ref_block] = 4
USING MODEL CONFIG: self._conf[model][d_msa] = 256
USING MODEL CONFIG: self._conf[model][d_msa_full] = 64
USING MODEL CONFIG: self._conf[model][d_pair] = 128
USING MODEL CONFIG: self._conf[model][d_templ] = 64
USING MODEL CONFIG: self._conf[model][n_head_msa] = 8
USING MODEL CONFIG: self._conf[model][n_head_pair] = 4
USING MODEL CONFIG: self._conf[model][n_head_templ] = 4
USING MODEL CONFIG: self._conf[model][d_hidden] = 32
USING MODEL CONFIG: self._conf[model][d_hidden_templ] = 32
USING MODEL CONFIG: self._conf[model][p_drop] = 0.15
USING MODEL CONFIG: self._conf[model][SE3_param_full] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 8, 'l0_out_features': 8, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 32}
USING MODEL CONFIG: self._conf[model][SE3_param_topk] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 64, 'l0_out_features': 64, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 64}
USING MODEL CONFIG: self._conf[model][d_time_emb] = 0
USING MODEL CONFIG: self._conf[model][d_time_emb_proj] = 10
USING MODEL CONFIG: self._conf[model][freeze_track_motif] = False
USING MODEL CONFIG: self._conf[model][use_motif_timestep] = True
USING MODEL CONFIG: self._conf[diffuser][T] = 50
USING MODEL CONFIG: self._conf[diffuser][b_0] = 0.01
USING MODEL CONFIG: self._conf[diffuser][b_T] = 0.07
USING MODEL CONFIG: self._conf[diffuser][schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][so3_type] = igso3
USING MODEL CONFIG: self._conf[diffuser][crd_scale] = 0.25
USING MODEL CONFIG: self._conf[diffuser][so3_schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][min_b] = 1.5
USING MODEL CONFIG: self._conf[diffuser][max_b] = 2.5
USING MODEL CONFIG: self._conf[diffuser][min_sigma] = 0.02
USING MODEL CONFIG: self._conf[diffuser][max_sigma] = 1.5
USING MODEL CONFIG: self._conf[preprocess][sidechain_input] = False
USING MODEL CONFIG: self._conf[preprocess][motif_sidechain_input] = True
USING MODEL CONFIG: self._conf[preprocess][d_t1d] = 22
USING MODEL CONFIG: self._conf[preprocess][d_t2d] = 44
USING MODEL CONFIG: self._conf[preprocess][prob_self_cond] = 0.5
USING MODEL CONFIG: self._conf[preprocess][str_self_cond] = True
USING MODEL CONFIG: self._conf[preprocess][predict_previous] = False
[2023-04-06 15:59:10,778][rfdiffusion.inference.model_runners][INFO] - Loading checkpoint.
[2023-04-06 15:59:13,459][rfdiffusion.diffusion][INFO] - Calculating IGSO3.
Successful diffuser __init__
[2023-04-06 15:59:17,256][__main__][INFO] - Making design test_outputs/test_0
[2023-04-06 15:59:17,260][rfdiffusion.inference.model_runners][INFO] - Using contig: ['150-150']
With this beta schedule (linear schedule, beta_0 = 0.04, beta_T = 0.28), alpha_bar_T = 0.00013696048699785024
[2023-04-06 15:59:17,271][rfdiffusion.inference.model_runners][INFO] - Sequence init: ------------------------------------------------------------------------------------------------------------------------------------------------------
Error executing job with overrides: ['inference.model_directory_path=/mnt/d/Models/RFdiffusion', 'contigmap.contigs=[150-150]', 'inference.output_prefix=test_outputs/test', 'inference.num_designs=10']
Traceback (most recent call last):
File "/home/pavel/RFdiffusion/scripts/run_inference.py", line 85, in main
px0, x_t, seq_t, plddt = sampler.sample_step(
File "/home/pavel/RFdiffusion/rfdiffusion/inference/model_runners.py", line 665, in sample_step
msa_prev, pair_prev, px0, state_prev, alpha, logits, plddt = self.model(msa_masked,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/RoseTTAFoldModel.py", line 114, in forward
msa, pair, R, T, alpha_s, state = self.simulator(seq, msa_latent, msa_full, pair, xyz[:,:,:3],
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 420, in forward
msa_full, pair, R_in, T_in, state, alpha = self.extra_block[i_m](msa_full,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 332, in forward
R, T, state, alpha = self.str2str(msa, pair, R_in, T_in, xyz, state, idx, motif_mask=motif_mask, top_k=0)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/cuda/amp/autocast_mode.py", line 141, in decorate_autocast
return func(*args, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 266, in forward
shift = self.se3(G, node.reshape(B*L, -1, 1), l1_feats, edge_feats)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/SE3_network.py", line 83, in forward
return self.se3(G, node_features, edge_features)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/transformer.py", line 140, in forward
basis = basis or get_basis(graph.edata['rel_pos'], max_degree=self.max_degree, compute_gradients=False,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 167, in get_basis
spherical_harmonics = get_spherical_harmonics(relative_pos, max_degree)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 58, in get_spherical_harmonics
sh = o3.spherical_harmonics(all_degrees, relative_pos, normalize=True)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 180, in spherical_harmonics
return sh(x)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 82, in forward
sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)
nvrtc compilation failed:
#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)
template<typename T>
__device__ T maximum(T a, T b) {
return isnan(a) ? a : (a > b ? a : b);
}
template<typename T>
__device__ T minimum(T a, T b) {
return isnan(a) ? a : (a < b ? a : b);
}
extern "C" __global__
void fused_pow_pow_pow_su_9196483836509741110(float* tz_1, float* ty_1, float* tx_1, float* aten_mul, float* aten_mul_1, float* aten_mul_2, float* aten_sub, float* aten_add, float* aten_mul_3, float* aten_pow) {
{
if (512 * blockIdx.x + threadIdx.x<22350 ? 1 : 0) {
float ty_1_1 = __ldg(ty_1 + 3 * (512 * blockIdx.x + threadIdx.x));
aten_pow[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1;
float tz_1_1 = __ldg(tz_1 + 3 * (512 * blockIdx.x + threadIdx.x));
float tx_1_1 = __ldg(tx_1 + 3 * (512 * blockIdx.x + threadIdx.x));
aten_mul_3[512 * blockIdx.x + threadIdx.x] = (float)((double)(tz_1_1 * tz_1_1 - tx_1_1 * tx_1_1) * 0.8660254037844386);
aten_add[512 * blockIdx.x + threadIdx.x] = tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1;
aten_sub[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1 - (float)((double)(tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1) * 0.5);
aten_mul_2[512 * blockIdx.x + threadIdx.x] = (float)((double)(ty_1_1) * 1.732050807568877) * tz_1_1;
aten_mul_1[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * ty_1_1;
aten_mul[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * tz_1_1;
}
}
}
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
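The RTX 4090 is compute capability 8.9 (sm_89), so the "invalid value for --gpu-architecture" message most likely means the CUDA runtime that PyTorch's JIT fuser is using is too old to know that architecture. A quick sanity check (a minimal diagnostic sketch, not part of RFdiffusion) is to compare the GPU's capability against the architectures the installed PyTorch build supports:

```python
import torch

# Which CUDA version this PyTorch build was compiled against
print(torch.__version__, torch.version.cuda)

# RTX 4090 reports compute capability (8, 9), i.e. sm_89
print(torch.cuda.get_device_capability(0))

# Architectures this PyTorch build can generate code for; if sm_89
# (or at least sm_80/sm_86) is missing, the JIT-compiled e3nn
# spherical-harmonics kernel fails exactly as in the traceback above
print(torch.cuda.get_arch_list())
```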
Same problem here, did you fix it?
Fixed this problem by updating PyTorch with cudatoolkit to 11.8.
Thanks! I've fixed the package for the Windows environment.
> fixed this problem by updating pytorch with cudatoolkit to 11.8

Hey, I just ran into the same problem. Did you mean that I should find a PyTorch version that matches cudatoolkit 11.8 and update PyTorch to that version? I found that PyTorch 2.0.0 works with CUDA 11.8; should I update it like this?
# CUDA 11.8
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
Thanks! I uninstalled cudatoolkit and cudnn (conda uninstall cudatoolkit cudnn), installed cudatoolkit 11.8 and cudnn (conda install cudatoolkit cudnn), and finally ran conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia. That solved the problem.
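For reference, after reinstalling you can exercise the failing call directly to confirm the nvrtc error is gone (a small sketch assuming e3nn is importable in the SE3nv environment; it JIT-compiles the same spherical-harmonics kernel that crashed in the traceback):

```python
import torch
from e3nn import o3

# A few random relative-position vectors on the GPU, like the ones
# the SE(3)-Transformer basis computation feeds to e3nn
rel_pos = torch.randn(8, 3, device="cuda")

# Triggers the same JIT/nvrtc compilation path that failed above;
# with a CUDA 11.8-compatible build this should print torch.Size([8, 9])
sh = o3.spherical_harmonics([0, 1, 2], rel_pos, normalize=True)
print(sh.shape)
```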