RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
Hi community, while running the macrocyclic example, I got this error:
Error executing job with overrides: ['inference.output_prefix=my_files/out_macrocycle/macrocyclic_test', 'contigmap.contigs=[10-18]', 'inference.cyclic=True', "inference.cyc_chains='a'", 'inference.num_designs=5', 'diffuser.T=50']
Traceback (most recent call last):
File "D:\Peptide Design\RFdiffusion\scripts\run_inference.py", line 94, in main
px0, x_t, seq_t, plddt = sampler.sample_step(
File "d:\peptide design\rfdiffusion\rfdiffusion\inference\model_runners.py", line 686, in sample_step
msa_prev, pair_prev, px0, state_prev, alpha, logits, plddt = self.model(msa_masked,
File "D:\miniconda3\envs\SE3nv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "d:\peptide design\rfdiffusion\rfdiffusion\RoseTTAFoldModel.py", line 77, in forward
msa_latent, pair, state = self.latent_emb(msa_latent, seq, idx, cyclic_reses)
File "D:\miniconda3\envs\SE3nv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "d:\peptide design\rfdiffusion\rfdiffusion\Embeddings.py", line 96, in forward
msa = self.emb(msa) # (B, N, L, d_model) # MSA embedding
File "D:\miniconda3\envs\SE3nv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "D:\miniconda3\envs\SE3nv\lib\site-packages\torch\nn\modules\linear.py", line 96, in forward
return F.linear(input, self.weight, self.bias)
File "D:\miniconda3\envs\SE3nv\lib\site-packages\torch\nn\functional.py", line 1847, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
As far as I can tell, nobody has reported this so far. Please help me solve this problem.
Based on your error message, this appears to be a CUDA issue. Does it happen only with the design_macrocyclic_binder.sh and design_macrocyclic_monomer.sh examples, or with every attempt at running RFdiffusion's inference script?
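A quick way to tell a general CUDA/cuBLAS problem apart from something specific to one example is to run a tiny matrix multiply on the GPU inside the same conda environment. The sketch below is only a sanity check under assumptions: `check_cublas` is a hypothetical helper name, not part of RFdiffusion, and a small matmul on a CUDA tensor should exercise cuBLAS in a typical PyTorch build.

```python
def check_cublas():
    """Return a short status string describing whether a GPU matmul works.

    Hypothetical helper for illustration; not part of RFdiffusion.
    """
    try:
        import torch  # imported lazily so the check degrades gracefully
    except ImportError:
        return "torch-missing"

    if not torch.cuda.is_available():
        return "no-cuda"

    try:
        a = torch.randn(8, 8, device="cuda")
        b = torch.randn(8, 8, device="cuda")
        # .item() copies the result to the host, which forces a sync so
        # any pending CUDA error is raised here rather than later.
        (a @ b).sum().item()
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(check_cublas())
```

If this prints `ok` but the macrocyclic examples still fail, the problem is more likely tied to that specific run (for example memory use or an earlier error on the device) than to the CUDA installation as a whole.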
Thank you for getting back to me. I'm not sure why, but when I ran the unconditional monomer example, nothing abnormal happened. The problem occurred again when I reran the macrocycle example.
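Since the unconditional run works, it may help to rerun only the failing case with more diagnostics. The cause is not established in this thread, but `CUBLAS_STATUS_NOT_INITIALIZED` is sometimes reported when an earlier asynchronous CUDA error or a memory shortage only surfaces at the first cuBLAS call. The sketch below builds the command and environment for such a rerun. `build_debug_run` is a hypothetical helper, the script path assumes the command is launched from the RFdiffusion repository root, and the overrides are copied from the error message above.

```python
import os
import sys


def build_debug_run(overrides, script="scripts/run_inference.py"):
    """Build (cmd, env) for a rerun with fuller error reporting.

    Hypothetical helper; the script path assumes the current directory
    is the RFdiffusion repository root.
    """
    env = dict(os.environ)
    env["HYDRA_FULL_ERROR"] = "1"      # full stack trace, as the log suggests
    env["CUDA_LAUNCH_BLOCKING"] = "1"  # run CUDA kernels synchronously
    cmd = [sys.executable, script, *overrides]
    return cmd, env


# Overrides taken from the failing job reported above.
overrides = [
    "inference.output_prefix=my_files/out_macrocycle/macrocyclic_test",
    "contigmap.contigs=[10-18]",
    "inference.cyclic=True",
    "inference.cyc_chains='a'",
    "inference.num_designs=5",
    "diffuser.T=50",
]

cmd, env = build_debug_run(overrides)
# To actually launch the job:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

With synchronous launches the traceback should point at the operation that actually failed, which may be earlier than the `F.linear` call shown in the original trace.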