alphafold
CUDA error when running AlphaFold version >= 2.2.4
I tried to run the latest AlphaFold on a new machine with an RTX 4090 GPU, CUDA 11.8 (downgraded from 12.2), and Ubuntu 22.04 LTS. I used Anaconda3 to build the AlphaFold environment. However, every AlphaFold version >= 2.2.4 shows the same error.
I was wondering if anyone could help me solve this? Please let me know if you need anything else. Thank you so much!
I've tried the method in #646, but it didn't work.
NVIDIA info:

```
(alphafold2.3.1) soft@GPU1:/soft/alphafold-2.3.1$ nvidia-smi
Wed May 17 16:49:49 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA Graphics...  On   | 00000000:01:00.0 Off |                  Off |
| 30%   42C    P8    24W / 350W |      1MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA Graphics...  On   | 00000000:08:00.0 Off |                  Off |
| 30%   41C    P8    19W / 350W |      1MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
Error message:

```
(alphafold2.3.1) soft@GPU1:/soft/alphafold-2.3.1$ python docker/run_docker.py --fasta_paths=/home/soft/Documents/8GZ6.fasta --max_template_date=3000-01-01 --data_dir=/soft/AF2/download/ --output_dir=/home/soft/Documents/8GZ6/
I0517 16:41:16.903001 140650292352064 run_docker.py:113] Mounting /home/soft/Documents -> /mnt/fasta_path_0
I0517 16:41:16.903073 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/uniref90 -> /mnt/uniref90_database_path
I0517 16:41:16.903110 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/mgnify -> /mnt/mgnify_database_path
I0517 16:41:16.903137 140650292352064 run_docker.py:113] Mounting /soft/AF2/download -> /mnt/data_dir
I0517 16:41:16.903162 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir
I0517 16:41:16.903189 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb_mmcif -> /mnt/obsolete_pdbs_path
I0517 16:41:16.903218 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb70 -> /mnt/pdb70_database_path
I0517 16:41:16.903246 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/uniref30 -> /mnt/uniref30_database_path
I0517 16:41:16.903274 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/bfd -> /mnt/bfd_database_path
I0517 16:41:18.367146 140650292352064 run_docker.py:255] I0517 08:41:18.366440 139962212468544 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I0517 16:41:18.513591 140650292352064 run_docker.py:255] I0517 08:41:18.513257 139962212468544 xla_bridge.py:353] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0517 16:41:18.642552 140650292352064 run_docker.py:255] I0517 08:41:18.642126 139962212468544 xla_bridge.py:353] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
I0517 16:41:18.642671 140650292352064 run_docker.py:255] I0517 08:41:18.642353 139962212468544 xla_bridge.py:353] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
I0517 16:41:18.642703 140650292352064 run_docker.py:255] I0517 08:41:18.642385 139962212468544 xla_bridge.py:353] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
I0517 16:41:20.699224 140650292352064 run_docker.py:255] I0517 08:41:20.698850 139962212468544 run_alphafold.py:386] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0517 16:41:20.699338 140650292352064 run_docker.py:255] I0517 08:41:20.698937 139962212468544 run_alphafold.py:403] Using random seed 979532966947835319 for the data pipeline
I0517 16:41:20.699380 140650292352064 run_docker.py:255] I0517 08:41:20.699029 139962212468544 run_alphafold.py:161] Predicting 8GZ6
I0517 16:41:20.699409 140650292352064 run_docker.py:255] I0517 08:41:20.699223 139962212468544 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpxzhf05lq/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/8GZ6.fasta /mnt/uniref90_database_path/uniref90.fasta"
I0517 16:41:20.758846 140650292352064 run_docker.py:255] I0517 08:41:20.758264 139962212468544 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0517 16:43:32.400591 140650292352064 run_docker.py:255] I0517 08:43:32.399554 139962212468544 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 131.641 seconds
I0517 16:43:32.550053 140650292352064 run_docker.py:255] I0517 08:43:32.549127 139962212468544 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp05y00c2a/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/8GZ6.fasta /mnt/mgnify_database_path/mgy_clusters_2022_05.fa"
I0517 16:43:32.603918 140650292352064 run_docker.py:255] I0517 08:43:32.603443 139962212468544 utils.py:36] Started Jackhmmer (mgy_clusters_2022_05.fa) query
I0517 16:46:37.409147 140650292352064 run_docker.py:255] I0517 08:46:37.407801 139962212468544 utils.py:40] Finished Jackhmmer (mgy_clusters_2022_05.fa) query in 184.804 seconds
I0517 16:46:37.951393 140650292352064 run_docker.py:255] I0517 08:46:37.950919 139962212468544 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpp63aisip/query.a3m -o /tmp/tmpp63aisip/output.hhr -maxseq 1000000 -d /mnt/pdb70_database_path/pdb70"
I0517 16:46:38.000887 140650292352064 run_docker.py:255] I0517 08:46:38.000401 139962212468544 utils.py:36] Started HHsearch query
I0517 16:46:49.137153 140650292352064 run_docker.py:255] I0517 08:46:49.136655 139962212468544 utils.py:40] Finished HHsearch query in 11.136 seconds
I0517 16:46:49.446268 140650292352064 run_docker.py:255] I0517 08:46:49.445886 139962212468544 hhblits.py:128] Launching subprocess "/usr/bin/hhblits -i /mnt/fasta_path_0/8GZ6.fasta -cpu 4 -oa3m /tmp/tmpzq5150hf/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /mnt/bfd_database_path/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /mnt/uniref30_database_path/UniRef30_2021_03"
I0517 16:46:49.497454 140650292352064 run_docker.py:255] I0517 08:46:49.497032 139962212468544 utils.py:36] Started HHblits query
I0517 16:48:52.683240 140650292352064 run_docker.py:255] I0517 08:48:52.682735 139962212468544 utils.py:40] Finished HHblits query in 123.186 seconds
I0517 16:48:52.695582 140650292352064 run_docker.py:255] I0517 08:48:52.695324 139962212468544 templates.py:878] Searching for template for: QVQLQESGGGLVQAGGSLRLSCAASGRTSSVYNMAWFRQTPGKEREFVAAITGNGGTTLYADSVKGRLTISRGNAKNTVSLQMNVLKPDDTAVYYCAAGGWGKERNYAYWGQGTQVTVSSHHHHHH
I0517 16:48:54.830226 140650292352064 run_docker.py:255] I0517 08:48:54.829735 139962212468544 templates.py:267] Found an exact template match 6qd6_C.
I0517 16:48:54.838274 140650292352064 run_docker.py:255] I0517 08:48:54.837990 139962212468544 templates.py:267] Found an exact template match 6qd6_G.
I0517 16:48:54.930829 140650292352064 run_docker.py:255] I0517 08:48:54.930498 139962212468544 templates.py:267] Found an exact template match 5wts_A.
I0517 16:48:55.026873 140650292352064 run_docker.py:255] I0517 08:48:55.026528 139962212468544 templates.py:267] Found an exact template match 6gjs_B.
I0517 16:48:55.637419 140650292352064 run_docker.py:255] I0517 08:48:55.636942 139962212468544 templates.py:267] Found an exact template match 6gkd_B.
I0517 16:48:55.895157 140650292352064 run_docker.py:255] I0517 08:48:55.894784 139962212468544 templates.py:267] Found an exact template match 6hd8_A.
I0517 16:48:56.275886 140650292352064 run_docker.py:255] I0517 08:48:56.275430 139962212468544 templates.py:267] Found an exact template match 6hd9_A.
I0517 16:48:56.319466 140650292352064 run_docker.py:255] I0517 08:48:56.319131 139962212468544 templates.py:267] Found an exact template match 6rul_A.
I0517 16:48:56.451577 140650292352064 run_docker.py:255] I0517 08:48:56.451225 139962212468544 templates.py:267] Found an exact template match 4pfe_A.
I0517 16:48:56.701474 140650292352064 run_docker.py:255] I0517 08:48:56.701014 139962212468544 templates.py:267] Found an exact template match 3sn6_N.
I0517 16:48:56.990272 140650292352064 run_docker.py:255] I0517 08:48:56.989793 139962212468544 templates.py:267] Found an exact template match 6pb1_N.
I0517 16:48:57.033105 140650292352064 run_docker.py:255] I0517 08:48:57.032745 139962212468544 templates.py:267] Found an exact template match 6rum_A.
I0517 16:48:57.086215 140650292352064 run_docker.py:255] I0517 08:48:57.085903 139962212468544 templates.py:267] Found an exact template match 5wb1_A.
I0517 16:48:57.110778 140650292352064 run_docker.py:255] I0517 08:48:57.110478 139962212468544 templates.py:267] Found an exact template match 5vm6_A.
I0517 16:48:57.186338 140650292352064 run_docker.py:255] I0517 08:48:57.186022 139962212468544 templates.py:267] Found an exact template match 5foj_A.
I0517 16:48:57.237214 140650292352064 run_docker.py:255] I0517 08:48:57.236900 139962212468544 templates.py:267] Found an exact template match 5m2w_A.
I0517 16:48:57.284224 140650292352064 run_docker.py:255] I0517 08:48:57.283923 139962212468544 templates.py:267] Found an exact template match 5mje_B.
I0517 16:48:57.479903 140650292352064 run_docker.py:255] I0517 08:48:57.479458 139962212468544 templates.py:267] Found an exact template match 5vm4_L.
I0517 16:48:57.845450 140650292352064 run_docker.py:255] I0517 08:48:57.844991 139962212468544 templates.py:267] Found an exact template match 4cdg_D.
I0517 16:48:57.887320 140650292352064 run_docker.py:255] I0517 08:48:57.887031 139962212468544 templates.py:267] Found an exact template match 4gft_B.
I0517 16:48:57.988736 140650292352064 run_docker.py:255] I0517 08:48:57.988251 139962212468544 pipeline.py:234] Uniref90 MSA size: 10000 sequences.
I0517 16:48:57.988856 140650292352064 run_docker.py:255] I0517 08:48:57.988335 139962212468544 pipeline.py:235] BFD MSA size: 1612 sequences.
I0517 16:48:57.988883 140650292352064 run_docker.py:255] I0517 08:48:57.988350 139962212468544 pipeline.py:236] MGnify MSA size: 501 sequences.
I0517 16:48:57.988906 140650292352064 run_docker.py:255] I0517 08:48:57.988364 139962212468544 pipeline.py:237] Final (deduplicated) MSA size: 12020 sequences.
I0517 16:48:57.988928 140650292352064 run_docker.py:255] I0517 08:48:57.988502 139962212468544 pipeline.py:239] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 20.
I0517 16:48:58.114603 140650292352064 run_docker.py:255] I0517 08:48:58.113753 139962212468544 run_alphafold.py:191] Running model model_1_pred_0 on 8GZ6
I0517 16:48:59.508914 140650292352064 run_docker.py:255] I0517 08:48:59.508350 139962212468544 model.py:165] Running predict with shape(feat) = {'aatype': (4, 126), 'residue_index': (4, 126), 'seq_length': (4,), 'template_aatype': (4, 4, 126), 'template_all_atom_masks': (4, 4, 126, 37), 'template_all_atom_positions': (4, 4, 126, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 126), 'msa_mask': (4, 508, 126), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 126, 3), 'template_pseudo_beta_mask': (4, 4, 126), 'atom14_atom_exists': (4, 126, 14), 'residx_atom14_to_atom37': (4, 126, 14), 'residx_atom37_to_atom14': (4, 126, 37), 'atom37_atom_exists': (4, 126, 37), 'extra_msa': (4, 5120, 126), 'extra_msa_mask': (4, 5120, 126), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 126), 'true_msa': (4, 508, 126), 'extra_has_deletion': (4, 5120, 126), 'extra_deletion_value': (4, 5120, 126), 'msa_feat': (4, 508, 126, 49), 'target_feat': (4, 126, 22)}
I0517 16:48:59.600997 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.600504: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:231] Falling back to the CUDA driver for PTX compilation; ptxas does not support CC 8.9
I0517 16:48:59.601204 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.600544: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:234] Used ptxas at ptxas
I0517 16:48:59.608552 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.608081: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:628] failed to get PTX kernel "shift_right_logical" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
I0517 16:48:59.608791 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.608157: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2153] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
I0517 16:48:59.612349 140650292352064 run_docker.py:255] Traceback (most recent call last):
I0517 16:48:59.612429 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 432, in <module>
I0517 16:48:59.612536 140650292352064 run_docker.py:255] app.run(main)
I0517 16:48:59.612603 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
I0517 16:48:59.612677 140650292352064 run_docker.py:255] _run_main(main, args)
I0517 16:48:59.612745 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
I0517 16:48:59.612811 140650292352064 run_docker.py:255] sys.exit(main(argv))
I0517 16:48:59.612883 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 408, in main
I0517 16:48:59.612949 140650292352064 run_docker.py:255] predict_structure(
I0517 16:48:59.613012 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 199, in predict_structure
I0517 16:48:59.613077 140650292352064 run_docker.py:255] prediction_result = model_runner.predict(processed_feature_dict,
I0517 16:48:59.613144 140650292352064 run_docker.py:255] File "/app/alphafold/alphafold/model/model.py", line 167, in predict
I0517 16:48:59.613205 140650292352064 run_docker.py:255] result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
I0517 16:48:59.613268 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/random.py", line 132, in PRNGKey
I0517 16:48:59.613330 140650292352064 run_docker.py:255] key = prng.seed_with_impl(impl, seed)
I0517 16:48:59.613391 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 267, in seed_with_impl
I0517 16:48:59.613450 140650292352064 run_docker.py:255] return random_seed(seed, impl=impl)
I0517 16:48:59.613508 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 580, in random_seed
I0517 16:48:59.613569 140650292352064 run_docker.py:255] return random_seed_p.bind(seeds_arr, impl=impl)
I0517 16:48:59.613629 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind
I0517 16:48:59.613687 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params)
I0517 16:48:59.613749 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
I0517 16:48:59.613809 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I0517 16:48:59.613869 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
I0517 16:48:59.613931 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params)
I0517 16:48:59.613995 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 592, in random_seed_impl
I0517 16:48:59.614058 140650292352064 run_docker.py:255] base_arr = random_seed_impl_base(seeds, impl=impl)
I0517 16:48:59.614119 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 597, in random_seed_impl_base
I0517 16:48:59.614181 140650292352064 run_docker.py:255] return seed(seeds)
I0517 16:48:59.614244 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 832, in threefry_seed
I0517 16:48:59.614307 140650292352064 run_docker.py:255] lax.shift_right_logical(seed, lax_internal._const(seed, 32)))
I0517 16:48:59.614370 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 515, in shift_right_logical
I0517 16:48:59.614432 140650292352064 run_docker.py:255] return shift_right_logical_p.bind(x, y)
I0517 16:48:59.614495 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind
I0517 16:48:59.614555 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params)
I0517 16:48:59.614619 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
I0517 16:48:59.614684 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I0517 16:48:59.614765 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
I0517 16:48:59.614831 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params)
I0517 16:48:59.614899 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 115, in apply_primitive
I0517 16:48:59.614965 140650292352064 run_docker.py:255] return compiled_fun(*args)
I0517 16:48:59.615031 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 200, in <lambda>
I0517 16:48:59.615100 140650292352064 run_docker.py:255] return lambda *args, **kw: compiled(*args, **kw)[0]
I0517 16:48:59.615169 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 895, in _execute_compiled
I0517 16:48:59.615240 140650292352064 run_docker.py:255] out_flat = compiled.execute(in_flat)
I0517 16:48:59.615311 140650292352064 run_docker.py:255] jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function
```
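For context on the two warnings immediately before the failure: the GPU is compute capability (CC) 8.9 (Ada/RTX 4090), but the ptxas inside the container comes from an older CUDA toolkit that cannot target CC 8.9, so XLA falls back to the driver and then fails to find the compiled kernel. The version logic can be sketched as below; the lookup table is a hand-picked assumption covering a few common architectures, not an official or complete list.

```python
# Sketch: would a given CUDA toolkit's ptxas be able to compile for a GPU's
# compute capability (CC)? The table maps CC -> first toolkit that supports
# it (assumed values for a few architectures; verify against NVIDIA's docs).
MIN_TOOLKIT_FOR_CC = {
    (7, 0): (9, 0),    # Volta (V100)      -> CUDA 9.0+
    (8, 0): (11, 0),   # Ampere (A100)     -> CUDA 11.0+
    (8, 6): (11, 1),   # Ampere (RTX 30xx) -> CUDA 11.1+
    (8, 9): (11, 8),   # Ada (RTX 4090)    -> CUDA 11.8+
    (9, 0): (11, 8),   # Hopper (H100)     -> CUDA 11.8+
}

def toolkit_supports_cc(toolkit: tuple, cc: tuple) -> bool:
    """True if a CUDA toolkit of this version can compile PTX for this CC."""
    needed = MIN_TOOLKIT_FOR_CC.get(cc)
    if needed is None:
        raise ValueError(f"unknown compute capability {cc}")
    return toolkit >= needed  # tuples compare component-wise

# The stock AlphaFold image pins CUDA 11.1; the RTX 4090 is CC 8.9:
print(toolkit_supports_cc((11, 1), (8, 9)))  # False -> "ptxas does not support CC 8.9"
print(toolkit_supports_cc((11, 8), (8, 9)))  # True  -> why a CUDA 11.8 image helps
```

This is why rebuilding the image against a newer CUDA base (see the Dockerfile suggestion below in the thread) is the usual fix for 40-series cards.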
```
: (512,), 'bert_mask': (512, 550), 'seq_mask': (550,), 'msa_mask': (512, 550)}
I0526 16:59:03.898252 140519854102336 run_docker.py:235] Traceback (most recent call last):
I0526 16:59:03.898312 140519854102336 run_docker.py:235] File "/app/alphafold/run_alphafold.py", line 459, in
```

Same `jaxlib.xla_extension.XlaRuntimeError: FAILED_PRECONDITION: Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version` error.
I got the same error when building with the CUDA 11.8 tools, but with my older CUDA 11.4 tools it works, even though the CUDA version reported by my driver is much newer.
What I'm running:
- V100
- Driver 535.54.03
- nvidia-smi shows CUDA Version 12.2
A Docker image built with the 11.4 tools works; one built with the 11.8 tools doesn't.
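The `Couldn't invoke ptxas --version` message means XLA could not run a `ptxas` binary at all (the `-runtime` CUDA images ship no toolkit). A small diagnostic sketch of that check, using only the standard library; the helper name is illustrative, not part of JAX or TensorFlow:

```python
# Sketch: reproduce what "Couldn't invoke ptxas --version" is testing -
# is a ptxas binary reachable on PATH, and can it report its version?
import shutil
import subprocess
from typing import Optional

def ptxas_version(binary: str = "ptxas") -> Optional[str]:
    """Return `ptxas --version` output, or None if it cannot be invoked."""
    path = shutil.which(binary)
    if path is None:
        # Typical for -runtime CUDA images, which omit the build tools.
        return None
    try:
        out = subprocess.run([path, "--version"],
                             capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout

print(ptxas_version() or "ptxas not found on PATH")
```

Running this inside the container shows whether the image actually contains the compiler tools; if it prints "not found", switching to a `-devel` base image (as suggested below) is the likely remedy.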
For a 4090 machine, you need to change the following in the Dockerfile:

```
ARG CUDA=11.1.1  --->  ARG CUDA=11.8.0
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04  --->  FROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu20.04
```
Then, rebuild.
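For reference, the relevant lines of `docker/Dockerfile` after the edit would look roughly like this (a config sketch based on the snippet above; surrounding lines unchanged):

```dockerfile
ARG CUDA=11.8.0
# The -devel image is needed so that ptxas and the rest of the CUDA build
# tools exist inside the container; -runtime images omit them.
FROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu20.04
```

Then rebuild with the standard command from the AlphaFold README, `docker build -f docker/Dockerfile -t alphafold .`, and rerun `run_docker.py`.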
@HanLiii Thanks for the information. I changed it as you said and rebuilt for the 4090 machine, but the same error still comes up. Did you succeed with that change?
@RJ3 can you share your working Dockerfile please?
PRNGKey I0517 16:48:59.613330 140650292352064 run_docker.py:255] key = prng.seed_with_impl(impl, seed) I0517 16:48:59.613391 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 267, in seed_with_impl I0517 16:48:59.613450 140650292352064 run_docker.py:255] return random_seed(seed, impl=impl) I0517 16:48:59.613508 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 580, in random_seed I0517 16:48:59.613569 140650292352064 run_docker.py:255] return random_seed_p.bind(seeds_arr, impl=impl) I0517 16:48:59.613629 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind I0517 16:48:59.613687 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params) I0517 16:48:59.613749 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace I0517 16:48:59.613809 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params) I0517 16:48:59.613869 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive I0517 16:48:59.613931 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params) I0517 16:48:59.613995 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 592, in random_seed_impl I0517 16:48:59.614058 140650292352064 run_docker.py:255] base_arr = random_seed_impl_base(seeds, impl=impl) I0517 16:48:59.614119 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 597, in random_seed_impl_base I0517 16:48:59.614181 140650292352064 run_docker.py:255] return seed(seeds) I0517 16:48:59.614244 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 832, in 
threefry_seed I0517 16:48:59.614307 140650292352064 run_docker.py:255] lax.shift_right_logical(seed, lax_internal._const(seed, 32))) I0517 16:48:59.614370 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 515, in shift_right_logical I0517 16:48:59.614432 140650292352064 run_docker.py:255] return shift_right_logical_p.bind(x, y) I0517 16:48:59.614495 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind I0517 16:48:59.614555 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params) I0517 16:48:59.614619 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace I0517 16:48:59.614684 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params) I0517 16:48:59.614765 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive I0517 16:48:59.614831 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params) I0517 16:48:59.614899 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 115, in apply_primitive I0517 16:48:59.614965 140650292352064 run_docker.py:255] return compiled_fun(*args) I0517 16:48:59.615031 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 200, in <lambda> I0517 16:48:59.615100 140650292352064 run_docker.py:255] return lambda *args, **kw: compiled(*args, **kw)[0] I0517 16:48:59.615169 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 895, in _execute_compiled I0517 16:48:59.615240 140650292352064 run_docker.py:255] out_flat = compiled.execute(in_flat) I0517 16:48:59.615311 140650292352064 run_docker.py:255] 
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function
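The root cause is visible just before the traceback: the warning `ptxas does not support CC 8.9` means the CUDA toolkit bundled in the AlphaFold image is older than 11.8, the first release whose `ptxas` can compile for the Ada architecture (an RTX 4090 reports compute capability 8.9). The subsequent `CUDA_ERROR_NOT_FOUND: named symbol not found` for `shift_right_logical` is the downstream failure of the driver-side PTX fallback. As a rough sketch (the mapping is an assumption drawn from NVIDIA's release notes; verify against the CUDA documentation for your exact version):

```python
# Rough lookup of the minimum CUDA toolkit whose ptxas understands a given
# NVIDIA compute capability (CC). These values are assumptions based on
# NVIDIA release notes, not something AlphaFold itself checks.
MIN_CUDA_FOR_CC = {
    "7.0": "9.0",    # Volta
    "7.5": "10.0",   # Turing
    "8.0": "11.0",   # Ampere (A100)
    "8.6": "11.1",   # Ampere (RTX 30xx)
    "8.9": "11.8",   # Ada (RTX 40xx)
    "9.0": "11.8",   # Hopper (H100)
}

def min_cuda_for_cc(cc: str) -> str:
    """Return the minimum CUDA toolkit version needed to compile for `cc`."""
    try:
        return MIN_CUDA_FOR_CC[cc]
    except KeyError:
        raise ValueError(f"Unknown compute capability: {cc}")

# An RTX 4090 reports CC 8.9, so a container whose ptxas comes from a CUDA
# toolkit older than 11.8 hits the fallback seen in the log above.
print(min_cuda_for_cc("8.9"))  # -> 11.8
```

You can confirm what the container actually ships by running `ptxas --version` inside it (for example `docker run --rm --entrypoint ptxas alphafold --version`, assuming the default `alphafold` image tag).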
I got exactly the same error. Have you solved it?
My NVIDIA info is as follows:
Fri Nov 17 14:21:52 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | Off |
| 0% 41C P8 28W / 450W | 6MiB / 24564MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | 0 N/A N/A 3164 G /usr/lib/xorg/Xorg 4MiB | +-----------------------------------------------------------------------------+
Unfortunately, I haven't solved it. Recently I also hit the same error as #853. I tried changing the Dockerfile, following https://github.com/google-deepmind/alphafold/issues/764#issuecomment-1679537433, but it didn't work and caused a crash.
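For reference, the direction suggested in #764 amounts to rebuilding the image against a CUDA base new enough for compute capability 8.9. A minimal sketch, assuming the Dockerfile still exposes a `CUDA` build argument and that the corresponding `nvidia/cuda` tag exists; the jax/jaxlib pins in the Dockerfile may also need matching upgrades:

```dockerfile
# Hypothetical fragment of docker/Dockerfile: bump the CUDA base image so the
# bundled ptxas can compile for sm_89 (Ada / RTX 4090). ARG names and pinned
# versions vary between AlphaFold releases; treat this as a sketch, not a patch.
ARG CUDA=11.8.0
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu20.04
# ... remainder of the original Dockerfile, with the jaxlib wheel chosen to
# match the same CUDA minor version ...
```

After rebuilding with `docker build -f docker/Dockerfile -t alphafold .`, checking `ptxas --version` inside the rebuilt container confirms whether the toolkit bump actually took effect.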
I tried to run AlphaFold latest version on a new machine with GPU RTX4090, CUDA version 11.8 (downgraded from 12.2), Ubuntu 22.04 LTS. I used anaconda3 to build the alphafold environment. However, all of the alphafold version >=2.24 showed the same error. I was wondering if anyone could help me to solve this? Please let me know if you need anything else. Thank you soooo much! I've tried this method #646 , but it didn't work. Nividia info:
(alphafold2.3.1) soft@GPU1:/soft/alphafold-2.3.1$ nvidia-smi Wed May 17 16:49:49 2023 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |===============================+======================+======================| | 0 NVIDIA Graphics... On | 00000000:01:00.0 Off | Off | | 30% 42C P8 24W / 350W | 1MiB / 24564MiB | 0% Default | | | | N/A | +-------------------------------+----------------------+----------------------+ | 1 NVIDIA Graphics... On | 00000000:08:00.0 Off | Off | | 30% 41C P8 19W / 350W | 1MiB / 24564MiB | 0% Default | | | | N/A | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | No running processes found | +-----------------------------------------------------------------------------+
Error message:(alphafold2.3.1) soft@GPU1:/soft/alphafold-2.3.1$ python docker/run_docker.py --fasta_paths=/home/soft/Documents/8GZ6.fasta --max_template_date=3000-01-01 --data_dir=/soft/AF2/download/ --output_dir=/home/soft/Documents/8GZ6/ I0517 16:41:16.903001 140650292352064 run_docker.py:113] Mounting /home/soft/Documents -> /mnt/fasta_path_0 I0517 16:41:16.903073 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/uniref90 -> /mnt/uniref90_database_path I0517 16:41:16.903110 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/mgnify -> /mnt/mgnify_database_path I0517 16:41:16.903137 140650292352064 run_docker.py:113] Mounting /soft/AF2/download -> /mnt/data_dir I0517 16:41:16.903162 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir I0517 16:41:16.903189 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb_mmcif -> /mnt/obsolete_pdbs_path I0517 16:41:16.903218 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/pdb70 -> /mnt/pdb70_database_path I0517 16:41:16.903246 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/uniref30 -> /mnt/uniref30_database_path I0517 16:41:16.903274 140650292352064 run_docker.py:113] Mounting /soft/AF2/download/bfd -> /mnt/bfd_database_path I0517 16:41:18.367146 140650292352064 run_docker.py:255] I0517 08:41:18.366440 139962212468544 templates.py:857] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat. I0517 16:41:18.513591 140650292352064 run_docker.py:255] I0517 08:41:18.513257 139962212468544 xla_bridge.py:353] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: I0517 16:41:18.642552 140650292352064 run_docker.py:255] I0517 08:41:18.642126 139962212468544 xla_bridge.py:353] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". 
Available platform names are: Interpreter Host CUDA
I0517 16:41:18.642671 140650292352064 run_docker.py:255] I0517 08:41:18.642353 139962212468544 xla_bridge.py:353] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
I0517 16:41:18.642703 140650292352064 run_docker.py:255] I0517 08:41:18.642385 139962212468544 xla_bridge.py:353] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
I0517 16:41:20.699224 140650292352064 run_docker.py:255] I0517 08:41:20.698850 139962212468544 run_alphafold.py:386] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0517 16:41:20.699338 140650292352064 run_docker.py:255] I0517 08:41:20.698937 139962212468544 run_alphafold.py:403] Using random seed 979532966947835319 for the data pipeline
I0517 16:41:20.699380 140650292352064 run_docker.py:255] I0517 08:41:20.699029 139962212468544 run_alphafold.py:161] Predicting 8GZ6
I0517 16:41:20.699409 140650292352064 run_docker.py:255] I0517 08:41:20.699223 139962212468544 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpxzhf05lq/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/8GZ6.fasta /mnt/uniref90_database_path/uniref90.fasta"
I0517 16:41:20.758846 140650292352064 run_docker.py:255] I0517 08:41:20.758264 139962212468544 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0517 16:43:32.400591 140650292352064 run_docker.py:255] I0517 08:43:32.399554 139962212468544 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 131.641 seconds
I0517 16:43:32.550053 140650292352064 run_docker.py:255] I0517 08:43:32.549127 139962212468544 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp05y00c2a/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/fasta_path_0/8GZ6.fasta /mnt/mgnify_database_path/mgy_clusters_2022_05.fa"
I0517 16:43:32.603918 140650292352064 run_docker.py:255] I0517 08:43:32.603443 139962212468544 utils.py:36] Started Jackhmmer (mgy_clusters_2022_05.fa) query
I0517 16:46:37.409147 140650292352064 run_docker.py:255] I0517 08:46:37.407801 139962212468544 utils.py:40] Finished Jackhmmer (mgy_clusters_2022_05.fa) query in 184.804 seconds
I0517 16:46:37.951393 140650292352064 run_docker.py:255] I0517 08:46:37.950919 139962212468544 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpp63aisip/query.a3m -o /tmp/tmpp63aisip/output.hhr -maxseq 1000000 -d /mnt/pdb70_database_path/pdb70"
I0517 16:46:38.000887 140650292352064 run_docker.py:255] I0517 08:46:38.000401 139962212468544 utils.py:36] Started HHsearch query
I0517 16:46:49.137153 140650292352064 run_docker.py:255] I0517 08:46:49.136655 139962212468544 utils.py:40] Finished HHsearch query in 11.136 seconds
I0517 16:46:49.446268 140650292352064 run_docker.py:255] I0517 08:46:49.445886 139962212468544 hhblits.py:128] Launching subprocess "/usr/bin/hhblits -i /mnt/fasta_path_0/8GZ6.fasta -cpu 4 -oa3m /tmp/tmpzq5150hf/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /mnt/bfd_database_path/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /mnt/uniref30_database_path/UniRef30_2021_03"
I0517 16:46:49.497454 140650292352064 run_docker.py:255] I0517 08:46:49.497032 139962212468544 utils.py:36] Started HHblits query
I0517 16:48:52.683240 140650292352064 run_docker.py:255] I0517 08:48:52.682735 139962212468544 utils.py:40] Finished HHblits query in 123.186 seconds
I0517 16:48:52.695582 140650292352064 run_docker.py:255] I0517 08:48:52.695324 139962212468544 templates.py:878] Searching for template for: QVQLQESGGGLVQAGGSLRLSCAASGRTSSVYNMAWFRQTPGKEREFVAAITGNGGTTLYADSVKGRLTISRGNAKNTVSLQMNVLKPDDTAVYYCAAGGWGKERNYAYWGQGTQVTVSSHHHHHH
I0517 16:48:54.830226 140650292352064 run_docker.py:255] I0517 08:48:54.829735 139962212468544 templates.py:267] Found an exact template match 6qd6_C.
I0517 16:48:54.838274 140650292352064 run_docker.py:255] I0517 08:48:54.837990 139962212468544 templates.py:267] Found an exact template match 6qd6_G.
I0517 16:48:54.930829 140650292352064 run_docker.py:255] I0517 08:48:54.930498 139962212468544 templates.py:267] Found an exact template match 5wts_A.
I0517 16:48:55.026873 140650292352064 run_docker.py:255] I0517 08:48:55.026528 139962212468544 templates.py:267] Found an exact template match 6gjs_B.
I0517 16:48:55.637419 140650292352064 run_docker.py:255] I0517 08:48:55.636942 139962212468544 templates.py:267] Found an exact template match 6gkd_B.
I0517 16:48:55.895157 140650292352064 run_docker.py:255] I0517 08:48:55.894784 139962212468544 templates.py:267] Found an exact template match 6hd8_A.
I0517 16:48:56.275886 140650292352064 run_docker.py:255] I0517 08:48:56.275430 139962212468544 templates.py:267] Found an exact template match 6hd9_A.
I0517 16:48:56.319466 140650292352064 run_docker.py:255] I0517 08:48:56.319131 139962212468544 templates.py:267] Found an exact template match 6rul_A.
I0517 16:48:56.451577 140650292352064 run_docker.py:255] I0517 08:48:56.451225 139962212468544 templates.py:267] Found an exact template match 4pfe_A.
I0517 16:48:56.701474 140650292352064 run_docker.py:255] I0517 08:48:56.701014 139962212468544 templates.py:267] Found an exact template match 3sn6_N.
I0517 16:48:56.990272 140650292352064 run_docker.py:255] I0517 08:48:56.989793 139962212468544 templates.py:267] Found an exact template match 6pb1_N.
I0517 16:48:57.033105 140650292352064 run_docker.py:255] I0517 08:48:57.032745 139962212468544 templates.py:267] Found an exact template match 6rum_A.
I0517 16:48:57.086215 140650292352064 run_docker.py:255] I0517 08:48:57.085903 139962212468544 templates.py:267] Found an exact template match 5wb1_A.
I0517 16:48:57.110778 140650292352064 run_docker.py:255] I0517 08:48:57.110478 139962212468544 templates.py:267] Found an exact template match 5vm6_A.
I0517 16:48:57.186338 140650292352064 run_docker.py:255] I0517 08:48:57.186022 139962212468544 templates.py:267] Found an exact template match 5foj_A.
I0517 16:48:57.237214 140650292352064 run_docker.py:255] I0517 08:48:57.236900 139962212468544 templates.py:267] Found an exact template match 5m2w_A.
I0517 16:48:57.284224 140650292352064 run_docker.py:255] I0517 08:48:57.283923 139962212468544 templates.py:267] Found an exact template match 5mje_B.
I0517 16:48:57.479903 140650292352064 run_docker.py:255] I0517 08:48:57.479458 139962212468544 templates.py:267] Found an exact template match 5vm4_L.
I0517 16:48:57.845450 140650292352064 run_docker.py:255] I0517 08:48:57.844991 139962212468544 templates.py:267] Found an exact template match 4cdg_D.
I0517 16:48:57.887320 140650292352064 run_docker.py:255] I0517 08:48:57.887031 139962212468544 templates.py:267] Found an exact template match 4gft_B.
I0517 16:48:57.988736 140650292352064 run_docker.py:255] I0517 08:48:57.988251 139962212468544 pipeline.py:234] Uniref90 MSA size: 10000 sequences.
I0517 16:48:57.988856 140650292352064 run_docker.py:255] I0517 08:48:57.988335 139962212468544 pipeline.py:235] BFD MSA size: 1612 sequences.
I0517 16:48:57.988883 140650292352064 run_docker.py:255] I0517 08:48:57.988350 139962212468544 pipeline.py:236] MGnify MSA size: 501 sequences.
I0517 16:48:57.988906 140650292352064 run_docker.py:255] I0517 08:48:57.988364 139962212468544 pipeline.py:237] Final (deduplicated) MSA size: 12020 sequences.
I0517 16:48:57.988928 140650292352064 run_docker.py:255] I0517 08:48:57.988502 139962212468544 pipeline.py:239] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 20.
I0517 16:48:58.114603 140650292352064 run_docker.py:255] I0517 08:48:58.113753 139962212468544 run_alphafold.py:191] Running model model_1_pred_0 on 8GZ6
I0517 16:48:59.508914 140650292352064 run_docker.py:255] I0517 08:48:59.508350 139962212468544 model.py:165] Running predict with shape(feat) = {'aatype': (4, 126), 'residue_index': (4, 126), 'seq_length': (4,), 'template_aatype': (4, 4, 126), 'template_all_atom_masks': (4, 4, 126, 37), 'template_all_atom_positions': (4, 4, 126, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 126), 'msa_mask': (4, 508, 126), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 126, 3), 'template_pseudo_beta_mask': (4, 4, 126), 'atom14_atom_exists': (4, 126, 14), 'residx_atom14_to_atom37': (4, 126, 14), 'residx_atom37_to_atom14': (4, 126, 37), 'atom37_atom_exists': (4, 126, 37), 'extra_msa': (4, 5120, 126), 'extra_msa_mask': (4, 5120, 126), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 126), 'true_msa': (4, 508, 126), 'extra_has_deletion': (4, 5120, 126), 'extra_deletion_value': (4, 5120, 126), 'msa_feat': (4, 508, 126, 49), 'target_feat': (4, 126, 22)}
I0517 16:48:59.600997 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.600504: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:231] Falling back to the CUDA driver for PTX compilation; ptxas does not support CC 8.9
I0517 16:48:59.601204 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.600544: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:234] Used ptxas at ptxas
I0517 16:48:59.608552 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.608081: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:628] failed to get PTX kernel "shift_right_logical" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
I0517 16:48:59.608791 140650292352064 run_docker.py:255] 2023-05-17 08:48:59.608157: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2153] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
I0517 16:48:59.612349 140650292352064 run_docker.py:255] Traceback (most recent call last):
I0517 16:48:59.612429 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 432, in <module>
I0517 16:48:59.612536 140650292352064 run_docker.py:255] app.run(main)
I0517 16:48:59.612603 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
I0517 16:48:59.612677 140650292352064 run_docker.py:255] _run_main(main, args)
I0517 16:48:59.612745 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
I0517 16:48:59.612811 140650292352064 run_docker.py:255] sys.exit(main(argv))
I0517 16:48:59.612883 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 408, in main
I0517 16:48:59.612949 140650292352064 run_docker.py:255] predict_structure(
I0517 16:48:59.613012 140650292352064 run_docker.py:255] File "/app/alphafold/run_alphafold.py", line 199, in predict_structure
I0517 16:48:59.613077 140650292352064 run_docker.py:255] prediction_result = model_runner.predict(processed_feature_dict,
I0517 16:48:59.613144 140650292352064 run_docker.py:255] File "/app/alphafold/alphafold/model/model.py", line 167, in predict
I0517 16:48:59.613205 140650292352064 run_docker.py:255] result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
I0517 16:48:59.613268 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/random.py", line 132, in PRNGKey
I0517 16:48:59.613330 140650292352064 run_docker.py:255] key = prng.seed_with_impl(impl, seed)
I0517 16:48:59.613391 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 267, in seed_with_impl
I0517 16:48:59.613450 140650292352064 run_docker.py:255] return random_seed(seed, impl=impl)
I0517 16:48:59.613508 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 580, in random_seed
I0517 16:48:59.613569 140650292352064 run_docker.py:255] return random_seed_p.bind(seeds_arr, impl=impl)
I0517 16:48:59.613629 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind
I0517 16:48:59.613687 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params)
I0517 16:48:59.613749 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
I0517 16:48:59.613809 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I0517 16:48:59.613869 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
I0517 16:48:59.613931 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params)
I0517 16:48:59.613995 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 592, in random_seed_impl
I0517 16:48:59.614058 140650292352064 run_docker.py:255] base_arr = random_seed_impl_base(seeds, impl=impl)
I0517 16:48:59.614119 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 597, in random_seed_impl_base
I0517 16:48:59.614181 140650292352064 run_docker.py:255] return seed(seeds)
I0517 16:48:59.614244 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/prng.py", line 832, in threefry_seed
I0517 16:48:59.614307 140650292352064 run_docker.py:255] lax.shift_right_logical(seed, lax_internal._const(seed, 32)))
I0517 16:48:59.614370 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 515, in shift_right_logical
I0517 16:48:59.614432 140650292352064 run_docker.py:255] return shift_right_logical_p.bind(x, y)
I0517 16:48:59.614495 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 329, in bind
I0517 16:48:59.614555 140650292352064 run_docker.py:255] return self.bind_with_trace(find_top_trace(args), args, params)
I0517 16:48:59.614619 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 332, in bind_with_trace
I0517 16:48:59.614684 140650292352064 run_docker.py:255] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I0517 16:48:59.614765 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/core.py", line 712, in process_primitive
I0517 16:48:59.614831 140650292352064 run_docker.py:255] return primitive.impl(*tracers, **params)
I0517 16:48:59.614899 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 115, in apply_primitive
I0517 16:48:59.614965 140650292352064 run_docker.py:255] return compiled_fun(*args)
I0517 16:48:59.615031 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 200, in <lambda>
I0517 16:48:59.615100 140650292352064 run_docker.py:255] return lambda *args, **kw: compiled(*args, **kw)[0]
I0517 16:48:59.615169 140650292352064 run_docker.py:255] File "/opt/conda/lib/python3.8/site-packages/jax/_src/dispatch.py", line 895, in _execute_compiled
I0517 16:48:59.615240 140650292352064 run_docker.py:255] out_flat = compiled.execute(in_flat)
I0517 16:48:59.615311 140650292352064 run_docker.py:255] jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function
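The decisive lines are the two warnings just before the traceback. The RTX 4090 is an Ada Lovelace GPU with compute capability 8.9 (sm_89), a target that ptxas only understands from CUDA toolkit 11.8 onward; an image built on an older CUDA base falls back to the driver's JIT, which then cannot find the compiled `shift_right_logical` kernel. A toy version of that compatibility check (the helper is illustrative, not part of AlphaFold or XLA; the version thresholds follow NVIDIA's published toolkit support matrix):

```python
# Illustrative only: a toy version of the check XLA's asm_compiler performs before
# emitting "ptxas does not support CC 8.9". The helper itself is hypothetical.

def ptxas_supports(cuda_toolkit: tuple, compute_capability: tuple) -> bool:
    """True if ptxas from the given CUDA toolkit can emit code for the GPU."""
    if compute_capability >= (8, 9):      # Ada Lovelace (RTX 4090) and newer
        return cuda_toolkit >= (11, 8)    # sm_89 first supported in CUDA 11.8
    if compute_capability >= (8, 0):      # Ampere (A100, RTX 30xx)
        return cuda_toolkit >= (11, 0)
    return True                           # older architectures: assume supported

# An AlphaFold image built on a CUDA 11.1 base cannot target an RTX 4090:
print(ptxas_supports((11, 1), (8, 9)))    # False -> the warning and crash above
print(ptxas_supports((11, 8), (8, 9)))    # True  -> a CUDA >= 11.8 base works
```

If this matches your setup, the usual fixes are rebuilding the image on a CUDA 11.8 or newer base (with a matching jax/jaxlib build) or moving to an AlphaFold release whose Dockerfile already uses one.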
I got exactly the same error. Have you solved it? My NVIDIA info is as follows:

Fri Nov 17 14:21:52 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  Off |
|  0%   41C    P8    28W / 450W |      6MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3164      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Unfortunately, I haven't solved it. Recently I hit the same error as #853. I tried changing the Dockerfile, following #764 (comment), but it didn't work and caused a crash.
I solved the "chunked" error by upgrading Docker; maybe you can try that. But the jaxlib error still has me confused.
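On the jaxlib side, one quick check is whether the jaxlib installed inside the container is a CUDA build at all, and for which CUDA major release it was built. pip-installed GPU wheels encode this in the local version suffix (for example `0.3.25+cuda11.cudnn805`), while a plain version string means a CPU-only wheel that will never register the CUDA backend. A minimal sketch (the helper function is hypothetical, not part of jax):

```python
from typing import Optional

def jaxlib_cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA/cuDNN build tag of a jaxlib version string,
    or None for a CPU-only wheel (no local version suffix)."""
    if "+" not in version:
        return None
    return version.split("+", 1)[1]

print(jaxlib_cuda_tag("0.3.25+cuda11.cudnn805"))  # cuda11.cudnn805 (a CUDA 11 build)
print(jaxlib_cuda_tag("0.3.25"))                  # None (CPU-only wheel)

# Inside the container, check the real install with:
#   import importlib.metadata
#   print(jaxlib_cuda_tag(importlib.metadata.version("jaxlib")))
```

If the tag is missing, or names a CUDA release older than what the GPU needs, reinstalling a matching jaxlib wheel is the first thing to try.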
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function
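For what it's worth, the crash happens before any AlphaFold-specific computation: the traceback bottoms out in JAX's `threefry_seed`, where `jax.random.PRNGKey` splits the 64-bit seed into two 32-bit words. That `shift_right_logical` is the very first GPU kernel XLA tries to load, and it fails because `ptxas` could not compile for compute capability 8.9 (the RTX 4090). As a minimal sketch (illustration only, not part of AlphaFold), the seed split that crashes computes the equivalent of:

```python
import numpy as np

def threefry_seed(seed: int) -> np.ndarray:
    """Mimic what jax's threefry_seed computes on the GPU:
    split a 64-bit seed into [high 32 bits, low 32 bits] as uint32."""
    high = np.uint32(seed >> 32)          # the shift_right_logical kernel that fails to load
    low = np.uint32(seed & 0xFFFFFFFF)
    return np.array([high, low], dtype=np.uint32)

print(threefry_seed(0))                   # [0 0]
print(threefry_seed((1 << 32) + 7))       # [1 7]
```

So any tiny JAX program that calls `jax.random.PRNGKey` on the GPU should reproduce the error, independent of AlphaFold.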
Bro, I just got it working with CUDA==11.8.0 and the `nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu22.04` base image! Maybe you can try this config as well.
Same here, except my error is `jaxlib.xla_extension.XlaRuntimeError: FAILED_PRECONDITION: Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version`.
This fixed it for me. Installing the NVIDIA compiler package puts a recent `ptxas` on the PATH inside the environment: `conda install -c nvidia cuda-nvcc`
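To confirm whether the environment actually has a `ptxas` that XLA can invoke (the thing the `Couldn't invoke ptxas --version` error complains about), a quick sketch like this may help; it just locates `ptxas` on the PATH and prints its version string, degrading gracefully if it's missing:

```python
import shutil
import subprocess
from typing import Optional

def ptxas_version() -> Optional[str]:
    """Return the `ptxas --version` output if ptxas is on PATH, else None.

    For an RTX 4090 (compute capability 8.9), the reported CUDA release
    should be 11.8 or newer.
    """
    path = shutil.which("ptxas")
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout

info = ptxas_version()
print(info if info else "ptxas not found; try: conda install -c nvidia cuda-nvcc")
```

Run this inside the same environment (or container) that launches AlphaFold, since that is the PATH XLA sees.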