localcolabfold
colabfold_batch tries to connect to internet even if msa provided as parameter
Hi,
I have successfully run colabfold_search locally using the following command:
colabfold_search \
--threads 32 --use-env 1 --use-templates 1 \
--mmseqs mmseqs \
--db1 /home/jflucier/projects/def-marechal/programs/colabfold_db/uniref30_2302_db \
--db2 /home/jflucier/projects/def-marechal/programs/colabfold_db/pdb100_230517 \
--db3 /home/jflucier/projects/def-marechal/programs/colabfold_db/colabfold_envdb_202108_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa /home/jflucier/projects/def-marechal/programs/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas
When I run colabfold_batch using the MSA, the script still tries to connect to the internet:
colabfold_batch \
--use-gpu-relax --amber --num-relax 3 \
--num-models 3 --templates \
--num-recycle 30 --recycle-early-stop-tolerance 0.5 \
--model-type auto \
--data /home/jflucier/projects/def-marechal/colabfold_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test
It returns the following error:
+ echo 'running colabfold on /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa'
running colabfold on /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa
+ colabfold_batch --use-gpu-relax --amber --num-relax 3 --num-models 3 --templates --num-recycle 30 --recycle-early-stop-tolerance 0.5 --model-type auto --data /home/jflucier/projects/def-marechal/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test
2024-02-06 11:27:10,448 Running colabfold 1.5.2 (3e99c44eec189ec27f6d120af851adb7ff6aa2a2)
Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable
I have tried passing either /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/ or /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m, and both attempts return the same network connection error.
The msas folder contains only the file 0.a3m.
Could you please tell me what I am doing wrong? Based on the previous answer you gave me in issue #184, colabfold_batch should not connect to the network if an MSA is provided.
Thank you very much for your help
Please upgrade to ColabFold 1.5.5 (latest) to use the new --pdb-hit-file and --local-pdb-path args. See also https://github.com/sokrypton/ColabFold/issues/563 . If you set the --templates arg without these two args, ColabFold will try to search for templates through the Internet.
However, we are aware that issues have been reported when using local templates in some cases. We are currently working on fixing them.
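As a rough illustration only, a colabfold_batch call that combines --templates with the two new args might look like the sketch below. All paths and the .m8 file name are hypothetical placeholders, and the sketch assumes that colabfold_search was run with --use-templates 1 and that a local copy of the PDB mmCIF files exists. The small run helper is not part of ColabFold; it only prints the command so it can be checked before a real run. Please confirm the expected inputs with colabfold_batch --help on 1.5.5.

```shell
# Hypothetical placeholder paths; replace with your own.
MSA_FILE=/path/to/msas/0.a3m
HIT_FILE=/path/to/msas/template_hits.m8   # hypothetical name for the .m8 hit file from colabfold_search
PDB_DIR=/path/to/local/pdb/mmcif          # assumed local copy of PDB mmCIF files
OUT_DIR=/path/to/output

# Print the command instead of running it while DRY_RUN is non-empty.
DRY_RUN=1
run() {
  if [ -n "$DRY_RUN" ]; then
    echo "$@"
  else
    "$@"
  fi
}

run colabfold_batch \
  --templates \
  --pdb-hit-file "$HIT_FILE" \
  --local-pdb-path "$PDB_DIR" \
  "$MSA_FILE" "$OUT_DIR"
```

Set DRY_RUN to an empty string once the printed command looks right.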
Hi again,
Thank you very much for your awesome support.
I have rerun colabfold_search and removed the template option:
colabfold_search \
--threads 32 --use-env 1 --db-load-mode 0 \
--mmseqs mmseqs \
--db1 /home/jflucier/projects/def-marechal/programs/colabfold_db/uniref30_2302_db \
--db2 /home/jflucier/projects/def-marechal/programs/colabfold_db/pdb100_230517 \
--db3 /home/jflucier/projects/def-marechal/programs/colabfold_db/colabfold_envdb_202108_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa /home/jflucier/projects/def-marechal/programs/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas3
This produces an a3m file (no m8 file). Then I run colabfold_batch (again, with no template option):
colabfold_batch \
--use-gpu-relax --amber --num-relax 3 \
--num-models 3 \
--num-recycle 30 --recycle-early-stop-tolerance 0.5 \
--model-type auto \
--data /home/jflucier/projects/def-marechal/colabfold_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas3/0.a3m \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test
I get the exact same error:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /alphafold/alphafold_params_colab_2022-12-06.tar (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x1554683ba340>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
Thanks again for your help.
Note that this is executed on an HPC system, where the compute nodes do not have access to the outside world. It would be very beneficial to fix all issues so that it can run without access to an outside network (internet) on HPC systems.
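The last error shows colabfold_batch trying to download alphafold_params_colab_2022-12-06.tar from storage.googleapis.com, which suggests that no model weights were found under the directory passed to --data. A minimal pre-flight check that could be run before submitting a job is sketched below. It assumes that ColabFold keeps the weights in a params subdirectory of the data directory; that layout and the has_params helper are assumptions for illustration, not something confirmed in this thread.

```shell
# Directory passed to colabfold_batch via --data (taken from the command above).
DATA_DIR=/home/jflucier/projects/def-marechal/colabfold_db

# Assumption: weights live in "$DATA_DIR/params".
# Succeeds only if that directory exists and is not empty.
has_params() {
  [ -d "$1/params" ] && [ -n "$(ls -A "$1/params" 2>/dev/null)" ]
}

if has_params "$DATA_DIR"; then
  echo "weights found in $DATA_DIR/params"
else
  echo "no weights in $DATA_DIR/params; fetch them on a node with internet access first" >&2
fi
```

If the check fails, the weights would have to be downloaded once on a node that can reach the internet (for example a login node). As far as I know, python -m colabfold.download does this, but it writes to ColabFold's default cache directory rather than to the --data directory, so please verify where the files end up on your installation.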
I also have this issue on an HPC system with localcolabfold. All the MSAs are precomputed and I am not using templates.
Hi @allcatsaregrey
I managed to get the environment working by rebuilding it from scratch.
Attached is my environment file, venv.colabfold.af2.3.2.requirements.txt
Sadly, a fresh install does not appear to work for me.