
Unable to resolve runtime symbol: `__extendhfsf2'

Open j3mdamas opened this issue 1 year ago • 2 comments

Steps to Reproduce (for bugs)

This happens with any input, even the examples in the README, when run from the local install. For details on my install, see the environment section below. If this happens because I am running on a VM and it can never work there, just let me know and we can close the ticket.

ColabFold Output (for bugs)

2023-06-22 09:06:53,582 Running colabfold 1.5.2 (3e99c44eec189ec27f6d120af851adb7ff6aa2a2)
2023-06-22 09:06:53,588 non-fasta/a3m file in input directory: test_a3m/cite.bibtex
2023-06-22 09:06:53,588 non-fasta/a3m file in input directory: test_a3m/config.json
2023-06-22 09:06:53,588 non-fasta/a3m file in input directory: test_a3m/log.txt
2023-06-22 09:06:53.642616: W external/org_tensorflow/tensorflow/tsl/platform/default/dso_loader.cc:66] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /progs/all/opensource/gcc/12.1.0/lib64
2023-06-22 09:06:53.642647: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)
2023-06-22 09:06:53,643 WARNING: no GPU detected, will be using CPU
2023-06-22 09:06:55,993 Found 4 citations for tools or databases
2023-06-22 09:06:55,993 Query 1/1: test (length 122)
2023-06-22 09:06:55,994 Setting max_seq=1, max_extra_seq=1
2023-06-22 09:07:22.106463: E external/org_tensorflow/tensorflow/compiler/xla/service/cpu/simple_orc_jit.cc:211] Unable to resolve runtime symbol: `__extendhfsf2'.  Hint: if the symbol a custom call target, make sure you've registered it with the JIT using XLA_CPU_REGISTER_CUSTOM_CALL_TARGET.
JIT session error: Symbols not found: [ __extendhfsf2 ]
Segmentation fault

Context

The context is described in the environment section below.

Your Environment

The machine: the error only occurs on a VM. When I run the same installation on a physical machine, the issue does not occur (with or without a GPU available). Since I use VMs to test setups, it would be good to have a working testing environment. The VM runs CentOS 7 on VirtualBox.

The installation: it is important for me to control what I am installing, so I am using this script as inspiration: https://github.com/YoshitakaMo/localcolabfold/blob/a4455b1086671549ad41e3ba2b4f01ba5815d590/install_colabbatch_linux.sh. I am writing my own script rather than running that exact shell script. I am using that commit because I want to pin to a release version of ColabFold (1.5.2, in this case), and that commit seems to be the one that implements it.

My conda environment.yml file:

channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.9
  - cudnn==8.2.1.32
  - cudatoolkit==11.1.1
  - openmm==7.5.1
  - pdbfixer
  - kalign2=2.04
  - hhsuite=3.3.0
  - mmseqs2=14.7e284

and my pip requirements.txt:

colabfold[alphafold-minus-jax] @ git+https://github.com/sokrypton/ColabFold@v1.5.2
https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.25+cuda11.cudnn82-cp39-cp39-manylinux2014_x86_64.whl
jax==0.3.25
chex==0.1.6
biopython==1.79

j3mdamas avatar Jun 22 '23 07:06 j3mdamas

I tracked down what's going on (see the linked issue above for details). In short: VirtualBox and JAX are interacting badly.

I have a horribly hacky solution, if you really want to run ColabFold in VirtualBox on a CPU, which I really don't recommend. Let me know if you want it.

If you really want to stick to CPU-only VMs, QEMU 7.2 or newer should work in theory (I haven't tested it).
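
For anyone who lands here with the same error, here is a quick diagnostic sketch. `__extendhfsf2` is the compiler runtime helper that widens a half-precision float to single precision, and a JIT usually falls back to it when the CPU is not reported as supporting the F16C instructions. That the VirtualBox guest hides F16C is an assumption, not something confirmed in this thread, and the helper names `parse_cpu_flags` and `has_f16c` are made up for illustration. The check only applies to x86 Linux guests, where `/proc/cpuinfo` has a `flags` line.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists features on a line that starts with "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_f16c(cpuinfo_text):
    """True if the (hypothetically relevant) f16c flag is advertised."""
    return "f16c" in parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("f16c advertised:", has_f16c(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

Running this on both the physical machine and the VM and comparing the result would show whether the guest sees a different set of CPU features than the host.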

milot-mirdita avatar Nov 06 '23 06:11 milot-mirdita

@milot-mirdita thanks for looking into it, I appreciate it. It's not a blocker for me, I just use VMs to test setups before deploying to physical machines. I'm happy with VirtualBox for most of my setups and this is more of a nice-to-have than anything, so feel free to close it if you deem it not important.

j3mdamas avatar Nov 06 '23 09:11 j3mdamas