ColabFold
Amber works in AlphaFold2_mmseqs2 but not in AlphaFold2_batch
When I run the predefined example sequence (PIAQIHILEGRSDEQKETLIREVSEAISRSLDAPLTSVRVIITEMAKGHFGIGGELASK) with templates, one model, and Amber, it works in AlphaFold2_mmseqs2 but fails in AlphaFold2_batch with the following error:
ValueError Traceback (most recent call last)
<ipython-input-3-a76dac23e0b1> in <module>()
391 Ls=[len(query_sequence)], crop_len=crop_len,
392 model_params=model_params, use_model=use_model,
--> 393 do_relax=use_amber)
394
395 # gather MSA info
<ipython-input-3-a76dac23e0b1> in predict_structure(prefix, feature_dict, Ls, crop_len, model_params, use_model, do_relax, random_seed)
276 stiffness=10.0,exclude_residues=[],
277 max_outer_iterations=20)
--> 278 relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
279 relaxed_pdb_lines.append(relaxed_pdb_str)
280
/content/alphafold/relax/relax.py in process(self, prot)
62 tolerance=self._tolerance, stiffness=self._stiffness,
63 exclude_residues=self._exclude_residues,
---> 64 max_outer_iterations=self._max_outer_iterations)
65 min_pos = out['pos']
66 start_pos = out['posinit']
/content/alphafold/relax/amber_minimize.py in run_pipeline(prot, stiffness, max_outer_iterations, place_hydrogens_every_iteration, max_iterations, tolerance, restraint_set, max_attempts, checks, exclude_residues)
459 # `protein.to_pdb` will strip any poorly-defined residues so we need to
460 # perform this check before `clean_protein`.
--> 461 _check_residues_are_well_defined(prot)
462 pdb_string = clean_protein(prot, checks=checks)
463
/content/alphafold/relax/amber_minimize.py in _check_residues_are_well_defined(prot)
139 """Checks that all residues contain non-empty atom sets."""
140 if (prot.atom_mask.sum(axis=-1) == 0).any():
--> 141 raise ValueError("Amber minimization can only be performed on proteins with"
142 " well-defined residues. This protein contains at least"
143 " one residue with no atoms.")
ValueError: Amber minimization can only be performed on proteins with well-defined residues. This protein contains at least one residue with no atoms.
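The traceback shows that the check fails whenever any row of `prot.atom_mask` sums to zero. To see which residues trigger it before calling the relaxer, something like the following might help. This is only a sketch: `residues_without_atoms` is a hypothetical helper, not part of AlphaFold or ColabFold, and it assumes the mask is an array of shape (residues, atom slots), as the `sum(axis=-1)` in the check suggests.

```python
import numpy as np


def residues_without_atoms(atom_mask):
    """Return 0-based indices of residues whose atom mask is all zeros.

    Hypothetical helper; mirrors the condition used in
    _check_residues_are_well_defined: atom_mask.sum(axis=-1) == 0.
    """
    atom_mask = np.asarray(atom_mask)
    atoms_per_residue = atom_mask.sum(axis=-1)
    return np.flatnonzero(atoms_per_residue == 0)


# Toy mask (assumed shape): 4 residues x 37 atom slots, residue 2 has no atoms.
toy_mask = np.ones((4, 37))
toy_mask[2] = 0
empty = residues_without_atoms(toy_mask)
print("Residues with no atoms:", empty.tolist())
```

On a real prediction, passing `unrelaxed_protein.atom_mask` should list the residues that make Amber refuse the structure, which may help narrow down why the batch notebook produces them and the mmseqs2 notebook does not.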
The same thing is happening to me with a custom MSA. Amber works in the af_mmseqs2 notebook, but fails in the af_batch notebook.
Error message when running batch inference:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/colabfold_env/bin/colabfold_batch", line 8, in <module>
sys.exit(main())
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/colabfold/batch.py", line 856, in main
recompile_all_models=args.recompile_all_models,
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/colabfold/batch.py", line 703, in run
stop_at_score=stop_at_score,
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/colabfold/batch.py", line 236, in predict_structure
relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/relax/relax.py", line 62, in process
max_outer_iterations=self._max_outer_iterations)
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/relax/amber_minimize.py", line 482, in run_pipeline
ret.update(get_violation_metrics(prot))
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/relax/amber_minimize.py", line 356, in get_violation_metrics
structural_violations, struct_metrics = find_violations(prot)
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/relax/amber_minimize.py", line 343, in find_violations
"clash_overlap_tolerance": 1.5, # Taken from model config.
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/model/folding.py", line 773, in find_structural_violations
bond_length_tolerance_factor=config.violation_tolerance_factor)
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/common/residue_constants.py", line 861, in make_atom14_dists_bounds
residue_bonds, residue_virtual_bonds, _ = load_stereo_chemical_props()
File "/home/ubuntu/anaconda3/envs/colabfold_env/lib/python3.7/site-packages/alphafold/common/residue_constants.py", line 409, in load_stereo_chemical_props
with open(stereo_chemical_props_path, 'rt') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'stereo_chemical_props.txt'
Got it to work by fixing three things:
- Install 'pdbfixer' with
  conda install -c conda-forge pdbfixer
- Manually download "stereo_chemical_props.txt" with
  wget -q https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt --no-check-certificate
  and link its abs path to https://github.com/sokrypton/ColabFold/blob/main/colabfold/batch.py#L215
- Fix bug in https://github.com/sokrypton/ColabFold/blob/main/colabfold/batch.py#L259 -> Should be relaxed_pdb_lines, not unrelaxed_pdb_lines
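The FileNotFoundError above comes from opening 'stereo_chemical_props.txt' as a relative path, so whether it is found depends on the current working directory. A small sketch for resolving the downloaded file to an absolute path before pointing the code at it. `resolve_stereo_props` is a hypothetical helper, not a ColabFold function, and the temporary directory below only stands in for wherever you saved the file.

```python
import tempfile
from pathlib import Path


def resolve_stereo_props(directory, filename="stereo_chemical_props.txt"):
    """Return the absolute path of the file, or raise if it is missing.

    Hypothetical helper: fails early with a clear message instead of
    deep inside the relax step.
    """
    path = Path(directory).expanduser().resolve() / filename
    if not path.is_file():
        raise FileNotFoundError(f"{filename} not found in {path.parent}")
    return str(path)


# Demo with a stand-in directory; replace it with your download location.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "stereo_chemical_props.txt").write_text("placeholder\n")
    found = resolve_stereo_props(tmp)
    found_is_absolute = Path(found).is_absolute()

    try:
        resolve_stereo_props(tmp, filename="missing.txt")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(found_is_absolute, missing_raised)
```

The returned string is what would be linked at the batch.py location mentioned in the second bullet.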