openfold
A big gap between the PDB results from OpenFold and AlphaFold2 (Colab)
Hi! After running OpenFold with the default configuration, I found a big difference between the output PDB file from OpenFold and the one from AlphaFold (Colab version). After comparing both against the real structural data from the laboratory, I found that the AlphaFold result was the accurate one. As shown below, I basically used the default configuration. What might cause this? I would be very grateful for your answers.
The command I ran:
python3 run_pretrained_openfold.py \
run_fasta \
data/pdb_mmcif/mmcif_files/ \
--uniref90_database_path data/uniref90/uniref90.fasta \
--mgnify_database_path data/mgnify/mgy_clusters_2018_12.fa \
--pdb70_database_path data/pdb70/pdb70 \
--uniclust30_database_path data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--output_dir ./ \
--bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--model_device "cuda:1" \
--jackhmmer_binary_path lib/conda/envs/openfold_venv/bin/jackhmmer \
--hhblits_binary_path lib/conda/envs/openfold_venv/bin/hhblits \
--hhsearch_binary_path lib/conda/envs/openfold_venv/bin/hhsearch \
--kalign_binary_path lib/conda/envs/openfold_venv/bin/kalign \
--config_preset "model_1_ptm" \
--openfold_checkpoint_path openfold/resources/openfold_params/finetuning_ptm_2.pt
My output:
INFO:/data/xx/openfold/openfold/utils/script_utils.py:Loaded OpenFold parameters at openfold/resources/openfold_params/finetuning_ptm_2.pt...
INFO:/data/xx/openfold/run_pretrained_openfold.py:Using precomputed alignments for at1 at ./alignments...
INFO:/data/xx/openfold/openfold/utils/script_utils.py:Running inference for at1...
INFO:/data/xx/openfold/openfold/utils/script_utils.py:Inference time: 20.264450896997005
INFO:/data/xx/openfold/run_pretrained_openfold.py:Output written to ./predictions/at1_model_1_ptm_unrelaxed.pdb...
INFO:/data/xx/openfold/run_pretrained_openfold.py:Running relaxation on ./predictions/at1_model_1_ptm_unrelaxed.pdb...
WARNING:root:Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
INFO:/data/xx/openfold/openfold/utils/script_utils.py:Relaxation time: 14.143211493967101
INFO:/data/xx/openfold/openfold/utils/script_utils.py:Relaxed output written to ./predictions/at1_model_1_ptm_relaxed.pdb...
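One way to put a number on the gap described above is a distance RMSD (dRMSD) over C-alpha atoms, which compares the internal pairwise distances of two structures and so needs no superposition. The sketch below is only an illustration and is not part of OpenFold: the function names are made up for this example, and it assumes both PDB files use the same chain IDs and residue numbering. Note that dRMSD cannot tell a structure from its mirror image.

```python
import math


def read_ca_coords(pdb_text):
    """Return {(chain, resnum): (x, y, z)} for C-alpha atoms in PDB-format text."""
    coords = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB fields: atom name 13-16, chain 22, resSeq 23-26, x/y/z 31-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords.setdefault(key, xyz)  # keep the first alternate location only
    return coords


def distance_rmsd(coords_a, coords_b):
    """Distance RMSD over all C-alpha pairs present in both structures."""
    shared = sorted(set(coords_a) & set(coords_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared residues")
    total, n_pairs = 0.0, 0
    for i in range(len(shared)):
        for j in range(i + 1, len(shared)):
            d_a = math.dist(coords_a[shared[i]], coords_a[shared[j]])
            d_b = math.dist(coords_b[shared[i]], coords_b[shared[j]])
            total += (d_a - d_b) ** 2
            n_pairs += 1
    return math.sqrt(total / n_pairs)
```

To use it, read each file with `open(path).read()`, pass the text to `read_ca_coords`, and hand the two dictionaries to `distance_rmsd`; the result is in ångströms.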
@Melo-1017 Hello! Have you found a solution to your problem?
How did you generate the alignments mentioned in "Using precomputed alignments for at1 at ./alignments..."? With run_pretrained_openfold.py, or some other way?
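Since the log shows precomputed alignments being reused, one thing worth ruling out is that those alignments are shallow or incomplete, because prediction quality depends heavily on MSA depth. The following is a rough sketch, not an OpenFold utility: the function names are hypothetical, and it assumes the alignment directory for a target holds `.a3m` and/or Stockholm `.sto` files.

```python
from pathlib import Path


def count_a3m_sequences(text):
    """Count sequences in A3M/FASTA-style text (one '>' header per sequence)."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def count_sto_sequences(text):
    """Count distinct sequence names in Stockholm text, skipping markup lines."""
    names = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#") or line.startswith("//"):
            continue
        names.add(line.split()[0])  # first token on an alignment line is the name
    return len(names)


def msa_depths(alignment_dir):
    """Map each .a3m/.sto file name in alignment_dir to its sequence count."""
    depths = {}
    for path in sorted(Path(alignment_dir).iterdir()):
        if path.suffix == ".a3m":
            depths[path.name] = count_a3m_sequences(path.read_text())
        elif path.suffix == ".sto":
            depths[path.name] = count_sto_sequences(path.read_text())
    return depths
```

For example, `msa_depths("./alignments/at1")` would report the depth per file, assuming the alignments for `at1` sit in a subdirectory of that name (the log only shows `./alignments`). Counts close to 1 would suggest the databases were not actually searched.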
I encountered the same issue when running multimer inference for the provided example, 2q2k, using OpenFold-Multimer. It didn’t perform as well as AlphaFold-Multimer (Colab version). I used the alignments provided here: https://github.com/aqlaboratory/openfold/tree/main/tests/test_data/alignments.
My command:
python run_pretrained_openfold.py \
tests/test_data/2q2k/ \
data/pdb_mmcif/mmcif_files/ \
--uniref90_database_path data/uniref90/uniref90.fasta \
--mgnify_database_path data/mgniy/mgy_clusters_2022_05.fa \
--pdb_seqres_database_path data/pdb_seqres/pdb_seqres.txt \
--uniref30_database_path data/uniref30/UniRef30_2021_03 \
--uniprot_database_path data/uniprot/uniprot.fasta \
--bfd_database_path databfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--jackhmmer_binary_path $CONDA_PREFIX/bin/jackhmmer \
--hhblits_binary_path $CONDA_PREFIX/bin/hhblits \
--hmmsearch_binary_path $CONDA_PREFIX/bin/hmmsearch \
--hmbuild_binary_path $CONDA_PREFIX/bin/hmmbuild \
--kalign_binary_path $CONDA_PREFIX/bin/kalign \
--config_preset "model_1_multimer_v3" \
--model_device "cuda:0" \
--use_precomputed_alignments tests/test_data/2q2k/out/alignments/ \
--output_dir tests/test_data/2q2k/out
Hi @wtq18
In your command I couldn't find any path pointing to either AlphaFold's neural network weights or your own pre-trained OpenFold Multimer checkpoint file. I'm afraid you have effectively run the prediction with a randomly initialised neural net, which naturally won't yield good results.
Yours, Dingquan
According to the script, I'm using these weights: openfold/resources/params/params_model_1_multimer_v3.npz. Alternatively, could you provide the prediction result in the example folder? The relevant lines from the script:
    if args.jax_param_path is None and args.openfold_checkpoint_path is None:
        args.jax_param_path = os.path.join(
            "openfold", "resources", "params",
            "params_" + args.config_preset + ".npz"
        )
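To double-check which weights a given command would pick up, the fallback quoted above can be reproduced outside the script. This is a minimal sketch under assumptions: `resolve_weight_paths` and `report_weights` are hypothetical helpers, not OpenFold functions, and the sketch covers only the quoted condition, so it says nothing about what the script does when both paths are supplied.

```python
import os


def resolve_weight_paths(config_preset, jax_param_path=None,
                         openfold_checkpoint_path=None):
    """Return the weight paths in effect after applying the quoted fallback."""
    if jax_param_path is None and openfold_checkpoint_path is None:
        # Same default as in the quoted snippet: params_<preset>.npz
        jax_param_path = os.path.join(
            "openfold", "resources", "params",
            "params_" + config_preset + ".npz",
        )
    return [p for p in (jax_param_path, openfold_checkpoint_path) if p is not None]


def report_weights(config_preset, **paths):
    """Pair each resolved path with whether it exists on disk."""
    return [(p, os.path.exists(p))
            for p in resolve_weight_paths(config_preset, **paths)]
```

Running `report_weights("model_1_multimer_v3")` from the repository root would show whether the default `.npz` file is actually present, which separates a missing-weights problem from other causes.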