
Difference in results between locally run AlphaFold and AlphaFold Colab

danny2551515 opened this issue 1 year ago • 9 comments

Hello. I am making structural predictions with both the Colab version and the local version of AlphaFold, and I have confirmed that there is a big difference between the two. To reduce this difference, I ran AlphaFold locally with --db_preset=reduced_dbs and --max_template_date=1000-01-01, but there is still a big difference. Is it possible to get results from local AlphaFold that are similar to the AlphaFold Colab version?

danny2551515 avatar Sep 26 '22 02:09 danny2551515

Hi, yes, it is possible to get different results, as the two setups are different environments and the model is stochastic. Could you please provide full reproducibility details about your run? Also, have you updated to the latest version?

Htomlinson14 avatar Sep 26 '22 10:09 Htomlinson14

Hello Htomlinson14

Sorry about the late response.

I normally use AlphaFold v2.2.2, and after reading your comment I also ran the prediction with v2.2.4, the latest version.

I carried out the prediction under the following four conditions (a combined launch sketch is shown after the list):

1. Full DB, templates enabled

python3 /data/AF/alphafold/docker/run_docker.py \
  --fasta_paths=${fasta_file_path} \
  --max_template_date=3000-01-01 \
  --model_preset=monomer_ptm \
  --data_dir=/data/AF/AFDB/ \
  --docker_user=0 \
  --gpu_devices=0 \
  --output_dir=/data/result/

2. Reduced DB, templates enabled

python3 /data/AF/alphafold/docker/run_docker.py \
  --fasta_paths=${fasta_file_path} \
  --max_template_date=3000-01-01 \
  --model_preset=monomer_ptm \
  --data_dir=/data/AF/AFDB/ \
  --docker_user=0 \
  --gpu_devices=0 \
  --output_dir=/data/result/ \
  --db_preset=reduced_dbs

3. Full DB, templates disabled

python3 /data/AF/alphafold/docker/run_docker.py \
  --fasta_paths=${fasta_file_path} \
  --max_template_date=1000-01-01 \
  --model_preset=monomer_ptm \
  --data_dir=/data/AF/AFDB/ \
  --docker_user=0 \
  --gpu_devices=0 \
  --output_dir=/data/result/

4. Reduced DB, templates disabled

python3 /data/AF/alphafold/docker/run_docker.py \
  --fasta_paths=${fasta_file_path} \
  --max_template_date=1000-01-01 \
  --model_preset=monomer_ptm \
  --data_dir=/data/AF/AFDB/ \
  --docker_user=0 \
  --gpu_devices=0 \
  --output_dir=/data/result/ \
  --db_preset=reduced_dbs
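For reference, this is roughly how the four runs can be launched in one go (a rough Python sketch; the FASTA path and the output sub-directory names are placeholders for illustration, not from my actual setup):

import subprocess

# Placeholder paths; adjust to your setup.
FASTA = "/path/to/target.fasta"
RUN_DOCKER = "/data/AF/alphafold/docker/run_docker.py"

# (db_preset, max_template_date, output sub-directory name)
CONDITIONS = [
    ("full_dbs",    "3000-01-01", "full_templates"),
    ("reduced_dbs", "3000-01-01", "reduced_templates"),
    ("full_dbs",    "1000-01-01", "full_no_templates"),
    ("reduced_dbs", "1000-01-01", "reduced_no_templates"),
]

for db_preset, max_template_date, name in CONDITIONS:
    cmd = [
        "python3", RUN_DOCKER,
        f"--fasta_paths={FASTA}",
        f"--max_template_date={max_template_date}",
        "--model_preset=monomer_ptm",
        f"--db_preset={db_preset}",
        "--data_dir=/data/AF/AFDB/",
        "--docker_user=0",
        "--gpu_devices=0",
        f"--output_dir=/data/result/{name}",
    ]
    # Runs each condition sequentially; raises if any run fails.
    subprocess.run(cmd, check=True)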

Under these four conditions, the five per-model predictions I obtained from both v2.2.2 and v2.2.4 were all very different from the prediction from AlphaFold Colab.

Are you aware of what factors might be causing these differences?

Like I asked earlier, how can I obtain similar predictions from local AlphaFold and AlphaFold Colab?

danny2551515 avatar Sep 28 '22 06:09 danny2551515

Hi, can you please report the machine (GPU) you are using, and also the iptm+ptm results for each of these runs?

Htomlinson14 avatar Sep 28 '22 11:09 Htomlinson14

You may also wish to follow the discussions on https://github.com/deepmind/alphafold/issues/597

Htomlinson14 avatar Sep 28 '22 13:09 Htomlinson14

The average pLDDT value from AlphaFold Colab was as follows:

AlphaFold Colab: avg_plddt = 67.1059722222222
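In case it is useful, an average pLDDT can be computed from a result pickle roughly like this (the path is only an example; as far as I know, each result_model_*.pkl stores a per-residue "plddt" array):

import pickle
import numpy as np

# Example path only; any result_model_*.pkl written by a local AlphaFold run.
RESULT_PKL = "/data/result/target/result_model_1_ptm_pred_0.pkl"

with open(RESULT_PKL, "rb") as f:
    result = pickle.load(f)

plddt = np.asarray(result["plddt"])  # per-residue confidence on a 0-100 scale
print("average pLDDT:", float(plddt.mean()))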

The machine running AlphaFold locally uses a Quadro RTX 8000 GPU. With AlphaFold v2.2.2 and v2.2.4, the predictions we obtained under the four conditions are as follows (a small snippet for extracting these values is shown after the numbers):

1. Full DB, max_template_date = 3000-01-01

v2.2.4: "plddts": { "model_1_ptm_pred_0": 97.25193926550712, "model_2_ptm_pred_0": 97.18683996607362, "model_3_ptm_pred_0": 54.927574842232, "model_4_ptm_pred_0": 58.54845161102359, "model_5_ptm_pred_0": 43.06532542208311 }

v2.2.2: "plddts": { "model_1_ptm_pred_0": 97.36867929339586, "model_2_ptm_pred_0": 97.21677934497643, "model_3_ptm_pred_0": 50.224661763080945, "model_4_ptm_pred_0": 39.979508179484505, "model_5_ptm_pred_0": 44.549697156675734 }

2. Full DB, max_template_date = 1000-01-01

v2.2.4: "plddts": { "model_1_ptm_pred_0": 43.664334482582525, "model_2_ptm_pred_0": 48.08251340941516, "model_3_ptm_pred_0": 61.68443432122087, "model_4_ptm_pred_0": 48.93973191788411, "model_5_ptm_pred_0": 39.43275963009237 }

v2.2.2: "plddts": { "model_1_ptm_pred_0": 58.981938345566, "model_2_ptm_pred_0": 46.08298427146891, "model_3_ptm_pred_0": 58.97027318035094, "model_4_ptm_pred_0": 55.71557161934546, "model_5_ptm_pred_0": 44.21769917365889 }

3. Reduced DB, max_template_date = 3000-01-01

v2.2.4: "plddts": { "model_1_ptm_pred_0": 97.26357937406594, "model_2_ptm_pred_0": 97.13925092424562, "model_3_ptm_pred_0": 40.45697902340831, "model_4_ptm_pred_0": 41.859432024373355, "model_5_ptm_pred_0": 40.82802636714861 }

v2.2.2: "plddts": { "model_1_ptm_pred_0": 97.46789556103246, "model_2_ptm_pred_0": 97.30054276924895, "model_3_ptm_pred_0": 53.98386122906149, "model_4_ptm_pred_0": 45.08604272171022, "model_5_ptm_pred_0": 38.54566059335381 }

4. Reduced DB, max_template_date = 1000-01-01

v2.2.4: "plddts": { "model_1_ptm_pred_0": 48.53682881088287, "model_2_ptm_pred_0": 49.97295933519744, "model_3_ptm_pred_0": 55.95563452058329, "model_4_ptm_pred_0": 54.111995563263015, "model_5_ptm_pred_0": 42.91102593340323 }

v2.2.2: "plddts": { "model_1_ptm_pred_0": 43.48648526660625, "model_2_ptm_pred_0": 52.457455662168826, "model_3_ptm_pred_0": 48.408318636112455, "model_4_ptm_pred_0": 61.21301963988726, "model_5_ptm_pred_0": 52.85276750489996 }
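These numbers were copied from each run's ranking_debug.json; a rough snippet like the one below collects them (the output directory names are placeholders for illustration):

import json
from pathlib import Path

# Placeholder output directories for the four runs; adjust to your layout.
RUN_DIRS = [
    "/data/result/full_templates/target",
    "/data/result/full_no_templates/target",
    "/data/result/reduced_templates/target",
    "/data/result/reduced_no_templates/target",
]

for run_dir in RUN_DIRS:
    # ranking_debug.json maps each model name to its pLDDT-based ranking score.
    ranking = json.loads((Path(run_dir) / "ranking_debug.json").read_text())
    print(run_dir)
    for model_name, plddt in ranking["plddts"].items():
        print(f"  {model_name}: {plddt:.2f}")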

danny2551515 avatar Sep 30 '22 06:09 danny2551515

Hi - thanks very much for providing these values. I can't say exactly what is happening here, but a few things to consider:

  • The dictionary keys say e.g. "model_1_ptm_pred_0" but the values are listed as pLDDT. Are these definitely pLDDT and not pTM? (A snippet for pulling the pTM scores out of the result pickles is shown after this list.)
  • It seems that a newer template DB is being used locally than what is searched against in the Colab. This could be why the local runs with templates score much higher, as there could be near-exact hits. We don't recommend turning off templates to achieve reproducibility.
  • Predictions with low confidence (40-60 pLDDT is fairly low) are often highly variable, so it's hard to compare results with high precision in this range.
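On the first point: if I remember the output format correctly, each *_ptm result pickle should contain a scalar "ptm" score alongside the per-residue "plddt" array, so you can check both. A rough sketch (the path is illustrative):

import pickle

# Illustrative path to one of the monomer_ptm result pickles.
RESULT_PKL = "/data/result/target/result_model_1_ptm_pred_0.pkl"

with open(RESULT_PKL, "rb") as f:
    result = pickle.load(f)

# Scalar predicted TM-score in [0, 1]; the "plddt" array is also in this dict.
print("pTM:", float(result["ptm"]))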

Htomlinson14 avatar Sep 30 '22 15:09 Htomlinson14

Yes, when I checked the ranking_debug.json file, the values were labelled as plddts. Should I check the pTM values in the .pkl files and report them?

danny2551515 avatar Sep 30 '22 15:09 danny2551515

Ok cool. I think in this case the second and third bullets above are most relevant, particularly the comments on the potential impact of newer templates. Thanks!

Htomlinson14 avatar Sep 30 '22 15:09 Htomlinson14

Hi @Htomlinson14,

I have kept trying to match the results of the local version and the Colab version.

I saw the advice in issue #126 to disable HHblits on UniClust, in addition to adjusting the db_preset and templates.

Could you tell me how to disable HHblits?

danny2551515 avatar Oct 10 '22 23:10 danny2551515