Problems with compressing results in AlphaFold2_batch and when checking that a job is completed

Open EnzoAndree opened this issue 2 years ago • 2 comments

Hello team!

First of all, AlphaFold2_batch.ipynb does not compress the results once it finishes in the Colab version. I think that functionality was lost at some point. Or is it just me?

Also, I think I found a bug. When AlphaFold2_batch checks whether a job was already completed (line 566), it looks for a file named f"{jobname}_unrelaxed_model_{num_models}.pdb". However, the output format is now f"{jobname}_unrelaxed_model_{num_models}_rank_{bestrank}.pdb", so the file is never found and the job is analysed again.

A possible solution could be the following:

# glob pattern (not a regular expression): "*" matches the rank suffix
last_pdb_file = f"{jobname}_unrelaxed_model_{num_models}_rank*.pdb"
# list of all files that match the pattern; empty if there are none
pdb_file = list(result_dir.glob(last_pdb_file))
if keep_existing_results and len(pdb_file) != 0:
    logger.info(f"Skipping {jobname} (pdb)")
    continue
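
For anyone who wants to try the check outside the notebook, here is a minimal standalone sketch of the same idea. The helper name job_already_done and the temporary directory are assumptions made for illustration and are not part of ColabFold. It also escapes the job name with glob.escape, on the assumption that a job name could contain glob special characters such as "[".

```python
import glob
import tempfile
from pathlib import Path


def job_already_done(result_dir: Path, jobname: str, num_models: int) -> bool:
    """Return True if a ranked PDB file for the last model already exists.

    Hypothetical helper for illustration; not part of ColabFold.
    """
    # Escape the job name so characters like "[" are matched literally,
    # then let "*" match the rank suffix (for example "_1").
    pattern = f"{glob.escape(jobname)}_unrelaxed_model_{num_models}_rank*.pdb"
    # any() stops at the first match instead of building the full list.
    return any(result_dir.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    result_dir = Path(tmp)
    # Simulate a finished job by creating an empty file in the new format.
    (result_dir / "jobA_unrelaxed_model_5_rank_1.pdb").touch()
    done_a = job_already_done(result_dir, "jobA", 5)  # file exists
    done_b = job_already_done(result_dir, "jobB", 5)  # no file for jobB
    print(done_a, done_b)
```

Inside the notebook loop the helper would replace the len(pdb_file) != 0 test; the rest of the skip logic stays the same.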

I have not opened a pull request because there may be a better solution.

Best regards!

EnzoAndree avatar Nov 16 '21 04:11 EnzoAndree

Thank you for reporting this and for the fix! You are also right: we used to store the results as zip files, and now the files are left unzipped. @konstin is currently working on this issue.
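
Until that lands, results can be bundled by hand. The following is only a rough sketch under stated assumptions, not the notebook's actual implementation: the helper name zip_results and the archive name f"{jobname}.result.zip" are made up for illustration, and it assumes every result file of a job starts with f"{jobname}_".

```python
import tempfile
import zipfile
from pathlib import Path


def zip_results(result_dir: Path, jobname: str) -> Path:
    """Bundle all files named '<jobname>_*' into one zip archive.

    Hypothetical helper; the archive name is an assumption.
    """
    archive = result_dir / f"{jobname}.result.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # The pattern requires "_" after the job name, so the archive
        # itself ("<jobname>.result.zip") is never added to the zip.
        for path in sorted(result_dir.glob(f"{jobname}_*")):
            if path.is_file():
                # Store only the file name, not the full directory path.
                zf.write(path, arcname=path.name)
    return archive


with tempfile.TemporaryDirectory() as tmp:
    result_dir = Path(tmp)
    (result_dir / "jobA_unrelaxed_model_1_rank_1.pdb").write_text("MODEL\n")
    (result_dir / "jobA_unrelaxed_model_2_rank_2.pdb").write_text("MODEL\n")
    (result_dir / "jobB_unrelaxed_model_1_rank_1.pdb").write_text("MODEL\n")
    archive = zip_results(result_dir, "jobA")
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
    print(names)
```

Only the jobA files end up in the archive; files of other jobs in the same directory are left alone.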

martin-steinegger avatar Nov 17 '21 08:11 martin-steinegger

I've implemented this again. Until we merge it into main, you can try it at https://colab.research.google.com/github/konstin/ColabFold/blob/main/batch/AlphaFold2_batch.ipynb

konstin avatar Nov 20 '21 16:11 konstin