esm
Linux process status is D
After running the fold.py script for about 3 hours, the process stops making progress and top reports its state as D (uninterruptible sleep). I am not sure of the real cause of the bug, but here is my guess.
In fold.py, this may be because data_reader is implemented as a plain iterator rather than with PyTorch's DataLoader, so the heavy I/O operations leave the process stuck. extract.py, by contrast, uses PyTorch's DataLoader to load the sequences all at once.
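One way to check this guess is to ask the kernel directly what state the process is in, rather than relying on top. The snippet below is only a minimal sketch, assuming a Linux machine with /proc mounted; parse_state and process_state are hypothetical helper names and are not part of esm.

```python
from pathlib import Path


def parse_state(status_text):
    """Return the one-letter state code from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tD (disk sleep)"; keep only the letter.
            return line.split(":", 1)[1].strip()[0]
    return None


def process_state(pid):
    """Read the current state of a running process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/status").read_text())
```

Calling process_state(pid) with the PID of the stuck fold.py run a few times, some seconds apart, shows whether it stays in D or only passes through it briefly. A process that stays in D is waiting inside the kernel, usually on I/O.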
Here is some example code for resumable prediction, in case anyone needs it. Put this code block after line 134 in scripts/fold.py.
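    # Skip sequences whose .pdb output already exists, so an interrupted run can resume.
    # Needs "import os" at the top of fold.py if it is not already imported.
    if os.path.exists(args.pdb):
        predicted_proteins = set()
        for filename in os.listdir(args.pdb):
            if filename.endswith(".pdb"):
                protein_name = filename[:-4]  # strip the ".pdb" extension
                predicted_proteins.add(protein_name)
        # Keep only the sequences that have not been predicted yet.
        all_sequences = [(name, sequence) for (name, sequence) in all_sequences if name not in predicted_proteins]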
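The same idea can also be written as a small standalone function, which is easier to try out on its own before patching fold.py. This is only a sketch under the same assumption as the snippet above: each prediction is saved as <name>.pdb in the output directory. filter_unpredicted is a hypothetical name, not something that exists in esm.

```python
from pathlib import Path


def filter_unpredicted(all_sequences, pdb_dir):
    """Return the (name, sequence) pairs that have no <name>.pdb in pdb_dir."""
    pdb_dir = Path(pdb_dir)
    if not pdb_dir.is_dir():
        # Nothing has been written yet, so every sequence still needs folding.
        return list(all_sequences)
    # Path.stem drops only the final ".pdb", so names containing dots survive.
    done = {path.stem for path in pdb_dir.glob("*.pdb")}
    return [(name, seq) for name, seq in all_sequences if name not in done]
```

In fold.py this would be used as all_sequences = filter_unpredicted(all_sequences, args.pdb). Note that a .pdb file left half-written by a killed run still counts as done here, so it may be worth deleting the most recent output file before resuming.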