ProtSSN

Problem when using run_pt.py

Open Melonoyk opened this issue 1 year ago • 3 comments

I ran into a problem when testing the pre-training part of your code. I ran: bash script/run_pt.sh, following the Start Training section of the README, and the process hangs at Epoch 1/100. Eventually the process is forcibly killed. I also tried interrupting the process and found that it was stuck reading the length of the dataloader. I wonder whether this is because my hardware (an RTX 3090) cannot support pre-training. Looking forward to your reply.

The output is as follows: image

image

Melonoyk avatar Apr 23 '24 07:04 Melonoyk

Hi Ruiwen, sorry for the late reply. I have just tested the pre-training code (RTX 3090) and did not run into any errors. Can you provide more information? Thanks. image

tyang816 avatar Apr 25 '24 06:04 tyang816

Hi Yang, some of the packages I use are not the same versions as those listed in environment.yaml, which may be the cause of this issue. When I run: conda env create -f environment.yaml, the process shuts down partway through. I fixed the problem by rewriting the BatchSampler function, because I found that dataset loading gets stuck while loading the first item of the dataset into the sampler, although I don't know why this happens. Finally, thank you for testing!

image
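For readers hitting the same hang, here is a minimal sketch of a batch sampler that works purely from index arithmetic, so neither `len()` nor iteration ever reads a dataset item. It is not the rewrite described above and is not taken from the ProtSSN code base: the class name `IndexOnlyBatchSampler` and all of its parameters are hypothetical, and it assumes fixed-size batches, which may differ from how ProtSSN builds its batches.

```python
import random
from typing import Iterator, List


class IndexOnlyBatchSampler:
    """Yield batches of dataset indices without touching the dataset itself.

    Hypothetical helper for illustration; not part of ProtSSN.
    """

    def __init__(self, num_samples: int, batch_size: int,
                 shuffle: bool = True, drop_last: bool = False,
                 seed: int = 0) -> None:
        if num_samples < 0 or batch_size <= 0:
            raise ValueError("num_samples must be >= 0 and batch_size > 0")
        self.num_samples = num_samples
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.drop_last = drop_last
        self.seed = seed
        self.epoch = 0  # changes the shuffle order on each pass

    def __len__(self) -> int:
        # Pure arithmetic: no dataset access is needed to count batches.
        if self.drop_last:
            return self.num_samples // self.batch_size
        return (self.num_samples + self.batch_size - 1) // self.batch_size

    def __iter__(self) -> Iterator[List[int]]:
        indices = list(range(self.num_samples))
        if self.shuffle:
            random.Random(self.seed + self.epoch).shuffle(indices)
            self.epoch += 1
        for start in range(0, self.num_samples, self.batch_size):
            batch = indices[start:start + self.batch_size]
            if len(batch) < self.batch_size and self.drop_last:
                break  # skip the incomplete final batch
            yield batch
```

Because `torch.utils.data.DataLoader` accepts any iterable of index lists as `batch_sampler`, an instance can be passed as `DataLoader(dataset, batch_sampler=IndexOnlyBatchSampler(len(dataset), 8))`. In that case `batch_size`, `shuffle`, `sampler`, and `drop_last` must be left unset on the `DataLoader`. This only avoids the hang if `len(dataset)` is itself cheap, which is an assumption here.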

Melonoyk avatar Apr 25 '24 15:04 Melonoyk

Great! I will check the environment file soon.

tyang816 avatar Apr 28 '24 05:04 tyang816