CASMcode
VASP Freeze error
Dear CASM developers,
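CASM seems to track how long each relaxation loop takes, and if the time since the most recent VASP output exceeds a certain threshold (5.0 times the slowest loop, according to the log below), it decides that VASP is frozen and kills the job. Is that right?
Is there anything I can do about it?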
> Begin vasp run:
> jobdir: /pscratch/sd/v/voz5005/RuO2/casm/211/srun64/casm/training_data/SCEL1_1_1_1_0_0_0/1/calctype.default/run.0/
> exec: srun -n 256 -c 1 --cpu-bind=cores vasp_std
> Most recent file output (std.out): 9.042484521865845 seconds ago.
> Most recent file output (OUTCAR): 11.093905210494995 seconds ago.
> Most recent file output (OUTCAR): 9.144774436950684 seconds ago.
> Most recent file output (OUTCAR): 14.193707704544067 seconds ago.
> Most recent file output (OUTCAR): 13.24203872680664 seconds ago.
> Most recent file output (OUTCAR): 46.290996074676514 seconds ago.
> slowest_loop: 8.0395
> 5.0*slowest_loop: 40.197500000000005
> most_recent: 46.290996074676514
> VASP is frozen, killing job
> Run complete
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "/global/homes/v/voz5005/.conda/envs/casm/lib/python3.9/site-packages/casm/vaspwrapper/relax.py", line 122, in run
> super(Relax, self).run()
> File "/global/homes/v/voz5005/.conda/envs/casm/lib/python3.9/site-packages/casm/vaspwrapper/vasp_calculator_base.py", line 542, in run
> (status, task) = calculation.run()
> File "/global/homes/v/voz5005/.conda/envs/casm/lib/python3.9/site-packages/casm/vasp/relax.py", line 334, in run
> self.add_errdir()
> File "/global/homes/v/voz5005/.conda/envs/casm/lib/python3.9/site-packages/casm/vasp/relax.py", line 138, in add_errdir
> os.rename(self.rundir[-1],
> OSError: [Errno 22] Invalid argument: '/pscratch/sd/v/voz5005/RuO2/casm/211/srun64/casm/training_data/SCEL1_1_1_1_0_0_0/1/calctype.default/run.0/' -> '/pscratch/sd/v/voz5005/RuO2/casm/211/srun64/casm/training_data/SCEL1_1_1_1_0_0_0/1/calctype.default/run.0/_err.0'
> Found errors: FreezeError
I don't think there is a configuration option for this in the calc.json file, but you can edit the conditions under which the error is detected in the function that implements the check. It's in the .../site-packages/casm/vasp/error.py file.
I made this change in my branch if you want to copy it: https://github.com/xivh/CASMpython/commit/e6505fcf4746c4074a5de670824855690b63c138
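For reference, here is a minimal sketch of the check as it appears in the log output above (most_recent compared against 5.0*slowest_loop). This is not the actual code from error.py or from the linked commit. The function name `is_frozen` and the `factor` and `min_wait` parameters are hypothetical, chosen only to illustrate how raising the multiplier or adding a floor would change the outcome.

```python
def is_frozen(most_recent, slowest_loop, factor=5.0, min_wait=0.0):
    """Hypothetical freeze check modeled on the log output, not CASM's code.

    most_recent  -- seconds since the last write to the output files
    slowest_loop -- duration in seconds of the slowest loop seen so far
    factor       -- multiplier applied to slowest_loop (the log shows 5.0)
    min_wait     -- assumed floor in seconds below which a job is never
                    declared frozen (not an existing CASM option)
    """
    threshold = max(factor * slowest_loop, min_wait)
    return most_recent > threshold


# Values taken from the log above: 46.29 s > 5.0 * 8.0395 s, so "frozen".
print(is_frozen(46.290996074676514, 8.0395))

# With an assumed 300 s floor, the same run would not be killed.
print(is_frozen(46.290996074676514, 8.0395, min_wait=300.0))
```

With the values from the log, the first call prints True and the second prints False. A floor like this is one way to avoid false positives when the loops are very short, as in the run above where the slowest loop took only about 8 seconds.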