DHS-LLM-Workshop
Eval loop seems to run forever
Hello author,
Thanks for your tutorial.
I am using the hf-codegen-v2 dataset, which has 370k rows.
The validation set has about 1,850 samples and the batch size is 4; the other parameters are the same as in run_peft.sh.
The training speed is normal, but the eval loop seems to run forever.
Below is the evaluation log:
{'eval_loss': 0.1817416101694107, 'eval_runtime': 6666.0306, 'eval_samples_per_second': 8.072, 'eval_steps_per_second': 2.018, 'epoch': 0.5}
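As a quick back-of-the-envelope check on these numbers (a sketch only, assuming eval_samples_per_second in the log is the overall throughput rather than a per-device figure), the reported runtime implies far more samples were processed than the ~1,850 in my validation set:

```python
# Figures taken directly from the eval log above.
eval_runtime = 6666.0306      # seconds
samples_per_second = 8.072

# Expected validation set size from my setup.
validation_size = 1850

# Samples the log implies were actually evaluated.
implied_samples = eval_runtime * samples_per_second

# Runtime I would have expected for 1,850 samples at this throughput.
expected_runtime = validation_size / samples_per_second

print(f"implied samples evaluated: {implied_samples:,.0f}")   # ~53,800
print(f"expected runtime: {expected_runtime:.0f} s")          # ~229 s
```

So at the logged throughput, 1,850 samples should finish in under four minutes, not ~1.85 hours, which is part of why I suspect something is off.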
I am not sure if this is normal.
Any help will be appreciated!
ZD