
Resuming training with Ranger21?

Open neuronflow opened this issue 2 years ago • 3 comments

As I've learned, Ranger21 handles learning-rate scheduling (among other things) internally.

How should training be resumed? Is there a state dict that needs to be saved and loaded?

neuronflow avatar Jul 07 '21 07:07 neuronflow

Hi @neuronflow, thanks for opening the issue! Ranger21 does maintain a basic state dict, but we definitely need to extend it with some additional data to ensure a clean restart if training is stopped. Let me use this issue to track it. This has been on my todo list, and I'll test and fix it, ideally in the next few days.
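To illustrate why the extra data matters, here is a minimal, pure-Python sketch. It is not Ranger21's code or API: `ToyScheduledOptimizer`, its warmup rule, and its `state_dict` keys are all hypothetical stand-ins. The point is only that an optimizer with a built-in schedule has to checkpoint its step counter alongside its buffers, otherwise a resumed run replays the warmup and diverges from the uninterrupted one.

```python
import json


class ToyScheduledOptimizer:
    """Hypothetical optimizer with a built-in LR schedule (NOT Ranger21)."""

    def __init__(self, base_lr=0.1, warmup_steps=5):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.step_count = 0   # drives the internal schedule
        self.momentum = 0.0   # stand-in for per-parameter buffers

    def current_lr(self):
        # Linear warmup up to base_lr over the first warmup_steps steps.
        scale = min(1.0, (self.step_count + 1) / self.warmup_steps)
        return self.base_lr * scale

    def step(self, grad):
        # Momentum update using the scheduled LR; returns the parameter delta.
        self.momentum = 0.9 * self.momentum + grad
        delta = -self.current_lr() * self.momentum
        self.step_count += 1
        return delta

    def state_dict(self):
        # Everything a restart needs: buffers AND the schedule position.
        return {"step_count": self.step_count, "momentum": self.momentum}

    def load_state_dict(self, state):
        self.step_count = state["step_count"]
        self.momentum = state["momentum"]


def train(opt, param, grads):
    # Apply one optimizer step per gradient to a single scalar parameter.
    for g in grads:
        param += opt.step(g)
    return param


grads = [1.0, 0.5, -0.25, 0.75, 1.0, 0.5, -0.5, 0.25]

# Reference: one uninterrupted run over all steps.
uninterrupted = train(ToyScheduledOptimizer(), 0.0, grads)

# Interrupted run: checkpoint after 4 steps (serialized here as JSON).
opt = ToyScheduledOptimizer()
param = train(opt, 0.0, grads[:4])
checkpoint = json.dumps({"param": param, "optimizer": opt.state_dict()})

# Clean resume: restore the parameter and the full optimizer state.
restored = json.loads(checkpoint)
resumed_opt = ToyScheduledOptimizer()
resumed_opt.load_state_dict(restored["optimizer"])
resumed = train(resumed_opt, restored["param"], grads[4:])

# Naive resume: parameter restored, optimizer state lost (warmup restarts).
naive = train(ToyScheduledOptimizer(), restored["param"], grads[4:])
```

In this toy setup `resumed` matches `uninterrupted`, while `naive` does not, because the fresh optimizer restarts its warmup and loses its momentum buffer. With a real PyTorch optimizer the analogous pattern would be saving `optimizer.state_dict()` next to the model weights and restoring it with `load_state_dict`; whether that alone is enough for Ranger21 is exactly what this issue is tracking.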

lessw2020 avatar Jul 08 '21 00:07 lessw2020

Any updates on this one? :) I lost multiple GPU days of training because the runs are not resumable :/

neuronflow avatar Nov 02 '21 19:11 neuronflow

Seconding the need for this feature!

Elevory avatar Jul 04 '22 03:07 Elevory