consistency_models
Why does the training not stop?
Reduce total_training_steps.

Then why does the training not stop even when the steps are reduced?
I found the reason you're wondering about. In the run_loop function:
def run_loop(self):
    saved = False
    while (
        not self.lr_anneal_steps
        or self.step < self.lr_anneal_steps
        or self.global_step < self.total_training_steps
    ):
        batch, cond = next(self.data)
        self.run_step(batch, cond)

        saved = False
        if (
            self.global_step
            and self.save_interval != -1
            and self.global_step % self.save_interval == 0
        ):
            self.save()
            saved = True
            th.cuda.empty_cache()
            # Run for a finite amount of time in integration tests.
            if os.environ.get("DIFFUSION_TRAINING_TEST", "") and self.step > 0:
                return

        if self.global_step % self.log_interval == 0:
            logger.dumpkvs()
The condition not self.lr_anneal_steps always evaluates to True when lr_anneal_steps is left at its default value of 0, because not 0 is True in Python. Since the three clauses are joined with or, the whole while condition stays True on every iteration, no matter how large self.global_step gets, so the loop never exits.
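You can reproduce the short-circuit with plain values standing in for the trainer attributes:

lr_anneal_steps = 0           # default value
step = 0
global_step = 1_000_000       # far past the target
total_training_steps = 100

# The same three clauses as run_loop's while condition.
# `not 0` is True, so the whole or-chain is True no matter
# what global_step and total_training_steps are.
print(
    not lr_anneal_steps
    or step < lr_anneal_steps
    or global_step < total_training_steps
)  # True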
You can temporarily fix the issue by removing not self.lr_anneal_steps or self.step < self.lr_anneal_steps from the while condition, leaving only self.global_step < self.total_training_steps.
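As a minimal sketch, the patched loop header would then look like this (the rest of the body is unchanged; note that this ignores lr_anneal_steps entirely, which is why it is only a temporary fix):

def run_loop(self):
    saved = False
    # Patched condition: exit once global_step reaches total_training_steps.
    while self.global_step < self.total_training_steps:
        batch, cond = next(self.data)
        self.run_step(batch, cond)
        ...  # rest of the loop body as in the original snippet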