Diff-UNet
Stuck in an epoch
When I train this model on the VerSe2020 dataset, it gets stuck at the ninth epoch every time and then terminates with the error: RuntimeError: DataLoader worker (pid 9063) is killed by signal: killed. I have tried a higher-performance GPU and CPU and adjusted the learning rate, batch size, etc., but it still gets stuck at the ninth epoch, which is shown as taking ten hours.
Your validation set is very large, so you can validate on only a subset of it. You can also reduce the number of DDIM sampling steps from 10 to 2, which further speeds up inference.
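One way to validate on only part of the data is to pick a fixed, reproducible set of case indices and reuse it every epoch, so that validation scores stay comparable between epochs. The sketch below is only an illustration and is not taken from the Diff-UNet code: the helper `pick_validation_indices`, its parameters, and the sizes (40 cases, keep 8) are all assumptions.

```python
import random


def pick_validation_indices(num_cases, max_cases, seed=0):
    """Return a sorted, reproducible subset of validation case indices.

    Hypothetical helper: num_cases is the size of the full validation set,
    max_cases is how many cases to keep for each validation pass.
    """
    if max_cases >= num_cases:
        # Nothing to trim: use every case.
        return list(range(num_cases))
    # A fixed seed gives the same subset on every run and every epoch.
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_cases), max_cases))


# Illustrative sizes only, not the real VerSe2020 split.
indices = pick_validation_indices(num_cases=40, max_cases=8)
print(len(indices))
```

With PyTorch, the resulting list can be passed to `torch.utils.data.Subset(val_ds, indices)`, where `val_ds` stands for whatever validation dataset object the training script builds (the name is assumed here), and the validation DataLoader is then created from that subset.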