donut
Training on CPU
Hi! There is a small bug that prevents training on the CPU. The code in lightning_module.py at https://github.com/clovaai/donut/blob/master/lightning_module.py#L121 raises a ZeroDivisionError if there is no GPU on the machine, because torch.cuda.device_count() returns 0 and ends up in the denominator. I made a quick fix, but there is probably a better way:
device_count = torch.cuda.device_count() if torch.cuda.device_count() != 0 else 1
max_iter = (self.config.max_epochs * self.config.num_training_samples_per_epoch) / (
    self.config.train_batch_sizes[0] * device_count * self.config.get("num_nodes", 1)
)
This works for me :)
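In case it helps when reviewing, here is a minimal standalone sketch of the same guard that runs without torch or a GPU. The function name compute_max_iter and its parameters are hypothetical and not part of the donut codebase; device_count stands in for the value of torch.cuda.device_count(), and the numbers in the usage lines are made up for illustration.

```python
def compute_max_iter(max_epochs, samples_per_epoch, batch_size, device_count, num_nodes=1):
    """Estimate the total number of training iterations.

    Hypothetical helper: device_count is assumed to come from
    torch.cuda.device_count(), which is 0 on a CPU-only machine.
    """
    # Treat a machine without GPUs as a single device so the
    # denominator can never be zero.
    effective_devices = max(device_count, 1)
    return (max_epochs * samples_per_epoch) / (batch_size * effective_devices * num_nodes)


# CPU-only machine: device_count is 0, but the guard avoids ZeroDivisionError.
cpu_iters = compute_max_iter(max_epochs=30, samples_per_epoch=800, batch_size=8, device_count=0)

# Two GPUs on one node: the iteration count is halved.
gpu_iters = compute_max_iter(max_epochs=30, samples_per_epoch=800, batch_size=8, device_count=2)
```

Using max(device_count, 1) behaves the same as the conditional expression in my fix, it just avoids calling device_count() twice.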