Training error using multiple GPUs
Problem
Training fails when pl.Trainer is configured to use more than one GPU.
Running the example at https://github.com/qdrant/quaterion/blob/master/examples/train_cifar100.py with the devices parameter of pl.Trainer set to a value greater than 1 raises the following error:
  File "/home/stefano/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/quaterion/dataset/similarity_data_loader.py", line 271, in <listcomp>
    labels = {"groups": torch.LongTensor([record.group for record in batch])}
AttributeError: 'tuple' object has no attribute 'group'
The same error occurs with PairSimilarityDataLoader and with different training strategies (dp or ddp).
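
The message suggests that the items reaching that list comprehension are plain tuples rather than records with a group attribute. Below is a minimal sketch of that failure mode in plain Python, independent of Quaterion. Record, collect_groups and the (index, record) wrapping are hypothetical names and assumptions used only to illustrate the symptom; this is not a confirmed cause.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a dataset record that carries a group label."""
    obj: str
    group: int


def collect_groups(batch):
    # Same attribute access as in the traceback: record.group for each item.
    return [record.group for record in batch]


records = [Record("a", 0), Record("b", 1)]

# Works when the batch contains the records themselves.
groups_ok = collect_groups(records)

# Assumption for illustration: the items arrive wrapped as (index, record)
# tuples, so each element is a tuple and has no .group attribute.
wrapped = list(enumerate(records))
try:
    collect_groups(wrapped)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(groups_ok)
print(error_message)
```

If something like this is what happens, checking the type of the first element of batch at line 271 of similarity_data_loader.py in the single-GPU and multi-GPU runs should show the difference.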