Joonsun Auh

Results: 3 comments by Joonsun Auh

I used dp because ddp is not implemented in linear evaluation XD. So when I tried to use 8 GPUs, an error occurred.

Yes, I did not try to use more than 8 GPUs, but 4 GPUs are OK. My batch size is 1024. I tried a batch size of 2048 but...

Then how many times is the last layer trained in the linear evaluation step? When I run the linear evaluation, the last layer always has a training step. I can't find the...