big_vision
Memory reduction ratio with sigmoid loss
After switching from the softmax loss to the sigmoid loss, I saw no significant reduction in GPU memory usage (perhaps only a slight one). Could you share some approximate numbers so I can check whether I did something wrong? The paper does not specify the percentage reduction in memory.
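For context, here is a rough back-of-envelope sketch of the kind of number being asked about. It is not a measurement and not taken from big_vision. It assumes that the only difference between the two losses is the size of the pairwise logits matrix held per device: a full global-batch by global-batch matrix in the softmax case, and a local-batch by local-batch chunk in the sigmoid case. The function names, the float32 element size, and the batch and device counts are all hypothetical, chosen only for illustration. Activations, gradients, and optimizer state are ignored.

```python
def logits_matrix_bytes(rows, cols, bytes_per_element=4):
    """Bytes needed to hold a rows x cols logits matrix (float32 assumed by default)."""
    return rows * cols * bytes_per_element


def compare(global_batch, num_devices, bytes_per_element=4):
    """Return (full_matrix_bytes, chunk_bytes) under the assumptions stated above.

    full_matrix_bytes: global_batch x global_batch matrix (assumed softmax case).
    chunk_bytes: local_batch x local_batch chunk (assumed chunked sigmoid case).
    """
    local_batch = global_batch // num_devices
    full = logits_matrix_bytes(global_batch, global_batch, bytes_per_element)
    chunk = logits_matrix_bytes(local_batch, local_batch, bytes_per_element)
    return full, chunk


if __name__ == "__main__":
    # Hypothetical setup: global batch of 32768 spread over 64 devices.
    full, chunk = compare(global_batch=32768, num_devices=64)
    print(f"full logits matrix: {full / 2**30:.2f} GiB")
    print(f"per-device chunk:   {chunk / 2**20:.2f} MiB")
```

Under these assumptions the gap depends strongly on the global batch size, so at small batch sizes, or when the model's own activations dominate memory, the overall difference may be hard to see.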