ISDA-for-Deep-Networks
Questions about ISDA for ImageNet
Thank you for your great work!!!
Sorry to bother you, but I am confused about the implementation of ISDA for ImageNet, as shown in the following image.
Why do you only consider the diagonal elements of sigma2?
It looks like this implementation ignores the covariance between deep features.
Indeed, we ignore the covariance between deep features on ImageNet (and only on ImageNet) to save GPU memory. Take ResNet-50 as an example. If we kept the full covariance matrix for every class, we would have a 1000 * 2048 * 2048 tensor (1000 classes, 2048-dimensional features), while in the current implementation we only need to process a 1000 * 2048 tensor holding the per-class variances. We find that ISDA still significantly improves the accuracy even with this simplification.
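
To put rough numbers on the two tensor shapes mentioned above, here is a small back-of-the-envelope sketch. It assumes 32-bit floats (4 bytes per element), which the thread does not state, and the helper name `covariance_memory_bytes` is made up for illustration; it is not part of the ISDA code.

```python
def covariance_memory_bytes(num_classes, feature_dim,
                            bytes_per_element=4, diagonal_only=False):
    """Bytes needed to store one covariance estimate per class.

    bytes_per_element=4 assumes float32 storage (an assumption, not
    something confirmed in this thread).
    """
    # Full covariance: feature_dim x feature_dim entries per class.
    # Diagonal only: feature_dim variances per class.
    per_class = feature_dim if diagonal_only else feature_dim * feature_dim
    return num_classes * per_class * bytes_per_element


# ResNet-50 on ImageNet: 1000 classes, 2048-dimensional features.
full = covariance_memory_bytes(1000, 2048)
diag = covariance_memory_bytes(1000, 2048, diagonal_only=True)

print(f"full covariance: {full / 2**30:.3f} GiB")   # 15.625 GiB
print(f"diagonal only:   {diag / 2**20:.4f} MiB")   # 7.8125 MiB
print(f"ratio:           {full // diag}x")          # 2048x
```

Under that assumption, the full tensor alone would take about 15.6 GiB before any activations or gradients, while the diagonal version takes under 8 MiB, a factor of 2048 (the feature dimension).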