Decreased performance with SupCon Loss in Incremental Learning Research Framework
Hi OTX team,
Our results show decreased performance when using Supervised Contrastive (SupCon) loss. I discussed this with Intel Labs, who noted that it might be caused by two things. First, is the pretrained base model used in OTX pretrained with the SupCon loss, or is it the standard pretrained base trained with, e.g., cross-entropy? Second, the SupCon approach depends heavily on strong augmentations, but the augmentations in OTX differ from the ones used in the FRE paper. It is also unclear how the augmentations differ between the SupCon and the normal implementation in OTX, especially regarding the AugMix module.
Augmentations when using SupCon: https://github.com/openvinotoolkit/training_extensions/blob/develop/otx/algorithms/classification/configs/base/data/supcon/data_pipeline.py
Augmentations when using normal classification (cross entropy): https://github.com/openvinotoolkit/training_extensions/blob/develop/otx/algorithms/classification/configs/base/data/data_pipeline.py
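For comparison, the original SupCon recipe uses a strong two-view pipeline, roughly like the torchvision sketch below. This is not the OTX code; the crop size and normalization stats here are assumptions for an ImageNet-style setup.

```python
# Sketch of the strong, two-view augmentation recipe from the original SupCon
# paper/repo, for comparison with the OTX pipelines linked above.
# NOT the OTX implementation; size/normalization below assume ImageNet.
from torchvision import transforms


class TwoCropTransform:
    """Apply the same random transform twice to get two views of one image."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, x):
        return [self.transform(x), self.transform(x)]


supcon_transform = transforms.Compose([
    transforms.RandomResizedCrop(size=224, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

train_transform = TwoCropTransform(supcon_transform)  # yields [view1, view2]
```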
@sungchul2 , could you take a look at this?
Thanks for your interest, @Daankrol
- The latter. Unlike the original SupCon, which pre-trains the model with label information, our SupCon targets a full fine-tuning scheme that combines cross-entropy loss and contrastive loss, similar to multi-task learning (see the sketch below).
- Several experiments showed that the strong augmentations used in the original SupCon recipe weren't suitable for fine-tuning the model and that the normal OTX pipeline is sufficient. We tried to keep the task from becoming too hard by relying only on the randomness of the same pipeline applied twice.
But, like you said, some performance issues are already known and we are struggling to come up with better recipes.
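Roughly, the combined objective looks like the following sketch (single-view for brevity). The weighting, temperature, and projection-head details here are illustrative assumptions, not the exact OTX code.

```python
# Minimal sketch of the "multi-task" style loss described above: cross-entropy
# on the classifier head plus a supervised contrastive term on projected
# features. The 0.5 weighting and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F


def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss (single view per sample).

    features: (N, D) embeddings, labels: (N,) class ids.
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature                    # (N, N)
    # Mask out self-similarity on the diagonal.
    logits_mask = ~torch.eye(len(labels), dtype=torch.bool, device=sim.device)
    # Positives: other samples sharing the same label.
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask

    # log p(j | i) over all non-self candidates, numerically stabilized.
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)

    # Average log-likelihood over positives; skip anchors with no positive.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (pos_mask * log_prob).sum(dim=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()


def total_loss(logits, proj_features, labels, contrastive_weight=0.5):
    """Cross-entropy plus a weighted supervised contrastive term."""
    return F.cross_entropy(logits, labels) + contrastive_weight * supcon_loss(
        proj_features, labels
    )
```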
Thank you for your response!
Since you are aware of the performance issues with the OTX implementation and are still looking for better recipes, I would suggest using a SupCon-pretrained ImageNet base and fine-tuning with the SupCon loss, as this has been shown to give better performance than cross-entropy. This fine-tuning step only works well when the base has been pretrained with SupCon. Note that SupCon-pretrained weights are available in the original authors' repositories. Could you share your experimental results and methods?
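For concreteness, here is a rough sketch of the setup I have in mind. The checkpoint filename and the state-dict key prefix are placeholders and depend on which pretrained release is used.

```python
# Hypothetical sketch: start from a SupCon-pretrained ImageNet backbone and
# fine-tune it for the downstream task. The checkpoint path and the
# state-dict key prefix ("encoder.") below are placeholders.
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_CLASSES = 10  # downstream task classes (example value)

backbone = resnet50()
backbone.fc = nn.Identity()  # keep only the encoder; a new head is added below

state = torch.load("supcon_resnet50_imagenet.pth", map_location="cpu")
encoder_state = {
    k.removeprefix("encoder."): v
    for k, v in state.items()
    if k.startswith("encoder.")
}
missing, unexpected = backbone.load_state_dict(encoder_state, strict=False)

model = nn.Sequential(backbone, nn.Linear(2048, NUM_CLASSES))
# From here, fine-tune `model` with the SupCon (or combined CE + SupCon)
# objective, as discussed above.
```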
@Daankrol Thanks for the suggestion! We'll try it and share the results if they are notable :)