Decreased performance with SupCon Loss in Incremental Learning Research Framework
Hi OTX team,
Our results show decreased performance when using Supervised Contrastive (SupCon) loss. I discussed this with Intel Labs, who noted that it might be caused by two things. First, is the pretrained base model used in OTX pretrained with the SupCon loss, or is it the standard pretrained base trained with, e.g., cross-entropy? Second, the SupCon approach depends heavily on strong augmentations, but the augmentations in OTX differ from the ones used in the FRE paper. It is also unclear how the augmentations differ between the SupCon and the normal implementation in OTX, especially regarding the AugMix module.
Augmentations when using SupCon: https://github.com/openvinotoolkit/training_extensions/blob/develop/otx/algorithms/classification/configs/base/data/supcon/data_pipeline.py
Augmentations when using normal classification (cross entropy): https://github.com/openvinotoolkit/training_extensions/blob/develop/otx/algorithms/classification/configs/base/data/data_pipeline.py
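For comparison, the original SupCon recipe uses a strong two-view pipeline, roughly like the torchvision sketch below. This is not the OTX code; the crop size and normalization stats here are assumptions for an ImageNet-style setup.

```python
# Sketch of the strong, two-view augmentation recipe from the original SupCon
# paper/repo, for comparison with the OTX pipelines linked above.
# NOT the OTX implementation; size/normalization below assume ImageNet.
from torchvision import transforms


class TwoCropTransform:
    """Apply the same random transform twice to get two views of one image."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, x):
        return [self.transform(x), self.transform(x)]


supcon_transform = transforms.Compose([
    transforms.RandomResizedCrop(size=224, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

train_transform = TwoCropTransform(supcon_transform)  # yields [view1, view2]
```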
@sungchul2 , could you take a look at this?
Thanks for your interest, @Daankrol
- The latter. Unlike the original SupCon, which pre-trains the model with label information, our SupCon targets a full fine-tuning scheme that combines cross-entropy loss and contrastive loss, similar to multi-task learning (see the sketch below).
- Several experiments showed that the strong augmentations used in the original SupCon recipe weren't suitable for fine-tuning the model and that the normal OTX pipeline is sufficient. We tried to keep the task from becoming too hard by relying only on the randomness of the same pipeline applied twice.
But, like you said, some performance issues are already known and we are struggling to come up with better recipes.
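Roughly, the combined objective looks like the following sketch (single-view for brevity). The weighting, temperature, and projection-head details here are illustrative assumptions, not the exact OTX code.

```python
# Minimal sketch of the "multi-task" style loss described above: cross-entropy
# on the classifier head plus a supervised contrastive term on projected
# features. The 0.5 weighting and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F


def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss (single view per sample).

    features: (N, D) embeddings, labels: (N,) class ids.
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature                    # (N, N)
    # Mask out self-similarity on the diagonal.
    logits_mask = ~torch.eye(len(labels), dtype=torch.bool, device=sim.device)
    # Positives: other samples sharing the same label.
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask

    # log p(j | i) over all non-self candidates, numerically stabilized.
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)

    # Average log-likelihood over positives; skip anchors with no positive.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (pos_mask * log_prob).sum(dim=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()


def total_loss(logits, proj_features, labels, contrastive_weight=0.5):
    """Cross-entropy plus a weighted supervised contrastive term."""
    return F.cross_entropy(logits, labels) + contrastive_weight * supcon_loss(
        proj_features, labels
    )
```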
Thank you for your response!
Since you are aware of the performance issues with the OTX implementation and are still looking for better recipes, I would suggest using a SupCon-pretrained ImageNet base and fine-tuning with the SupCon loss, as this has been shown to give better performance than cross-entropy. This fine-tuning step only works well when the base has been pretrained with SupCon. Note that SupCon-pretrained weights are available in the original authors' repositories. Could you share your experimental results and methods?
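For concreteness, here is a rough sketch of the setup I have in mind. The checkpoint filename and the state-dict key prefix are placeholders and depend on which pretrained release is used.

```python
# Hypothetical sketch: start from a SupCon-pretrained ImageNet backbone and
# fine-tune it for the downstream task. The checkpoint path and the
# state-dict key prefix ("encoder.") below are placeholders.
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_CLASSES = 10  # downstream task classes (example value)

backbone = resnet50()
backbone.fc = nn.Identity()  # keep only the encoder; a new head is added below

state = torch.load("supcon_resnet50_imagenet.pth", map_location="cpu")
encoder_state = {
    k.removeprefix("encoder."): v
    for k, v in state.items()
    if k.startswith("encoder.")
}
missing, unexpected = backbone.load_state_dict(encoder_state, strict=False)

model = nn.Sequential(backbone, nn.Linear(2048, NUM_CLASSES))
# From here, fine-tune `model` with the SupCon (or combined CE + SupCon)
# objective, as discussed above.
```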
@Daankrol Thanks for the suggestion! We'll try it and share the results if they are notable :)