Some problems with the results and the implementation
Hi @BigRedT, nice work and thank you for sharing the project! I ran into three issues when retraining the model.
- Training with the negative noun loss and the language supervision loss on Flickr30K Entities (the default settings) tends to overfit, as shown in the loss and metric curves. I have no idea what causes the overfitting or how to tackle it.
- When using the BERT model, why not pass each sentence's padding mask to the model? In the current implementation, adding a different amount of padding to the same sentence results in different encoding features (see the first sketch below). https://github.com/BigRedT/info-ground/blob/22ae6d6ec8b38df473e73034fc895ebf97d39897/exp/ground/models/cap_encoder.py#L143
- The negative noun samples are generated and recorded with the pre-trained BERT model during pre-processing. During training, finetuning BERT (or learning it from scratch) means the positive and negative samples end up encoded by different models, which may hurt the contrastive learning performance (see the second sketch below).
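To make the second point concrete, here is a minimal standalone sketch of what I mean, assuming the HuggingFace `transformers` BERT API (the repo's `CapEncoder` may wrap BERT differently): with the attention mask passed in, the features of the real tokens no longer depend on how much padding was appended.

```python
# Minimal standalone sketch of the padding issue, assuming the HuggingFace
# `transformers` BERT API (the repo's CapEncoder may wrap BERT differently).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased').eval()

sentence = 'a man rides a horse'
enc_short = tokenizer(sentence, padding='max_length', max_length=10, return_tensors='pt')
enc_long = tokenizer(sentence, padding='max_length', max_length=30, return_tensors='pt')
n = int(enc_short['attention_mask'].sum())  # number of real (non-pad) tokens

with torch.no_grad():
    # Without attention_mask the [PAD] tokens take part in self-attention,
    # so the real tokens' features depend on how much padding was appended.
    unmasked_short = model(enc_short['input_ids'])[0][0, :n]
    unmasked_long = model(enc_long['input_ids'])[0][0, :n]

    # With attention_mask the pad positions are ignored and the real tokens'
    # features are essentially identical for both padded lengths.
    masked_short = model(enc_short['input_ids'],
                         attention_mask=enc_short['attention_mask'])[0][0, :n]
    masked_long = model(enc_long['input_ids'],
                        attention_mask=enc_long['attention_mask'])[0][0, :n]

print(torch.allclose(unmasked_short, unmasked_long, atol=1e-4))  # typically False
print(torch.allclose(masked_short, masked_long, atol=1e-4))      # True
```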
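For the third point, what I have in mind is something like the following hypothetical sketch (the function and tensor names are invented, this is not the repo's API): the negative-noun captions are re-encoded by the same encoder that is currently being finetuned, instead of relying on features tied to the frozen pre-trained BERT used during pre-processing, so positives and negatives stay in the same embedding space.

```python
# Hypothetical sketch, not the repo's code: encode positives and negatives
# with the *same* in-training caption encoder before computing the
# noun-contrastive (InfoNCE-style) loss. All names below are invented.
import torch
import torch.nn.functional as F

def noun_contrastive_loss(cap_encoder, pos_ids, neg_ids, pos_mask, neg_mask,
                          noun_pos, region_feat):
    """
    cap_encoder : BERT-based caption encoder being finetuned (HF-style outputs)
    pos_ids     : (B, L)    token ids of the original captions
    neg_ids     : (B, K, L) token ids with the target noun substituted
    pos_mask    : (B, L)    attention mask for the positives
    neg_mask    : (B, K, L) attention masks for the negatives
    noun_pos    : (B,)      index of the target noun token in each caption
    region_feat : (B, D)    attended image-region feature for that noun
                  (assumes caption and region features share dimension D)
    """
    B, K, L = neg_ids.shape
    D = region_feat.size(-1)

    # Both positives and negatives go through the current encoder weights,
    # so their features live in the same (moving) embedding space.
    pos_out = cap_encoder(pos_ids, attention_mask=pos_mask)[0]              # (B, L, D)
    neg_out = cap_encoder(neg_ids.view(B * K, L),
                          attention_mask=neg_mask.view(B * K, L))[0]
    neg_out = neg_out.view(B, K, L, D)

    # Gather the feature at the noun position for each candidate caption.
    pos_noun = pos_out[torch.arange(B), noun_pos]                           # (B, D)
    neg_noun = neg_out[torch.arange(B)[:, None],
                       torch.arange(K)[None, :],
                       noun_pos[:, None]]                                   # (B, K, D)

    # InfoNCE: the true noun should score highest against its image region.
    pos_score = (region_feat * pos_noun).sum(-1, keepdim=True)              # (B, 1)
    neg_score = torch.einsum('bd,bkd->bk', region_feat, neg_noun)           # (B, K)
    logits = torch.cat([pos_score, neg_score], dim=1)
    target = torch.zeros(B, dtype=torch.long, device=logits.device)         # index 0 = positive
    return F.cross_entropy(logits, target)
```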