bsldict
Implementation of MIL-NCE loss
Hi, thanks for your interesting work. I found the implementation of the MIL-NCE loss somewhat different from what is described in your paper.
- The dictionary videos contain many redundant entries, since a class label can appear in multiple videos and be collected in multiple batches. I notice that all pairs of BSL-1K and dictionary features sharing the same class label are sampled as positive pairs, even when they belong to two different batches, so some positive pairs can be counted multiple times in the numerator. https://github.com/gulvarol/bsldict/blob/eea308a9ec13fade2659125d5f7fb9b26ce86577/loss/loss.py#L149-L154
- Pairs from different batches that share the same labels are excluded from the denominator yet included in the numerator. https://github.com/gulvarol/bsldict/blob/eea308a9ec13fade2659125d5f7fb9b26ce86577/loss/loss.py#L169-L172
- Finally, for each batch, the log ratio is computed by iterating over the duplicated dictionary entries (num_unique_dicts). According to the paper, it seems more reasonable to iterate over the BSL-1K videos. https://github.com/gulvarol/bsldict/blob/eea308a9ec13fade2659125d5f7fb9b26ce86577/loss/loss.py#L177-L179
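To make my reading of the paper concrete, here is a minimal PyTorch sketch of the MIL-NCE loss as I understand it: one log ratio per BSL-1K anchor, with all same-label dictionary entries in the numerator and every dictionary entry (positives and negatives) in the denominator. The function name, argument names, and temperature value are my own assumptions, not the repo's API:

```python
import torch

def mil_nce_sketch(video_feats, dict_feats, labels_v, labels_d, temperature=0.07):
    """Hypothetical MIL-NCE, iterating over BSL-1K videos as anchors.

    video_feats: (B, D) L2-normalised BSL-1K features
    dict_feats:  (M, D) L2-normalised dictionary features
    labels_v:    (B,) class labels of the BSL-1K videos
    labels_d:    (M,) class labels of the dictionary videos
    """
    # Similarity logits between every BSL-1K video and every dictionary entry.
    logits = video_feats @ dict_feats.t() / temperature            # (B, M)
    # pos[i, j] is True when dictionary entry j shares anchor i's label.
    pos = labels_v.unsqueeze(1) == labels_d.unsqueeze(0)           # (B, M)
    exp_logits = logits.exp()
    numerator = (exp_logits * pos).sum(dim=1)                      # positives only
    denominator = exp_logits.sum(dim=1)                            # positives + negatives
    # One log ratio per BSL-1K anchor, averaged over the batch.
    return -(numerator / denominator).log().mean()
```

Under this formulation each anchor contributes exactly once, and duplicated dictionary entries only change the weighting inside the numerator/denominator sums rather than adding extra log-ratio terms.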
Moreover, the distinction between using a BSL-1K video or a dictionary video as the anchor for contrastive sampling is not reflected in the code. Is it just a concept for better explaining the construction of positive/negative pairs?
Thanks a lot for your help~