TemporalAlignNet
negative pairs
Hello, thank you for the excellent work. I have a few concerns about the first stage of training with the contrastive loss function. I notice that this work draws the negative pairs from within the same video, which differs from the conventional approach of using the other samples in the batch as negatives. Am I understanding this correctly? If so, have you evaluated the performance after only the pretraining stage? Thank you in advance!
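To make the distinction I am asking about concrete, here is a minimal NumPy sketch of the two ways of choosing negatives. It is my own illustration, not code from this repository: the function names, the tensor shapes, the mean pooling, and the temperature of 0.07 are all assumptions, and the actual loss in the paper may differ.

```python
import numpy as np

def l2_normalize(x):
    # Scale vectors to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce(sim, temperature=0.07):
    # Row-wise InfoNCE: the diagonal entry of each row is the positive,
    # every other column in that row acts as a negative.
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Hypothetical sizes: B videos, T aligned clip/sentence pairs each, D dims.
rng = np.random.default_rng(0)
B, T, D = 4, 6, 16
visual = l2_normalize(rng.normal(size=(B, T, D)))
text = l2_normalize(rng.normal(size=(B, T, D)))

# (a) Conventional in-batch negatives: one pooled embedding per video,
#     negatives are the other videos in the batch (B x B similarity matrix).
v_pooled = l2_normalize(visual.mean(axis=1))
t_pooled = l2_normalize(text.mean(axis=1))
loss_in_batch = info_nce(v_pooled @ t_pooled.T)

# (b) Intra-video negatives (my reading of this work): for each video,
#     negatives are the other clips/sentences of that same video (T x T).
loss_intra_video = np.mean([info_nce(visual[b] @ text[b].T) for b in range(B)])

print("in-batch loss:", loss_in_batch)
print("intra-video loss:", loss_intra_video)
```

In (a) the model only has to tell different videos apart, while in (b) it has to tell apart moments of the same video, which is why I am curious how the first stage performs on its own.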