Md Mohaiminul Islam

6 comments by Md Mohaiminul Islam

Hi, thanks for reaching out. We used the durations from the [Condensed Movies](https://www.robots.ox.ac.uk/~vgg/research/condensed-movies/) dataset. They removed the outro/preview from each video, as described in Section 3.1 of their [paper](https://arxiv.org/pdf/2005.04208.pdf). Therefore,...

Hi, I think you are right. You need to remove the outro first; we did that as well. You can use the durations in 'lvu_durations.csv' to do this.
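For anyone trying this, here is a minimal sketch of how the trimming could be done. It is not the exact script we used. The column names `video_id` and `duration` are assumptions, so check the actual header of 'lvu_durations.csv' first, and the file names are placeholders. It builds an `ffmpeg` command that keeps only the first `duration` seconds of each video, which drops the outro.

```python
import csv
import io


def load_durations(csv_file):
    """Map each video id to its duration in seconds.

    The column names 'video_id' and 'duration' are assumed here;
    adjust them to match the real header of lvu_durations.csv.
    """
    reader = csv.DictReader(csv_file)
    return {row["video_id"]: float(row["duration"]) for row in reader}


def build_trim_command(src, dst, duration):
    """Build an ffmpeg command that keeps the first `duration` seconds of src."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-t", f"{duration:.3f}",  # stop writing output after this many seconds
        "-c", "copy",             # stream copy, no re-encoding
        dst,
    ]


# Small in-memory example standing in for the real CSV file.
sample = io.StringIO("video_id,duration\nabc123,151.5\n")
durations = load_durations(sample)
cmd = build_trim_command("abc123.mp4", "abc123_trimmed.mp4", durations["abc123"])
print(" ".join(cmd))
```

To actually run the command, pass `cmd` to `subprocess.run(cmd, check=True)` with `ffmpeg` installed. If you would rather re-encode than stream copy, remove the `-c copy` pair.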

Which task did you try, and what performance are you getting? Also, how did you solve the 'NaN' issue? Could you please reply on the other issue so that...

Hi @nahidalam, thanks for your comment. I will try to publish some pretrained model weights. Do you want the pretrained model for any particular dataset? For the LVU dataset, there...

Yes, ViS4mer is a video understanding model. However, technically you can use it for image modeling too. Anyway, I will try to release the pretrained weights for the scene/place, relationship,...

Hi, that's a very good observation. I guess it was not intended. You can try changing the mode of ReduceLROnPlateau to 'max'. Thanks!
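To illustrate the suggestion, here is a small standalone sketch, not the repository's training code. With `mode="max"`, the scheduler treats a larger monitored value as an improvement, which is what you want when stepping on accuracy rather than loss. The model, the `factor` and `patience` values, and the accuracy numbers are all made up for the example.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Toy model and optimizer; placeholders for the real ones.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# mode="max": reduce the learning rate when the metric stops increasing.
scheduler = ReduceLROnPlateau(optimizer, mode="max", factor=0.5, patience=1)

# Made-up validation accuracies: improvement, then a plateau.
val_accuracies = [0.60, 0.70, 0.70, 0.70]
lrs = []
for acc in val_accuracies:
    # (training for one epoch would happen here)
    scheduler.step(acc)
    lrs.append(optimizer.param_groups[0]["lr"])

print(lrs)
```

With the default `mode="min"`, a rising accuracy would be counted as a bad epoch, so the learning rate would drop while the model is still improving.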