Results 8 ViS4mer issues

Hi! After reading your paper, I wonder how to use a Transformer encoder to encode each video frame in parallel?

Hi, I have run this code on the LVU dataset, but the output is NaN during training. Could you please provide pretrained checkpoints for the speaking task of LVU? Thank you very...

Hi authors, how are the durations in lvu_durations.csv computed? The last 20s of most videos show previews for other videos. Does lvu_durations.csv show the number of seconds in the video...

Hi authors, I'm getting NaNs in the training loss in the very first epoch. I've tried 3 different seeds on the relationship task, and it resulted in NaNs each time. Is...

`ReduceLROnPlateau` by default assumes a "min" metric (https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html): "mode ([str](https://docs.python.org/3/library/stdtypes.html#str)) – One of min, max. In min mode, lr will be reduced when the quantity monitored has stopped decreasing; in...
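
To illustrate the concern in this issue, here is a minimal pure-Python sketch of a plateau rule. It is a simplified imitation, not PyTorch's implementation: it ignores the threshold, cooldown, and minimum learning rate options, and the function name `count_lr_reductions` and the accuracy values are hypothetical. It only shows what could happen when a metric that should increase (such as accuracy) is monitored in "min" mode.

```python
def count_lr_reductions(metric_history, mode="min", patience=2):
    """Count how often a simplified plateau rule would cut the learning rate.

    A reduction fires once the metric has failed to improve for more than
    `patience` consecutive epochs. This is an illustrative stand-in, not the
    PyTorch scheduler itself.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    best = None
    bad_epochs = 0
    reductions = 0
    for value in metric_history:
        if best is None:
            improved = True
        elif mode == "min":
            improved = value < best   # lower is better
        else:
            improved = value > best   # higher is better
        if improved:
            best = value
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            reductions += 1
            bad_epochs = 0
    return reductions


# Hypothetical validation accuracies that improve every epoch.
accuracies = [0.30, 0.35, 0.41, 0.46, 0.50, 0.53, 0.55]

# In "min" mode, rising accuracy looks like a plateau and triggers reductions.
print(count_lr_reductions(accuracies, mode="min"))
# In "max" mode, steady improvement triggers none.
print(count_lr_reductions(accuracies, mode="max"))
```

In PyTorch itself, the corresponding choice is the `mode` argument of `ReduceLROnPlateau`: passing `mode="max"` suits a monitored accuracy, while the default `mode="min"` suits a monitored loss.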

Hi, I tried to train a model on the LVU dataset, but got an accuracy of about 0.25 and the loss is NaN. I want to know what else I should do. Thanks.

Hi, will you publish any pre-trained models? Preferably on Torch Hub? I was thinking of using ViS4mer for extracting image embeddings.

Hello, thank you for your excellent work. When I tried to download the dataset from the official LVU link, I found that they did not provide the raw videos, and...