RMN
IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning
Hi, tgc! I tried using Torch's fasterrcnn_resnet50_fpn pre-trained model to extract the region_features of the video, but found that the features I extracted only had shape [823, 4], which is far...
Hello, may I ask what method you used to extract features and regional features from videos? Thank you.
When I run sample.py, line 102, net.load_state_dict(torch.load(opt.model_pth_path)), fails with: RuntimeError: Error(s) in loading state_dict for CapModel: Unexpected key(s) in state_dict: "decoder.module_selection.loc_fc.weight", "decoder.module_selection.loc_fc.bias", "decoder.module_selection.rel_fc.weight", "decoder.module_selection.rel_fc.bias", "decoder.module_selection.func_fc.weight", "decoder.module_selection.func_fc.bias", "decoder.module_selection.module_attn.wh.weight", "decoder.module_selection.module_attn.wh.bias", "decoder.module_selection.module_attn.wv.weight", "decoder.module_selection.module_attn.wv.bias", "decoder.module_selection.module_attn.wa.weight"....
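An "Unexpected key(s)" error means the checkpoint holds parameters that the freshly built model does not define. A rough way to narrow this down is to compare the two sets of key names. The sketch below does that in plain Python; the helper `diff_state_dict_keys` and the `decoder.word_fc.*` names are hypothetical, chosen only for illustration. In a real run the lists would presumably come from `net.state_dict().keys()` and `torch.load(opt.model_pth_path).keys()`, assuming the checkpoint file stores a plain state dict.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists.

    missing:    keys the model defines but the checkpoint lacks
    unexpected: keys the checkpoint holds but the model does not define
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical model keys; only the module_selection names come from the error above.
model_keys = ["decoder.word_fc.weight", "decoder.word_fc.bias"]
checkpoint_keys = model_keys + [
    "decoder.module_selection.loc_fc.weight",
    "decoder.module_selection.loc_fc.bias",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If every unexpected key sits under one submodule, as with `decoder.module_selection` here, the model was probably constructed without that submodule, so the options used to build the model are worth checking against the ones used when the checkpoint was trained.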
Which directory is this msr-vtt_model.pth in?
(rmn) E:\video_caption\rmn\RMN-master>python evaluate.py --dataset=msvd --model=RMN --result_dir=results/msvd_model --attention=gumbel --use_loc --use_rel --use_func --hidden_size=512 --att_size=512 --test_batch_size=2 --beam_size= 2 --eval_metric=CIDEr 335it [01:21, 4.13it/s] init COCO-EVAL scorer tokenization... Traceback (most recent call last): File "evaluate.py",...
How can I apply the code to my own dataset? Could you please provide the code for feature extraction?