Shruti Palaskar
Hi, when translate.lua is supplied with the gold targets, it reads the target file line by line rather than by feature ID, so gold and pred don’t match, and gold has...
Hello, I am using pre-trained VL-T5 to generate captions for Flickr30K images off-the-shelf, i.e., without any fine-tuning. I modified the captioning scripts to predict directly. I observe very short captions...
Hi Jaemin, thanks for the very interesting paper and for releasing your codebase! I have been working with your codebase for a different multimodal text generation task and observe lower performance...