Fine-tuning video captions

Open · dvirla opened this issue 2 years ago · 1 comment

Thanks for your great work! I'm trying to modify the image training code for video captioning fine-tuning, but some things are not quite clear to me, such as how to use the "answer" parameter in the MPLUG model. Could you please release a training framework for this task?

I'm using vatex_video_caps_dataset class to load my dataset.

Thanks!

Mar 19 '23 14:03 dvirla

I think I've figured it out: I modified the dataset and the train call to pass the real caption as the "answer". Is that the right way? If so, I can create a pull request for you to add this.
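
A minimal sketch of the idea, not the repository's actual code: `CaptionSample`, `build_batch`, and `train_step` are hypothetical names, and the model call signature `(video, question, answer=..., train=True)` is an assumption based on the "answer" parameter mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence


@dataclass
class CaptionSample:
    """One training example (hypothetical structure)."""
    video: Any      # stand-in for the decoded frames or a video tensor
    caption: str    # ground-truth caption for this video


def build_batch(samples: Sequence[CaptionSample], prompt: str = "") -> Dict[str, List[Any]]:
    """Collate samples so the ground-truth caption becomes the "answer"."""
    return {
        "video": [s.video for s in samples],
        # Captioning has no real question; a fixed (possibly empty) prompt is assumed.
        "question": [prompt] * len(samples),
        # The real caption is what the decoder is trained to generate.
        "answer": [s.caption.strip() for s in samples],
    }


def train_step(model: Callable[..., Any], batch: Dict[str, List[Any]]) -> Any:
    """Forward one batch, passing the caption as `answer` (assumed signature)."""
    return model(batch["video"], batch["question"], answer=batch["answer"], train=True)
```

In a real training loop the strings would still need to be tokenized before the model call, and the returned value would be the loss to backpropagate.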

Mar 19 '23 17:03 dvirla