prismer
How do we perform image captioning on a custom dataset?
It would be great if you could provide guidance on training the Prismer image captioning model on a custom dataset.
Thanks
Hi @ChandanVerma, thanks for reaching out. Fine-tuning on a custom dataset is straightforward. You just need to: 1) prepare your .json data list file, similar to what I did for COCO; 2) set up the training config here: https://github.com/NVlabs/prismer/blob/main/configs/caption.yaml. That's pretty much it.
I would suggest first getting it to work on COCO with my provided script, so that you become familiar with the codebase. After that, everything else should be fairly straightforward.
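For step 1, a minimal sketch of writing a data list file might look like the following. The record keys `image` and `caption`, the helper name `build_caption_list`, and the example paths are all assumptions for illustration, not the confirmed Prismer schema. Compare against the COCO list file used by the repo and adjust the keys to match before training.

```python
import json
from pathlib import Path


def build_caption_list(pairs, out_path):
    """Write (image_path, caption) pairs to a JSON list file.

    NOTE: the "image" and "caption" keys are assumed, not taken from the
    Prismer codebase. Check the COCO list file in the repo for the real keys.
    """
    records = [
        {"image": str(image), "caption": caption.strip()}
        for image, caption in pairs
    ]
    out_path = Path(out_path)
    # Create the output directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return records


if __name__ == "__main__":
    # Hypothetical image paths and captions, for illustration only.
    example_pairs = [
        ("images/0001.jpg", "A dog running on the beach."),
        ("images/0002.jpg", "Two people riding bicycles."),
    ]
    build_caption_list(example_pairs, "my_dataset/train.json")
```

The resulting file is a flat JSON list with one record per image-caption pair. An image with several captions would simply appear in several records.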
Sure @lorenmt, thanks for the heads-up. I will definitely try it and let you know if I face any issues.
Please open a new issue if you have more concrete questions.