
How do we perform image captioning on a custom dataset?

Open ChandanVerma opened this issue 2 years ago • 2 comments

It would be great if you could provide guidance on training the Prismer image captioning model on a custom dataset.

Thanks

ChandanVerma, Mar 07 '23 13:03

Hi @ChandanVerma, thanks for reaching out. Fine-tuning on a custom dataset is very straightforward. You just need to 1) prepare your .json data list file, similar to what I did for COCO, and 2) set up the training config here: https://github.com/NVlabs/prismer/blob/main/configs/caption.yaml. That's pretty much it.

I would suggest first getting it to work on COCO with my provided script, to get familiar with the codebase. After that, everything else should be pretty straightforward.
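As a rough illustration of step 1, here is a minimal sketch of writing a .json data list from your own image/caption pairs. The record layout (a flat list of objects with `image` and `caption` keys), the helper names, and the file paths are all assumptions made for this example, not something confirmed in this thread; compare against the COCO list file used by the repo and adjust the keys to match it before training.

```python
import json
from pathlib import Path


def build_data_list(pairs):
    """Turn (image_path, caption) pairs into JSON-serialisable records.

    NOTE: the "image"/"caption" keys are an assumed schema for illustration;
    match them to the COCO data list the codebase actually loads.
    """
    records = []
    for image_path, caption in pairs:
        caption = caption.strip()
        if not caption:
            continue  # skip entries that have no usable caption
        records.append({"image": str(image_path), "caption": caption})
    return records


def write_data_list(records, out_path):
    """Write the records to out_path as UTF-8 JSON and return the path."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return out_path


if __name__ == "__main__":
    # Hypothetical file names and captions, for demonstration only.
    pairs = [
        ("images/0001.jpg", "a dog running on the beach"),
        ("images/0002.jpg", "two people riding bicycles"),
    ]
    write_data_list(build_data_list(pairs), "my_dataset/train.json")
```

The resulting file path would then be referenced from your copy of the caption config, in whichever field the COCO setup uses for its data list.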

lorenmt, Mar 07 '23 14:03

Sure @lorenmt, thanks for the heads-up. I will definitely try it and let you know if I face any issues.

ChandanVerma, Mar 07 '23 14:03

Please open a new issue if you have more concrete questions.

lorenmt, Mar 11 '23 15:03