Groma
Finetuning and dataset formatting guidelines
Very cool work, and congrats on what you have accomplished. I wanted to know if you have plans to release a finetuning guide, including how to format datasets.
Hi there, thanks for your interest in our work. Here are some tips you may follow to finetune the model on customized datasets:
- Format your data. There are various dataset templates under groma/data/datasets. For example, you can refer to refcoco_rec.py to format REC data, visual_genome.py for region captioning, llava.py for conversation, and so on. BTW, don't forget to register the new dataset in groma/data/build.py.
- Download the pretrained checkpoint groma-7b-pretrain.
- Configure groma/data/configs/vl_finetune.py and scripts/vl_finetune.sh, then run bash scripts/vl_finetune.sh {path_to_groma_7b_pretrain_ckpt} {output_dir}.
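For anyone who wants to script the last step, here is a minimal sketch of a launcher around the command above. It assumes only what the tips state: that scripts/vl_finetune.sh takes the pretrained checkpoint path and the output directory as its two positional arguments. The helper names build_finetune_command and run_finetune are hypothetical and not part of the Groma repo. The pre-flight checks and the creation of the output directory are my own additions, not something the script is known to require.

```python
import subprocess
from pathlib import Path

# Script path taken from the tips above, relative to the repo root.
FINETUNE_SCRIPT = "scripts/vl_finetune.sh"


def build_finetune_command(pretrain_ckpt, output_dir, script=FINETUNE_SCRIPT):
    """Return the argv list: bash <script> <pretrain_ckpt> <output_dir>."""
    return ["bash", str(script), str(pretrain_ckpt), str(output_dir)]


def run_finetune(pretrain_ckpt, output_dir, repo_root="."):
    """Hypothetical helper: check the inputs, then launch the finetuning script."""
    repo_root = Path(repo_root).resolve()
    ckpt = Path(pretrain_ckpt)
    if not ckpt.exists():
        raise FileNotFoundError(f"pretrained checkpoint not found: {ckpt}")
    if not (repo_root / FINETUNE_SCRIPT).is_file():
        raise FileNotFoundError(f"{FINETUNE_SCRIPT} not found under {repo_root}")

    # Use absolute paths so they stay valid when the script runs from repo_root.
    out = Path(output_dir).resolve()
    # Assumption: creating the output directory up front is harmless.
    out.mkdir(parents=True, exist_ok=True)

    cmd = build_finetune_command(ckpt.resolve(), out)
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run(cmd, cwd=repo_root, check=True)
```

You would call it as run_finetune("path/to/groma-7b-pretrain", "path/to/output", repo_root="path/to/Groma") after editing groma/data/configs/vl_finetune.py and scripts/vl_finetune.sh as described; the paths here are placeholders.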
Thank you for your response. If I have some time, I may update the documentation with a fine-tuning section, though someone else may be able to get to it sooner than me.