Training data size of OFA-CN-MUGE
It is fine to just use the pretrained model, as it is pretrained on a large number of image-text pairs. See [https://github.com/OFA-Sys/OFA/blob/main/checkpoints_cn.md](https://github.com/OFA-Sys/OFA/blob/main/checkpoints_cn.md). To achieve better results, finetuning on domain-specific data is recommended. For now we only provide one caption model finetuned on the MUGE caption data, which is collected from the e-commerce domain.
Originally posted by @JustinLin610 in https://github.com/OFA-Sys/OFA/issues/227#issuecomment-1236575608
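If you go the "just use the pretrained model" route, a quick sanity check is to inspect the downloaded checkpoint before wiring it into the repo's scripts. This is only a minimal sketch: the file name below is hypothetical (substitute whatever you downloaded from checkpoints_cn.md), and it assumes a standard fairseq-style `.pt` file whose weights live under a `"model"` key.

```python
# Sketch: peek inside a downloaded OFA-CN checkpoint.
# Assumption: the file name is hypothetical and the checkpoint is a
# fairseq-style .pt with the weights stored under "model".
import torch

ckpt_path = "caption_cn_large.pt"  # hypothetical file name
ckpt = torch.load(ckpt_path, map_location="cpu")

# Fall back to the raw dict if there is no "model" entry.
state_dict = ckpt.get("model", ckpt)
print(f"{len(state_dict)} parameter tensors")
total_params = sum(p.numel() for p in state_dict.values())
print(f"~{total_params / 1e6:.1f}M parameters")
```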
How much training data did you use to finetune OFA-CN-MUGE? The 50,000 images in ECommerce-IC.zip from https://tianchi.aliyun.com/dataset/107332?
Yes, for MUGE we only finetune with the official training data.
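If you want to confirm the size of the official training split yourself after downloading ECommerce-IC.zip from the Tianchi page above, something like the sketch below works. The member file name and the one-example-per-line layout are assumptions; adjust them to whatever the archive actually contains.

```python
# Sketch: count training examples in the MUGE ECommerce-IC archive.
# Assumptions: the member path inside the zip is hypothetical, and each
# line of the training file is one image-caption example.
import zipfile

ARCHIVE = "ECommerce-IC.zip"
TRAIN_MEMBER = "ECommerce-IC/IC_train.txt"  # hypothetical path inside the zip

with zipfile.ZipFile(ARCHIVE) as zf:
    print("archive members:", zf.namelist()[:10])  # check the real file names first
    with zf.open(TRAIN_MEMBER) as f:
        n_examples = sum(1 for _ in f)

print(f"{n_examples} training examples")
```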