recognize-anything Some questions about fine-tuning recognize-anything model

Open • weijiafs opened this issue 10 months ago • 1 comment

Hello

I want to fine-tune the recognize-anything model to label images with tags for real people or cartoon characters. I have two questions:

Would fine-tuning just RAM++ be enough, or do I also need to work on the text2tag part?
Also, I'm not sure how to go about the following step. Could you please explain it in detail?

Prepare pretrained Swin-Transformer, and set 'ckpt' in ram/configs/swin.

Thanks.

Apr 16 '24 09:04 weijiafs

You can find some answers here: https://github.com/xinyu1205/recognize-anything/issues/173

I don't think you need the "Prepare pretrained Swin-Transformer" step. You just need to fine-tune the model. No need for steps 1 to 5.

I'm also trying to train the model. It is not an easy task!

Apr 17 '24 09:04 adbmdp