recognize-anything
Some questions about fine-tuning recognize-anything model
Hello
I want to fine-tune the recognize-anything model to label images with tags for real people or cartoon characters. I have two questions:
- Would fine-tuning just RAM++ be enough, or do I also need to work on the text2tag part?
- I'm not sure how to carry out the following step. Could you please provide a detailed explanation?
  "Prepare pretained Swin-Transformer, and set 'ckpt' in ram/configs/swin."
Thanks.
You can find some answers here: https://github.com/xinyu1205/recognize-anything/issues/173
I don't think you need the "Prepare pretained Swin-Transformer" step. You only need to fine-tune the model, so steps 1 to 5 can be skipped.
I'm also trying to train the model. It is not an easy task!
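For anyone who does want to follow the quoted step, here is a minimal sketch of what "set 'ckpt'" could look like in practice. It assumes the files under ram/configs/swin are JSON and contain a top-level 'ckpt' key that holds the path to the Swin-Transformer weights. I have not verified this against the repository, and the helper name `set_ckpt` is made up for illustration.

```python
import json
from pathlib import Path


def set_ckpt(config_path, ckpt_path):
    """Point the 'ckpt' entry of a JSON config at a local checkpoint file.

    Assumption: the config is a JSON object with a top-level 'ckpt' key.
    All other keys are left untouched.
    """
    config_file = Path(config_path)
    config = json.loads(config_file.read_text())
    config["ckpt"] = str(ckpt_path)  # overwrite (or create) the 'ckpt' entry
    config_file.write_text(json.dumps(config, indent=2))
    return config
```

You would call it as `set_ckpt(<path to a config under ram/configs/swin>, <path to your downloaded Swin weights>)`. Both paths depend on your setup, so I have left them as placeholders. Editing the file by hand works just as well.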