Does the model support fine-tuning?
As the title says:
if someone wants to fine-tune the model, what's the process?
OmniParser uses Florence-2 (or blip2-opt-2.7b) for captioning details and a custom-trained YOLO model for icon detection. You can fine-tune both models as you like. For example, I just trained the YOLO model on my own custom dataset, but be aware that there is only one class embedded, and that's the icon label.
You can then replace the YOLO *.pt model in the weights/icon_detect folder with your own fine-tuned one. For training, just use the provided train_args.yaml and override settings like device and batch.
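A minimal sketch of what that could look like with the Ultralytics API. The checkpoint filename, the dataset YAML name, and the specific override values are assumptions, not taken from the repo:

```python
# Hypothetical sketch: fine-tune the icon-detection YOLO model with
# Ultralytics, layering a few per-run overrides on top of the provided
# train_args.yaml. File names and values below are placeholders.

def train_overrides(data="my_icons.yaml", device=0, batch=16, epochs=50):
    """Per-run settings applied over train_args.yaml. Your dataset YAML
    must declare the single class the model knows: the icon label."""
    return {"data": data, "device": device, "batch": batch, "epochs": epochs}

if __name__ == "__main__":
    from ultralytics import YOLO  # pip install ultralytics

    # start from the shipped icon-detect checkpoint (filename is an assumption)
    model = YOLO("weights/icon_detect/model.pt")

    # cfg= points at the provided training config; keyword args override it
    model.train(cfg="weights/icon_detect/train_args.yaml", **train_overrides())

    # afterwards, copy runs/detect/train/weights/best.pt over the original
    # weights/icon_detect/*.pt so OmniParser picks up your fine-tuned model
```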
The same goes for the Florence-2 model: fine-tune it and replace the *.safetensors in weights/icon_caption_florence.
I can provide some code if needed.
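For the Florence-2 side, a rough sketch with Hugging Face transformers, assuming the public microsoft/Florence-2-base checkpoint as the starting point; the dataloader, prompt, and learning rate are placeholders, and the training loop is deliberately bare-bones:

```python
# Hypothetical sketch (not OmniParser's own training code): fine-tune a
# Florence-2 captioner and save *.safetensors files that can replace the
# ones in weights/icon_caption_florence.

def save_dir():
    """Folder whose *.safetensors OmniParser loads for icon captioning."""
    return "weights/icon_caption_florence"

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    model_id = "microsoft/Florence-2-base"  # assumed base checkpoint
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)
    model.train()
    for image, caption in my_dataloader:  # placeholder: (icon crop, label) pairs
        inputs = processor(text="<CAPTION>", images=image, return_tensors="pt")
        labels = processor.tokenizer(caption, return_tensors="pt").input_ids
        loss = model(input_ids=inputs["input_ids"],
                     pixel_values=inputs["pixel_values"],
                     labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # recent transformers versions write *.safetensors by default
    model.save_pretrained(save_dir(), safe_serialization=True)
```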
Hi Tony, thanks very much.
If you can provide an example, that would be great. Thank you!