segment-anything
Finetuning
Are there any plans to release scripts for fine-tuning the model?
Also, you did such great work! Thank you very much!
Information on fine tuning would be great.
+1, I'd love to be able to fine-tune to improve performance on extremely difficult tiny-object tasks, for example segmenting vehicles in geospatial images.
This thread is referenced as the answer to similar questions, but I don't think there is an answer here for transfer learning?
Look forward to finetuning
I would love to be able to fine tune the model for specific datasets as well.
Do we wait for Meta to provide a training/fine-tuning script? Or should the open source hivemind write it?
Has anyone tried the idea of what may be called "point prompt engineering"? i.e. training a separate model that learns how to put positive prompt points and negative prompt points, such that these points prompt SAM to select target objects from a custom dataset.
Or we can just summarize strategies and best practices in terms of placing positive and negative prompt points/prompt boxes, similar to how GPT/DALLE users summarize the best ways to write prompts.
Wonder if this could be one way to fine-tune the SAM model when only a limited amount of annotations are available. Happy to discuss more if anyone wants to work together and try it out.
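As a starting point for the "point prompt engineering" idea above, here is a minimal sketch of the simplest baseline: sampling positive points inside a ground-truth mask and negative points outside it. It is not a learned point-placement model, and `sample_point_prompts` is a hypothetical helper, not part of the segment_anything package. The only assumption about SAM is that `SamPredictor.predict` takes point coordinates as (x, y) pixels with labels 1 (foreground) and 0 (background).

```python
import random


def sample_point_prompts(mask, n_pos=2, n_neg=2, seed=0):
    """Sample prompt points from a binary mask given as rows of 0/1.

    Hypothetical helper: returns (point_coords, point_labels), where
    coords are [x, y] pixels and labels are 1 (inside) or 0 (outside).
    """
    rng = random.Random(seed)
    inside, outside = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            (inside if value else outside).append((x, y))
    # Never ask for more points than the region contains.
    pos = rng.sample(inside, min(n_pos, len(inside)))
    neg = rng.sample(outside, min(n_neg, len(outside)))
    point_coords = [list(p) for p in pos + neg]
    point_labels = [1] * len(pos) + [0] * len(neg)
    return point_coords, point_labels


# Toy 4x4 mask with a 2x2 object in the middle.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
coords, labels = sample_point_prompts(mask)
```

To pass these to the predictor you would convert both lists to NumPy arrays first. Comparing the resulting masks against the annotations for different values of `n_pos` and `n_neg` is one cheap way to collect the "best practices" mentioned above before training anything.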
+1, looking forward to fine-tuning the SAM model on a custom dataset. :)
After some messing around I have gotten preliminary fine-tuning to work on my fork. The code is still super messy and early, but perhaps people can find use in it. The biggest thing I figured out is that you have to break up the Sam model into its components in order for there to be a gradient path for fine-tuning.
Could you please recommend the minimum hardware configuration for fine-tuning SAM? E.g., 4090 x 4?
I can get the smallest pre-trained model (vit_b) with a batch size of 1 in <5GB of GPU memory, but I think fine-tuning with those settings would take forever.
I have access to 4 x A100 w/ 80 GB if you want me to test something.
hi @hu-po ,
Thank you very much for sharing the fine-tuning code. Would it be possible for you to give guidance on how to prepare a custom dataset (e.g., data format and folder structure)?
Thank me when I get it to work 😭 this is more complicated than anticipated.
+1, interested in fine-tuning it for coral reef images.
+1 interested in fine-tuning it for cracking on roads.
+1 🙌
+1 interested in fine-tuning!
+1, I'd like to do some vehicle detection on low quality images!
+1 interested in fine-tuning the prompt encoder or mask decoder!
+1! I would be interested in fine-tuning the model for medical image analysis
I'm curious whether it is possible to point out an unknown object that has not been learned (like anomaly detection) via a text prompt if I fine-tune with custom data.
+1!
CC: @ericmintun @nikhilaravi
@hu-po hi, nice work sharing the fine-tuning script. Is "FragmentDataset" one of the datasets released officially at https://segment-anything.com/dataset/index.html?
No, it's a custom dataset for x-ray data of scroll fragments: https://www.kaggle.com/competitions/vesuvius-challenge-ink-detection/data
I have fine-tuning starter code for data in COCO instance segmentation format, with some basic functionality, at this repo. Hope it helps!
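For anyone wondering what COCO-format data looks like as SAM prompts, here is a rough sketch. It is not taken from the repo mentioned above; `coco_boxes_as_prompts` is a hypothetical helper. It relies on two things: COCO stores boxes as [x, y, width, height], and SAM box prompts are given as [x0, y0, x1, y1] corner coordinates.

```python
from collections import defaultdict


def coco_boxes_as_prompts(coco):
    """Group COCO annotations by image file and convert boxes to XYXY.

    Hypothetical helper: `coco` is the dict loaded from a COCO
    instances JSON file. Crowd annotations are skipped.
    """
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    prompts = defaultdict(list)
    for ann in coco["annotations"]:
        if ann.get("iscrowd", 0):
            continue
        x, y, w, h = ann["bbox"]  # COCO: top-left corner plus size
        prompts[file_names[ann["image_id"]]].append({
            "box_xyxy": [x, y, x + w, y + h],
            "category_id": ann["category_id"],
        })
    return dict(prompts)


# Tiny in-memory example; the file name and ids are made up.
coco = {
    "images": [
        {"id": 1, "file_name": "road_001.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 3,
         "bbox": [100, 50, 40, 20], "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 3,
         "bbox": [0, 0, 640, 480], "iscrowd": 1},
    ],
}
prompts = coco_boxes_as_prompts(coco)
```

The ground-truth masks for the loss would come from each annotation's `segmentation` field, which this sketch leaves out.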
Hey, we wrote a blog post outlining some of the key steps to fine tune SAM using the mask decoder, particularly describing which functions from SAM to use to pre/post process the data so that it's in a good shape for fine tuning.
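To make the pre/post-processing point concrete, here is a small sketch of the geometry involved, written in plain Python so it runs without the model. It is an illustration, not code from the blog post. As far as I can tell, the package does this with `ResizeLongestSide` and `Sam.preprocess` on the way in and `Sam.postprocess_masks` on the way out. The sketch assumes the default 1024-pixel input size; the function names below are hypothetical.

```python
def preprocess_shape(old_h, old_w, long_side=1024):
    """Resize so the longest side equals `long_side`, keeping aspect ratio."""
    scale = long_side / max(old_h, old_w)
    return int(old_h * scale + 0.5), int(old_w * scale + 0.5)


def scale_box(box_xyxy, old_hw, new_hw):
    """Map an [x0, y0, x1, y1] box from the original image to the resized one."""
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    sx, sy = new_w / old_w, new_h / old_h
    x0, y0, x1, y1 = box_xyxy
    return [x0 * sx, y0 * sy, x1 * sx, y1 * sy]


original_hw = (480, 640)                      # H, W of the raw image
resized_hw = preprocess_shape(*original_hw)   # longest side becomes 1024
# The resized image is then padded on the bottom/right to a square input.
pad_hw = (1024 - resized_hw[0], 1024 - resized_hw[1])
box = scale_box([100, 50, 140, 70], original_hw, resized_hw)
```

Post-processing runs the same steps in reverse: the predicted mask is upsampled to the padded square, the padding is cropped away, and the result is resized back to the original height and width. Keeping track of both `original_hw` and `resized_hw` per image is what makes that possible.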
I know those guys!
I am a student and I am also looking forward to the release of the fine-tuning code so I can complete my academic paper. I would be very grateful if it were released.