segment-anything
interesting video object segmentation (and tracking) result
Hello, those are really awesome results. Did you run it with CUDA for all the frames? If so, was it in your own environment? I'm also trying to do the same thing with a video, but the CUDA problem really confuses me. I explained everything in #153. Any help would be appreciated!
Have you tried this repo? https://github.com/kadirnar/segment-anything-video
I just tried the repo you provided, and it has the exact same problem with the GPU (no output and no error message).
Can you share the image file?
I assume you're asking for the images from the demo above. It's a sequence called "horsejump-stick" from https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-Full-Resolution.zip There are many other sequences in the DAVIS dataset: https://davischallenge.org/index.html There are also many other datasets for video object segmentation and tracking, such as YouTube-VOS.
You can find other datasets at: https://youtube-vos.org/ https://henghuiding.github.io/MOSE/ https://github.com/Ali2500/BURST-benchmark
Hi, that is a really interesting result, and thank you for sharing! May I ask how you got rid of the background (and all other occlusions) and kept only the person and the horse? I tried to run SAM on a video, but every object in every frame seems to get a different class, so I could not track the class I am looking for across the video.
1. Initialization: start from the first frame with a mask of the foreground (in this case, the person and the horse)
2. Sample points in the foreground and use them as prompts to generate masks of different parts
3. Given a new frame, get new points by propagating the previous points
4. Use the new points as prompts to generate masks of different parts in the new frame
5. Keep track of the masks of different parts across frames by feature matching
6. Repeat steps 3-5 until the end of the sequence
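In case it helps, here is a rough sketch of how the steps above could fit together. It is not the code behind the demo. The `segment` and `propagate` callables are hypothetical placeholders: in practice `segment` would wrap a point-prompted SAM call and `propagate` would be some point tracker, neither of which is specified above. The mean-colour feature and the toy video are also assumptions, made only so the loop runs with NumPy alone.

```python
import numpy as np

def sample_points(mask, n, rng):
    """Step 2: pick n random (row, col) points inside a boolean foreground mask."""
    coords = np.argwhere(mask)
    idx = rng.choice(len(coords), size=min(n, len(coords)), replace=False)
    return coords[idx]

def mask_feature(frame, mask):
    """Toy feature for step 5 (assumption): mean colour of the pixels under the mask."""
    return frame[mask].mean(axis=0)

def match_parts(prev_feats, new_feats):
    """Step 5: give each new mask the id of the closest previous part feature."""
    dists = np.linalg.norm(new_feats[:, None, :] - prev_feats[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def track(frames, first_mask, segment, propagate, n_points=4, seed=0):
    """Run steps 1-6; returns one {part_id: mask} dict per frame."""
    rng = np.random.default_rng(seed)
    points = sample_points(first_mask, n_points, rng)       # steps 1-2
    masks = [segment(frames[0], p) for p in points]
    feats = np.stack([mask_feature(frames[0], m) for m in masks])
    results = [dict(enumerate(masks))]
    for prev, cur in zip(frames[:-1], frames[1:]):          # step 6: loop over frames
        points = propagate(prev, cur, points)               # step 3
        masks = [segment(cur, p) for p in points]           # step 4
        new_feats = np.stack([mask_feature(cur, m) for m in masks])
        ids = match_parts(feats, new_feats)                 # step 5
        results.append({int(i): m for i, m in zip(ids, masks)})
        feats[ids] = new_feats                              # refresh matched part features
    return results

# --- Toy stand-ins (hypothetical), only so the sketch runs end to end ---

def toy_segment(frame, point):
    """Placeholder for a point-prompted segmenter: all pixels sharing the point's colour."""
    r, c = point
    return np.all(frame == frame[r, c], axis=-1)

def toy_propagate(prev_frame, cur_frame, points):
    """Placeholder for a point tracker: the toy objects move one pixel right per frame."""
    return points + np.array([0, 1])

def make_toy_video(n_frames=3, size=16):
    """Two coloured squares sliding right on a black background."""
    frames = []
    for t in range(n_frames):
        f = np.zeros((size, size, 3))
        f[2:6, 2 + t:6 + t] = (1.0, 0.0, 0.0)    # red part
        f[9:13, 2 + t:6 + t] = (0.0, 0.0, 1.0)   # blue part
        frames.append(f)
    return frames, frames[0].any(axis=-1)

frames, first_mask = make_toy_video()
tracked = track(frames, first_mask, toy_segment, toy_propagate)
```

Because the part ids stay the same from frame to frame, you can keep only the parts you care about and drop everything else, which is how the background ends up removed. If two points land on the same part, their masks are matched to the same id and collapse into one entry.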
Take a look at https://github.com/z-x-yang/Segment-and-Track-Anything