
interesting video object segmentation (and tracking) result

Open Pilot-LH opened this issue 1 year ago • 6 comments

low

Pilot-LH avatar Apr 10 '23 18:04 Pilot-LH

Hello, those are really awesome results. Did you run it with CUDA for all the frames? If so, was it in your own environment? I'm also trying to do the same thing with a video, but a CUDA problem is really confusing me. I explained everything in #153; any help would be appreciated!

AbdelazizHamadi avatar Apr 10 '23 18:04 AbdelazizHamadi

Have you tried this repo? https://github.com/kadirnar/segment-anything-video

kadirnar avatar Apr 10 '23 19:04 kadirnar

I just tried the repo you provided, and it gives the exact same problem with the GPU (no output and no error message).

AbdelazizHamadi avatar Apr 10 '23 20:04 AbdelazizHamadi

Can you share the image file?

kadirnar avatar Apr 10 '23 20:04 kadirnar

I assume you're asking for the images used in the demo above. It's a sequence called "horsejump-stick" from https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-Full-Resolution.zip. There are many other sequences in the DAVIS dataset: https://davischallenge.org/index.html. There are also many other datasets for video object segmentation and tracking, such as YouTube-VOS.

Pilot-LH avatar Apr 12 '23 01:04 Pilot-LH

You can find other datasets at: https://youtube-vos.org/ https://henghuiding.github.io/MOSE/ https://github.com/Ali2500/BURST-benchmark

bhack avatar Apr 12 '23 01:04 bhack

Hi, that is a really interesting result, and thank you for sharing! May I ask how you get rid of the background (and all other occlusions) and keep only the people + horse? I tried to run SAM on a video, but every object in every frame seems to be assigned a different class, so I could not track the class I am looking for across the video.

DavidTu21 avatar Apr 19 '23 04:04 DavidTu21

  1. Initialization: first frame with mask of the foreground (in this case, the person and the horse)
  2. Sample points in the foreground and use them as prompts to generate masks of different parts
  3. Given a new frame, get new points by propagation of the previous points
  4. Use new points as prompts to generate masks of different parts in the new frame
  5. Keep track of the masks of different parts across frames by feature matching
  6. Iterate 3-5 until the end of the sequence
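
A rough sketch of how these steps could fit together, not the code used for the demo. It uses only the Python standard library so it runs anywhere; real code would use NumPy arrays. Everything here is an assumption made for illustration: the names `sample_foreground_points`, `match_parts` and `track`, greedy cosine-similarity matching as the "feature matching", and the two caller-supplied callables `segment_parts` (e.g. a wrapper around SAM's point-prompted prediction that also returns one feature vector per mask) and `propagate_points` (e.g. optical flow). The steps above do not say which propagation method or features were actually used.

```python
import math
import random
from typing import Any, Callable, Dict, List, Sequence, Tuple

Point = Tuple[int, int]        # (row, col) pixel coordinate
Mask = List[List[bool]]        # H x W boolean grid
Feature = Sequence[float]      # one descriptor per part mask (assumed)


def sample_foreground_points(mask: Mask, num_points: int, seed: int = 0) -> List[Point]:
    """Step 2: pick up to num_points pixels that lie inside the foreground mask."""
    foreground = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    rng = random.Random(seed)
    return rng.sample(foreground, min(num_points, len(foreground)))


def cosine_similarity(a: Feature, b: Feature) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_parts(prev_features: Dict[int, Feature],
                new_features: List[Feature],
                min_similarity: float = 0.5) -> Dict[int, int]:
    """Step 5: greedy one-to-one matching, returns {new mask index: track id}."""
    pairs = sorted(
        ((cosine_similarity(f_prev, f_new), tid, j)
         for tid, f_prev in prev_features.items()
         for j, f_new in enumerate(new_features)),
        reverse=True,
    )
    assignment: Dict[int, int] = {}
    used_tracks = set()
    for sim, tid, j in pairs:
        if sim < min_similarity:
            break  # remaining pairs are even less similar
        if tid in used_tracks or j in assignment:
            continue
        assignment[j] = tid
        used_tracks.add(tid)
    return assignment


def track(frames: Sequence[Any],
          first_mask: Mask,
          segment_parts: Callable[[Any, List[Point]], List[Tuple[Any, Feature]]],
          propagate_points: Callable[[Any, Any, List[Point]], List[Point]],
          num_points: int = 16) -> List[Dict[int, Any]]:
    """Steps 1-6: returns, per frame, a dict {track id: part mask}."""
    # Steps 1-2: prompts come from the given first-frame foreground mask.
    points = sample_foreground_points(first_mask, num_points)
    parts = segment_parts(frames[0], points)
    tracks = {i: feat for i, (_, feat) in enumerate(parts)}
    history = [{i: mask for i, (mask, _) in enumerate(parts)}]

    for prev_frame, frame in zip(frames, frames[1:]):
        points = propagate_points(prev_frame, frame, points)      # step 3
        parts = segment_parts(frame, points)                      # step 4
        assignment = match_parts(tracks, [f for _, f in parts])   # step 5
        frame_masks = {}
        for j, tid in assignment.items():
            frame_masks[tid] = parts[j][0]
            tracks[tid] = parts[j][1]  # refresh the stored feature for this part
        history.append(frame_masks)
    return history
```

Because only the foreground is ever prompted, the background never gets a mask, and each part keeps the same id across frames as long as its feature stays close to the previous one. Note that SAM expects point prompts as (x, y), so a real `segment_parts` wrapper would need to swap the (row, col) order used here.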

Pilot-LH avatar Apr 19 '23 13:04 Pilot-LH

Take a look at https://github.com/z-x-yang/Segment-and-Track-Anything

bhack avatar Apr 21 '23 00:04 bhack