
Idea: Annotating a sequence of images

Open doantientai opened this issue 10 months ago • 1 comments

Discussed in https://github.com/cvat-ai/cvat/discussions/7818

Originally posted by doantientai on April 29, 2024

I am working on a project in which we have to assign a status label to each object (a classification problem). To do so, we have to look at the object from multiple angles, so we took multiple images of each object. For the annotation step, we now have to display a sequence of images each time we label an object. Surprisingly, there seems to be no native solution for this. I am trying two workarounds, but both come with drawbacks:

  • Contextual images: For each object, I display 1 main image + 12 contextual images to show the other angles. However, some objects have more than 13 images, so I would lose some images with this solution. Plus, the main image is huge compared to the contextual ones.
  • Concatenating images: I can merge all the views of each object into one single image. The problem is that the views are cropped to different sizes and shapes. Moreover, the number of views per object varies, which makes it difficult to merge them without resizing them to fit the same grid.
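For readers trying the concatenation workaround, here is a minimal sketch of the layout step, assuming each view is scaled to fit a fixed square cell while keeping its aspect ratio (letterboxing) and that a near-square grid is acceptable. The function name `plan_grid` and the `cell` parameter are hypothetical and not part of CVAT; the code only computes where each view would go and does not touch any pixels.

```python
import math
from typing import List, Tuple

Size = Tuple[int, int]                  # (width, height) of one cropped view
Placement = Tuple[int, int, int, int]   # (x, y, new_width, new_height) on the canvas


def plan_grid(sizes: List[Size], cell: int = 256) -> Tuple[Size, List[Placement]]:
    """Plan a near-square grid for views of differing sizes (hypothetical helper).

    Each view is scaled to fit inside a cell x cell square without changing
    its aspect ratio, then centered in its cell.
    """
    if not sizes:
        raise ValueError("at least one view is required")

    # Near-square grid: the number of columns grows with the number of views.
    cols = math.ceil(math.sqrt(len(sizes)))
    rows = math.ceil(len(sizes) / cols)

    placements: List[Placement] = []
    for i, (w, h) in enumerate(sizes):
        # Uniform scale so the longer side of the view fills the cell.
        scale = min(cell / w, cell / h)
        new_w = min(cell, max(1, round(w * scale)))
        new_h = min(cell, max(1, round(h * scale)))

        # Row-major cell position, with the view centered inside the cell.
        col, row = i % cols, i // cols
        x = col * cell + (cell - new_w) // 2
        y = row * cell + (cell - new_h) // 2
        placements.append((x, y, new_w, new_h))

    return (cols * cell, rows * cell), placements


if __name__ == "__main__":
    canvas, boxes = plan_grid([(512, 256), (100, 400), (256, 256)], cell=256)
    print(canvas, boxes)
```

The views are still resized, so this does not remove the drawback described above; it only avoids distorting them and copes with a varying number of views. The actual resizing and pasting could then be done with an image library such as Pillow, using the returned placements.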

What I would like to suggest is the ability to have more contextual images (around 50, for example) and the ability to hide the main image so that it does not take up too much space in the interface.

Thank you!

doantientai avatar Apr 29 '24 08:04 doantientai

I am facing a similar problem: I'm trying to label the differences between two images, so viewing the second image simultaneously would be very helpful.

realtimshady1 avatar Jun 14 '24 07:06 realtimshady1