Video Annotation with MONAI Label - CVAT support

Open tangy5 opened this issue 3 years ago • 2 comments

The current endoscopy app annotates individual frames, and the scoring method recommends 2D images for each task. This issue discusses integrating video annotation with the current tooltracking and deid models.

Endoscopy app video annotation logic:

  1. MONAI Label starts studies with endoscopy videos in avi or mp4 format.
  2. The MONAI Label server takes video frames as input and predicts a score for each frame.
  3. A video-based scoring module calculates the metric.
  4. The MONAI Label server pushes suggested videos (avi, mp4) to CVAT based on the metric.
  5. Users annotate video frames in CVAT and mark the task complete once all frames in the video are annotated.
  6. The MONAI Label server periodically queries CVAT for finished tasks.
  7. The MONAI Label server trains or fine-tunes the model with the newly annotated video frames.
  8. The MONAI Label server predicts scores for the video frames again, and the active learning loop repeats Steps 2 through 8.
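Steps 2 to 4 can be sketched roughly as follows. This is only an illustration, not the MONAI Label implementation: `score_video` and `rank_videos` are hypothetical helpers, the mean is an assumed aggregation, and it assumes that a higher score means the video is more worth annotating.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence


def score_video(
    frame_scores: Sequence[float],
    reduce: Callable[[Sequence[float]], float] = mean,
) -> float:
    """Collapse per-frame scores into one video-level metric (mean by default)."""
    if not frame_scores:
        raise ValueError("video has no scored frames")
    return float(reduce(frame_scores))


def rank_videos(
    scores_by_video: Dict[str, Sequence[float]],
    top_k: int = 1,
) -> List[str]:
    """Return the top_k video names, highest video-level score first.

    Assumption: a higher score marks a video as a better annotation candidate.
    """
    ranked = sorted(
        scores_by_video,
        key=lambda name: score_video(scores_by_video[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical per-frame scores for two videos in the studies folder.
suggested = rank_videos({"a.mp4": [0.1, 0.2], "b.avi": [0.9, 0.7]}, top_k=1)
```

Passing `reduce=max` instead would rank a video by its single most uncertain frame rather than by its average.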

Development steps:

  • [x] 1. Prepare endoscopy sample videos and a dataset.
  • [x] 2. Create a video reader and image loader that work with videos for the tooltracking and deid function modules.
  • [x] 3. Create a scoring method based on videos instead of 2D images.
  • [ ] 4. Push/transfer suggested videos from MONAI Label to CVAT.
  • [x] 5. MONAI Label query module that periodically checks whether videos are annotated.
  • [x] 6. Datastore that communicates videos and video labels between MONAI Label and CVAT.
  • [x] 7. Check whether the active learning loop is compatible with videos.
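The periodic check in item 5 could look something like the sketch below. It deliberately avoids any real CVAT call: `fetch_finished` is a hypothetical callable standing in for whatever query returns the IDs of completed tasks, and `on_finished` is a hypothetical callback (for example, one that pulls labels into the datastore).

```python
import time
from typing import Callable, Iterable, List, Set


def poll_finished_tasks(
    fetch_finished: Callable[[], Iterable[str]],
    on_finished: Callable[[str], None],
    interval_s: float = 60.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll for finished tasks and handle each task ID exactly once.

    fetch_finished and on_finished are placeholders (assumptions), not
    MONAI Label or CVAT APIs.
    """
    seen: Set[str] = set()
    handled: List[str] = []
    for attempt in range(max_polls):
        for task_id in fetch_finished():
            if task_id not in seen:
                seen.add(task_id)
                on_finished(task_id)
                handled.append(task_id)
        # Wait between polls, but not after the final one.
        if attempt < max_polls - 1:
            sleep(interval_s)
    return handled
```

Injecting `sleep` keeps the loop easy to test without real waiting; a long-running server would more likely schedule this as a background job than bound it with `max_polls`.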

Video annotation and active learning workflow, based on the prior frame integration method.

@SachidanandAlle @Nic-Ma Please correct me or share any thoughts. CVAT can only send per-frame requests to functions, so we need to handle video loading in the model's data loader and add a video-based scoring method.
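As a rough sketch of the loader side, the helper below walks a stream of decoded frames and keeps each frame's index, so per-frame predictions can be mapped back to positions in the video. `sample_frames` and its `stride` parameter are hypothetical; the frames could come from any decoder (for example, repeated reads from OpenCV's `cv2.VideoCapture`), which is left out here to keep the example self-contained.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(
    frames: Iterable[Frame],
    stride: int = 1,
) -> Iterator[Tuple[int, Frame]]:
    """Yield (frame_index, frame) for every stride-th frame of a video.

    Hypothetical helper: `frames` is any iterable of already decoded frames.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, frame


# Stand-in "frames" (plain strings) in place of decoded images.
picked = list(sample_frames(["f0", "f1", "f2", "f3", "f4"], stride=2))
```

With `stride=1` every frame is scored; a larger stride trades temporal coverage for faster inference over long videos.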

tangy5, Sep 19 '22 03:09

@SachidanandAlle: Is video annotation functionality available in the current release of MONAILabel?

bilalcodehub, Aug 22 '23 19:08

I don't think so yet. Training happens per frame, so annotation is done frame by frame.

But CVAT lets you run the model over the whole video in a single click.

@tangy5 can share how to support videos. He has built a prototype that accepts videos in your studies folder instead of 2D images and runs inference on the video instead of on individual frames.

SachidanandAlle, Aug 23 '23 15:08