Rectlabel-support
Video support
I have a couple of suggestions for additions to RectLabel:
- Add the ability to adjust which frames are captured from a video, i.e., instead of grabbing every frame, let the user choose a sampling rate: 1 frame per four seconds, etc.
- Ability to run inference on a video clip and on a live video feed: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture
- I'm not sure if you've seen this, but CoreMLPlayer has some really cool features: https://github.com/npna/CoreMLPlayer
I've asked the developer to allow exporting to YOLO format, like RectLabel, and to incorporate a camera input. Adding his app's capabilities to RectLabel would be really fantastic.
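To make the first request concrete, the frame-sampling idea can be sketched as a small helper that turns a capture interval into frame indices. This is only an illustration of the requested behavior, not RectLabel's actual API; the function name and parameters are made up for the example.

```python
def frame_indices(total_frames, video_fps, seconds_per_capture):
    """Return the frame indices to grab when capturing one frame every
    `seconds_per_capture` seconds from a video recorded at `video_fps`
    frames per second. (Illustrative helper, not part of RectLabel.)"""
    step = max(1, round(video_fps * seconds_per_capture))
    return list(range(0, total_frames, step))

# A 30 fps clip with 300 frames, sampled once every 4 seconds:
print(frame_indices(300, 30, 4))  # -> [0, 120, 240]
```

The `max(1, ...)` guard keeps the sampler from stalling when the requested interval is shorter than one frame.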
We've also developed a QuickTime-based video player that supports drawing localization boxes on the video: https://github.com/mbari-org/Sharktopoda/releases/tag/2.0.3
Thanks for writing the issue.
Thanks for introducing CoreMLPlayer; we checked how it works. We will implement your feature requests one by one.
- Frames per second
- Running a Core ML model on a video clip and saving the YOLO txt files.
- Running a Core ML model on a live video capture and saving the YOLO txt files.
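For the "save the YOLO txt files" items above, each detected box becomes one line of `class x_center y_center width height`, with coordinates normalized by the image size. A minimal sketch of that conversion (the helper name and the pixel-space box layout are assumptions for the example):

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a pixel-space box (top-left x, y, width, height) into a
    normalized YOLO txt line: 'class x_center y_center width height'."""
    xc = (x + w / 2) / img_w   # box center, normalized to [0, 1]
    yc = (y + h / 2) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# A box at (100, 50) of size 200x100 px in a 640x480 frame:
print(to_yolo_line(0, 100, 50, 200, 100, 640, 480))
# -> 0 0.312500 0.208333 0.312500 0.208333
```

One such line per box, one txt file per frame, is the layout YOLO training tools expect.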
For Sharktopoda, do you have documentation on how to use it?
Thank you.

Documentation is probably pretty sparse. From a user's perspective, the instructions are here: https://docs.mbari.org/vars-annotation/setup/
The video player communicates with our annotation app, VARS, via UDP. The video player lets us draw and display ML proposals on the video itself. VARS reads from and writes to a SQL Server database directly. Each localization is stored in a database column like: {"x": 1527, "y": 323, "width": 43, "height": 119, "generator": "yolov5-mbari315k"}. The class label would be in a different column.
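Since the localization is stored as a JSON string in one column and the label in another, reading a box back out is just a JSON parse. A sketch, assuming hypothetical column names (`localization`, `label`) and a made-up label value:

```python
import json

# A hypothetical row as VARS might fetch it; the label value "Sebastes"
# and the column names are illustrative, not from the real schema.
row = {
    "localization": '{"x": 1527, "y": 323, "width": 43, "height": 119,'
                    ' "generator": "yolov5-mbari315k"}',
    "label": "Sebastes",
}

box = json.loads(row["localization"])
print(row["label"], box["x"], box["y"], box["width"], box["height"])
```

Keeping the box as JSON in a single column makes it easy to add fields like `generator` without a schema change.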
I am sorry for the late reply.
The new version 2023.12.08 has been released. When converting a video to image frames, you can now set the frames per second.
I will implement the remaining requests one by one.