mmtracking
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
How can we conduct multimodal multi-object tracking, and how is it evaluated?
I've encountered an issue where the JSON files generated by `imagenet2coco_vid.py` are significantly larger than those provided in the [TransVOD documentation](https://drive.google.com/drive/folders/1cCXY41IFsLT-P06xlPAGptG7sc-zmGKF). I am curious if there's a particular method...
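Not an answer from the maintainers, but one way to narrow down such a size difference is to compare what each top-level section of the two annotation files contains. The sketch below assumes the files follow the COCO-VID layout (`videos`, `images`, `annotations`, `categories` keys, as produced by the conversion scripts); it simply counts the entries under each key so you can see which section is inflated. The sample data written at the end is a hypothetical stand-in, not real ImageNet VID annotations.

```python
import json

def summarize_coco_json(path):
    """Return {top-level key: entry count} for a COCO-style annotation
    file, to see which section accounts for the file's size."""
    with open(path) as f:
        data = json.load(f)
    return {
        key: (len(value) if isinstance(value, (list, dict)) else 1)
        for key, value in data.items()
    }

if __name__ == "__main__":
    # Tiny stand-in annotation file for demonstration only.
    sample = {
        "videos": [{"id": 1, "name": "val/clip_0"}],
        "images": [{"id": i, "video_id": 1} for i in range(3)],
        "annotations": [{"id": i, "image_id": i % 3} for i in range(5)],
        "categories": [{"id": 1, "name": "airplane"}],
    }
    with open("sample_vid.json", "w") as f:
        json.dump(sample, f)

    for key, count in summarize_coco_json("sample_vid.json").items():
        print(f"{key}: {count} entries")
```

Running this on both your generated file and the reference file, then diffing the per-key counts, usually reveals whether the discrepancy comes from extra annotations, duplicated image records, or simply different JSON formatting (e.g. indentation).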