centernet-lightning
Checklist
Thoughts
- segmentation head (semantic and instance)
  - instance segmentation requires some kind of pooling?
- human pose head
- tracker.py -> (outside model). include the BYTE algorithm. can work with a pure detector (CenterNet) or with appearance embeddings (FairMOT)
- utilities: inference image folder, video. method or cli script?
- heatmap loss
  - Gaussian: check other keypoint-based models e.g. HRNet
  - Generalized focal loss?
- bbox training samples: only GT center point (CenterNet); 3x3 center area (FCOS and YOLOX); Gaussian area with weighted sum (TTFNet). 5x5 center area?
- to clean up: test folder, other datasets, config files
- update FairMOT. mechanism to handle classifier head
  - construction and how to inherit from CenterNet
  - if built from scratch, need to separate detection decode functions from CenterNet
  - compute_loss: make a function to get target boxes -> can reuse. but embeddings? which one will be used for training?
  - heatmap loss: compute for every image?
- multi-head support. all detectors use multi-head. may improve performance significantly
- probably don't separate objects by sizes to place on different feature maps. use some schemes like HRNet: train on all feature map levels, but during inference, fuse them together
- augmentations
- torchvision
- group by aspect ratio link. a good idea to avoid changing objects' aspect ratios. does Albumentations have similar functions? does YOLOv5 have something similar?
- new recipe pytorch/vision/issues/5307. still developing. presets and implementation
- YOLOv5: recipe
- mosaic
- mmdet? link
- TensorFlow? link
- torchvision
- model + train recipe with resnet50 should be competitive with other detectors
https://github.com/PaddlePaddle/PaddleDetection
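
The "bbox training samples" options noted above (center point only, 3x3 area, 5x5 area, Gaussian-weighted area) can be compared with a small sketch. This is only an illustration in plain Python: the function names, the `(x1, y1, x2, y2)` box format, and the `stride` / `radius` / `sigma` parameters are assumptions made here, not code from this repo, and the Gaussian weighting is a simplified stand-in rather than TTFNet's exact formulation.

```python
import math


def center_cell(box, stride):
    """Return the integer feature-map cell that contains the box center."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / stride
    cy = (y1 + y2) / 2 / stride
    return int(cx), int(cy)


def positive_cells(box, stride, fmap_w, fmap_h, radius=0):
    """Cells treated as positive box-regression samples for one GT box.

    radius=0 -> only the center cell (CenterNet-style)
    radius=1 -> 3x3 area around the center
    radius=2 -> 5x5 area around the center
    Cells outside the feature map are skipped.
    """
    cx, cy = center_cell(box, stride)
    cells = []
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if 0 <= x < fmap_w and 0 <= y < fmap_h:
                cells.append((x, y))
    return cells


def gaussian_weights(box, stride, fmap_w, fmap_h, radius, sigma):
    """Weight each positive cell by an isotropic Gaussian around the center.

    Weights are normalized to sum to 1, so the box loss of one object is a
    weighted sum over its cells (simplified, hypothetical scheme).
    """
    cx, cy = center_cell(box, stride)
    raw = {}
    for x, y in positive_cells(box, stride, fmap_w, fmap_h, radius):
        dist_sq = (x - cx) ** 2 + (y - cy) ** 2
        raw[(x, y)] = math.exp(-dist_sq / (2 * sigma ** 2))
    total = sum(raw.values())
    return {cell: w / total for cell, w in raw.items()}


if __name__ == "__main__":
    box = (32, 32, 96, 64)  # GT box in input-image pixels
    for radius in (0, 1, 2):
        cells = positive_cells(box, stride=4, fmap_w=128, fmap_h=128, radius=radius)
        print(radius, len(cells))
```

A single `radius` argument keeps the three fixed-area schemes on one code path, so switching between them during experiments would be a one-parameter change.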
TODO
Modelling
- [ ] Detections on multi-level feature maps
Training
- [ ] Refine transformation recipe for training and validation
- [ ] Implement Trivial Augment in Albumentations
- [ ] Implement Mosaic Augmentation
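
For reference, a minimal sketch of what the Mosaic item could look like. It is deliberately simplified and is not the YOLOv5 implementation: images are plain nested lists, each source image is cropped from its top-left corner instead of being randomly scaled and aligned to the mosaic center, and the `mosaic` function with its `out_size`, `center`, `fill`, and `min_box` parameters is a hypothetical name and signature chosen for this example.

```python
import random


def mosaic(images, boxes_per_image, out_size, center=None, fill=114, min_box=2):
    """Combine 4 images into one out_size x out_size mosaic.

    images: list of 4 images, each a list of rows of pixel values (H x W)
    boxes_per_image: list of 4 lists of (x1, y1, x2, y2) boxes in pixels
    Returns the mosaic canvas and the remapped boxes.
    """
    assert len(images) == 4 and len(boxes_per_image) == 4
    if center is None:
        # Random split point, kept away from the canvas borders.
        cx = random.randint(out_size // 4, 3 * out_size // 4)
        cy = random.randint(out_size // 4, 3 * out_size // 4)
    else:
        cx, cy = center

    canvas = [[fill] * out_size for _ in range(out_size)]
    # Quadrants on the canvas as (x_start, y_start, x_end, y_end):
    # top-left, top-right, bottom-left, bottom-right.
    regions = [
        (0, 0, cx, cy),
        (cx, 0, out_size, cy),
        (0, cy, cx, out_size),
        (cx, cy, out_size, out_size),
    ]

    out_boxes = []
    for img, boxes, (rx1, ry1, rx2, ry2) in zip(images, boxes_per_image, regions):
        # The pasted crop is limited by both the quadrant and the source image.
        h = min(ry2 - ry1, len(img))
        w = min(rx2 - rx1, len(img[0]))
        for row in range(h):
            canvas[ry1 + row][rx1:rx1 + w] = img[row][:w]
        for x1, y1, x2, y2 in boxes:
            # Clip the box to the pasted crop, then shift by the quadrant offset.
            nx1, ny1 = max(x1, 0), max(y1, 0)
            nx2, ny2 = min(x2, w), min(y2, h)
            # Drop boxes that became too small after clipping.
            if nx2 - nx1 >= min_box and ny2 - ny1 >= min_box:
                out_boxes.append((nx1 + rx1, ny1 + ry1, nx2 + rx1, ny2 + ry1))
    return canvas, out_boxes
```

Dropping boxes that shrink below `min_box` after clipping avoids training on slivers of objects cut by the quadrant borders; the threshold value here is arbitrary.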
Done
- [x] Change TrackEval dependency to the original GitHub repo (https://github.com/JonathonLuiten/TrackEval)
- [x] Use vision-backbones repo to replace current backbone + neck implementation (https://github.com/gau-nernst/vision-backbones)
- [x] Replace custom config with Lightning CLI (which uses jsonargparse)
- [x] Dataset: only support COCO to simplify pipeline
- [x] Add DDP support