huggingface.js
                                
                        Adding a new task to Hub: object tracking in videos
Opening the discussion from Discord here. @osanseviero, what is required for us to have a separate pipeline/task for object tracking in videos? Pinging @sgugger too!
To have a pipeline in Transformers, we'd need pretrained models that do this. I don't think that is the case right now.
Models:
- https://huggingface.co/kadirnar/osnet_x0_5_imagenet
- https://huggingface.co/kadirnar/osnet_x0_25_imagenet
- https://huggingface.co/kadirnar/osnet_x1_0_imagenet
Supported tracking algorithms:
- [X] Sort
- [X] StrongSort
- [X] ByteTrack
- [X] OcSort
- [X] Norfair
Demo: https://huggingface.co/spaces/kadirnar/torchyolo
Hey all! Let me copy-paste the template for tracking new tasks
Note that you're not expected to do all of the following steps. This helps track all the steps required to get a new task fully supported in the Hub 🔥
- [ ] Integration with Inference API. Select at least one of the following:
  - [ ] Added a transformers pipeline
  - [ ] Added to Community Inference API for 3rd party library
  - [ ] Added to Community Inference API for generic
- [ ] Added a 
- [ ] Added basic UI elements (icon, order specification, etc)
- [ ] Added a widget
Integration guide: https://hf.co/docs/hub/models-tasks
@osanseviero @sgugger My question was more about whether this should be a separate task or whether it could be covered under object detection as it currently exists in our ecosystem. That's why I asked the question above 🙂
I think it's different: one task operates on video inputs (so it has temporal information), while the other operates only on static input images.
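
To make the distinction concrete, here is a rough sketch of what a tracker adds on top of a per-frame detector: state that persists between frames, so that boxes in consecutive frames receive a stable identity. This is not how Sort, ByteTrack, or any of the libraries listed above are implemented. `GreedyIouTracker`, `iou`, and the `0.3` threshold are hypothetical names and values chosen for illustration, and the detections are hard-coded boxes rather than model output.

```python
from itertools import count

# A box is (x1, y1, x2, y2) in pixel coordinates.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Illustrative tracker: matches each detection to the previous-frame
    box with the highest IoU, otherwise starts a new track."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold
        self.tracks: dict[int, Box] = {}  # track id -> box in the last frame
        self._ids = count(1)

    def update(self, detections: list[Box]) -> list[tuple[int, Box]]:
        unmatched = dict(self.tracks)  # tracks not yet claimed in this frame
        current: dict[int, Box] = {}
        results: list[tuple[int, Box]] = []
        for box in detections:
            best_id, best_score = None, self.iou_threshold
            for track_id, previous in unmatched.items():
                score = iou(box, previous)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                best_id = next(self._ids)  # no overlap: new object
            else:
                del unmatched[best_id]  # each track is matched at most once
            current[best_id] = box
            results.append((best_id, box))
        self.tracks = current  # tracks with no match are dropped
        return results


tracker = GreedyIouTracker()
frame_1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame_2 = tracker.update([(51, 50, 61, 60), (1, 0, 11, 10)])
print(frame_1)
print(frame_2)
```

An object detection model can answer each `update` call's input on its own, but the identities returned across calls only exist because the tracker carries information from one frame to the next, which is the temporal aspect mentioned above.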