
Can I use it to train for the multi-camera multi-person tracking problem?

Open KunalArora opened this issue 4 years ago • 14 comments

Hello organizers,

Thank you for the code; it's great work. For my thesis, I want to build a system that tracks multiple people across multiple cameras. I believe your code can be extended, or your models retrained, to do that. Could you please share some insights on whether this is possible?

KunalArora avatar Mar 19 '20 14:03 KunalArora

Sure. The same code can be applied to the person scenario. You just need to change the object detector to output person locations.

zhengthomastang avatar Mar 19 '20 19:03 zhengthomastang

Thank you for the response. But I believe I need to retrain the whole system (detector, ReID, and tracking) on person-specific datasets, right?

And, as you mentioned, I need to change the object detector to output person locations. That means I need to modify detection/tools/infer_simple_txt.py, since that is what run.sh calls, right?

KunalArora avatar Mar 19 '20 21:03 KunalArora

The provided pre-trained YOLOv2 models cannot be used to detect people. You can use models pre-trained on ImageNet or MS COCO instead; all you need to do is extract the detected people from the results. We suggest trying more advanced object detectors like YOLOv3 or Faster R-CNN; the pre-trained models provided with them should be accurate enough. The ReID and tracking parts do not depend on the object type.
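The filtering step described above can be sketched as follows. This is a minimal illustration, assuming the detector's output has been parsed into a list of labeled, scored boxes; the dictionary format and the `filter_person_detections` helper are assumptions for the example, not the repository's actual output format.

```python
# Keep only "person" detections from a generic COCO-trained detector's output
# before feeding them to the ReID and tracking stages.

def filter_person_detections(detections, min_score=0.5):
    """Return detections labeled 'person' with confidence >= min_score."""
    return [
        d for d in detections
        if d["label"] == "person" and d["score"] >= min_score
    ]

if __name__ == "__main__":
    raw = [
        {"label": "car",    "score": 0.9, "bbox": (10, 10, 50, 40)},
        {"label": "person", "score": 0.8, "bbox": (60, 20, 90, 120)},
        {"label": "person", "score": 0.3, "bbox": (5, 5, 20, 60)},   # too low
    ]
    people = filter_person_detections(raw)
    print(len(people))  # prints 1
```

The low-confidence person box is dropped along with the car, so downstream modules only ever see confident person locations.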

zhengthomastang avatar Mar 20 '20 02:03 zhengthomastang

Okay, great. So if I am able to change the detector to output person detections, I don't need to make further changes to ReID and the tracker, since they take their input from detection, right?

Any pointers on where I need to make these changes? A more detailed reference would be appreciated.
Also, I believe I can use this code for real-time tracking, right?

KunalArora avatar Mar 20 '20 09:03 KunalArora

For object detection, there is not much you need to change: use the pre-trained models to generate detection results and extract the person objects from them. For ReID, we used transfer learning, i.e., a pre-trained model as a fixed feature extractor, so no training is needed; however, we found that metric learning leads to better performance. You can refer to our CVPR 2019 paper on the CityFlow dataset. We also have a better single-camera tracker that you can find here: https://github.com/ipl-uw/2019-CVPR-AIC-Track-1-UWIPL.
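The ReID matching step that uses those extracted features can be sketched like this. It is a toy illustration, assuming appearance features have already been produced by a pre-trained CNN used as a fixed feature extractor; the `match_across_cameras` helper, the threshold value, and the tiny 3-dimensional vectors are all assumptions for readability (real ReID features are high-dimensional).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_across_cameras(query_feat, gallery_feats, threshold=0.7):
    """Return the index of the best-matching gallery feature,
    or None if no similarity exceeds the threshold."""
    best_idx, best_sim = None, threshold
    for i, feat in enumerate(gallery_feats):
        sim = cosine_similarity(query_feat, feat)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx

if __name__ == "__main__":
    query = [1.0, 0.0, 0.0]                       # target seen in camera A
    gallery = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]  # candidates in camera B
    print(match_across_cameras(query, gallery))   # prints 1
```

Metric learning, as mentioned above, would replace the plain cosine distance with a learned distance that pulls same-identity features together, typically improving cross-camera matching.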

Since our code is divided into separate components, you may need to integrate them into a standalone pipeline for real-time tracking.

zhengthomastang avatar Mar 20 '20 21:03 zhengthomastang

Can you please point me to the actual repository for the multi-camera vehicle tracking code? The Track 3/1_Multi-Camera Vehicle Tracking and Re-identification folder contains no code except a Readme.md.

KunalArora avatar Mar 23 '20 17:03 KunalArora

You can find the link to all the repositories we used here: https://github.com/zhengthomastang/2018AICity_TeamUW/tree/master/Track3

The main repository is this one: https://github.com/AlexXiao95/Multi-camera-Vehicle-Tracking-and-Reidentification

zhengthomastang avatar Mar 23 '20 17:03 zhengthomastang

Okay, thank you so much for the response. One more thing: is it possible to train and run this on CPU only, without any GPU support?

KunalArora avatar Mar 23 '20 17:03 KunalArora

Yes, it is possible to extract features with CPU only. You can also try more advanced pre-trained models in PyTorch, which are probably easier to run on CPU for inference.

zhengthomastang avatar Mar 23 '20 17:03 zhengthomastang

@KunalArora Have you been able to get the multi-camera tracker to work? I am working on a similar problem and want to know what modifications are needed to get this working ASAP (apart from changing the backend detector).

haroonrashid235 avatar Apr 06 '20 17:04 haroonrashid235

@zhengthomastang

demo.mp4

Is this the result of your demo? If yes, can you please confirm whether I can extend it to multiple targets? The demo shows only one target vehicle being tracked across multiple cameras. Can you also comment on the FPS you are getting?

haroonrashid235 avatar Apr 06 '20 17:04 haroonrashid235

@haroonrashid235

Yes, the demo was generated using the code in this repository, and you can extend it to multiple targets. For the 2018 challenge, we only selected the targets with the highest confidence because there were too many false positives. We didn't compute the FPS because the pipeline is broken down into multiple modules; further work is needed to combine them into an end-to-end framework.
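Combining the separate modules into one loop could look roughly like the sketch below. Every function name here (`detect`, `extract_features`, `update_tracks`) is a placeholder passed in by the caller, not the repository's actual API; the sketch only shows the data flow of an integrated per-frame pipeline.

```python
# Hypothetical glue code: run detection, feature extraction, and track
# updating frame by frame instead of as separate offline scripts.

def run_pipeline(frames, detect, extract_features, update_tracks):
    """Chain the pipeline stages over a sequence of frames.

    detect(frame)                 -> list of detections
    extract_features(frame, det)  -> appearance feature for one detection
    update_tracks(tracks, fid,
                  dets, feats)    -> updated track state
    """
    tracks = {}
    for frame_id, frame in enumerate(frames):
        detections = detect(frame)
        features = [extract_features(frame, d) for d in detections]
        tracks = update_tracks(tracks, frame_id, detections, features)
    return tracks
```

An end-to-end framework would plug the real detector, ReID extractor, and tracker into these slots, so each frame is processed once instead of writing intermediate files between stages.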

zhengthomastang avatar Apr 06 '20 17:04 zhengthomastang

@haroonrashid235 I am still working on making this work for people and developing the end-to-end pipeline from detection to tracking.

@zhengthomastang
I would really appreciate your help in letting me know what could be done to develop the end-to-end pipeline. A general guideline or idea would help a lot.

KunalArora avatar Apr 07 '20 15:04 KunalArora

@KunalArora You can refer to my paper to get an idea of the workflow of multi-target multi-camera (MTMC) tracking: https://zhengthomastang.github.io/publications/CityFlow/

zhengthomastang avatar Apr 07 '20 15:04 zhengthomastang