
Problem with the CrowdHuman dataset?

Open angryhen opened this issue 4 years ago • 7 comments

The CrowdHuman dataset doesn't have any ID labels, so how does training on it succeed?

angryhen avatar Dec 21 '20 13:12 angryhen

We give each bbox a unique ID, and the total number of IDs is larger than 1,000,000.

ifzhang avatar Jan 15 '21 08:01 ifzhang
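
ifzhang's answer above can be illustrated with a minimal sketch. It is not taken from the FairMOT code: the function name `assign_unique_ids`, the `(x, y, w, h)` box layout, and the dictionary input are all assumptions made for illustration. It only shows the idea of numbering every box in an image-level dataset so that each one becomes its own identity.

```python
from typing import Dict, List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def assign_unique_ids(
    boxes_per_image: Dict[str, List[Box]],
) -> Tuple[Dict[str, List[Tuple[int, Box]]], int]:
    """Give every box in the dataset its own identity label.

    Returns the labeled boxes and the total number of identities,
    which is what an ID classifier would need as its class count.
    """
    next_id = 0
    labeled: Dict[str, List[Tuple[int, Box]]] = {}
    # Sort image names so the numbering is reproducible across runs.
    for image_name in sorted(boxes_per_image):
        labeled[image_name] = []
        for box in boxes_per_image[image_name]:
            labeled[image_name].append((next_id, box))
            next_id += 1  # no two boxes ever share an ID
    return labeled, next_id


if __name__ == "__main__":
    toy = {
        "b.jpg": [(0.0, 0.0, 10.0, 20.0)],
        "a.jpg": [(5.0, 5.0, 10.0, 20.0), (40.0, 8.0, 12.0, 25.0)],
    }
    labeled, num_ids = assign_unique_ids(toy)
    print(num_ids, labeled)
```

With this numbering, the number of identities equals the number of annotated boxes in the dataset.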

Can you share details about the self-supervised training on CrowdHuman? Thank you.

zengjie617789 avatar Jul 20 '21 08:07 zengjie617789

I would also be interested in learning more about the self-supervised learning part (especially about the implementation).

Great work and great model!

richardvogg avatar Sep 01 '21 13:09 richardvogg

As far as I understand, the self-supervised learning in FairMOT treats the refinement of object features as a classification task. That means the number of classes is the same as the number of objects in the dataset, for example 100000 in the CrowdHuman dataset.

zengjie617789 avatar Sep 02 '21 03:09 zengjie617789
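
The classification view zengjie617789 describes can be sketched in plain Python as follows. This is an illustration under assumptions, not the repository's implementation: the helper names `id_logits` and `softmax_cross_entropy`, the linear classifier, and the choice of softmax cross-entropy as the loss are not stated in this thread.

```python
import math
from typing import List


def id_logits(embedding: List[float], class_weights: List[List[float]]) -> List[float]:
    """Score an embedding against every identity with a linear layer (no bias).

    class_weights has one row per identity, so its length is the number
    of instances in the dataset under the one-class-per-instance setup.
    """
    return [sum(w * e for w, e in zip(row, embedding)) for row in class_weights]


def softmax_cross_entropy(logits: List[float], target_id: int) -> float:
    """Cross-entropy of softmax(logits) against a single target identity."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_id]


if __name__ == "__main__":
    # Toy setup: 3 identities, 2-dimensional embeddings.
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    embedding = [2.0, 0.0]
    logits = id_logits(embedding, weights)
    print(softmax_cross_entropy(logits, target_id=0))
```

The loss is small when the embedding scores highest for its own identity, which pushes embeddings of different instances apart.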

Thanks for the fast comment! :) You are probably right. When I read the paper, I was hoping that there would be some more contrastive-learning-type action happening. When I read the quote from the FairMOT paper which I pasted below, I thought there would be several instances of each image, created by random transformations, which could then be used to learn the representation.

"Inspired by [51], we regard each object instance in the dataset as a separate class and different transformations of the same object as instances in the same class. The adopted transformations include HSV augmentation, rotation, scaling, translation and shearing." (51 is this paper)

richardvogg avatar Sep 02 '21 08:09 richardvogg
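
One way to read the quoted passage is that each augmented view of a box keeps the ID of the original instance. The sketch below shows only that bookkeeping for the geometric transformations (rotation, scaling, translation, shearing); HSV augmentation is omitted because it does not move the box. All names and parameter ranges here are hypothetical, and how FairMOT actually samples or combines views is not stated in this thread.

```python
import math
import random
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # hypothetical (x, y, w, h) layout
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def random_affine(rng: random.Random, max_deg: float = 5.0, max_scale: float = 0.1,
                  max_shift: float = 10.0, max_shear_deg: float = 2.0) -> Affine:
    """Sample a 2x3 affine matrix: rotate and scale, then x-shear, then translate."""
    angle = math.radians(rng.uniform(-max_deg, max_deg))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    shear = math.tan(math.radians(rng.uniform(-max_shear_deg, max_shear_deg)))
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    a, b = scale * math.cos(angle), -scale * math.sin(angle)
    c, d = scale * math.sin(angle), scale * math.cos(angle)
    # [[1, shear], [0, 1]] @ [[a, b], [c, d]], with the translation appended.
    return ((a + shear * c, b + shear * d, tx), (c, d, ty))


def transform_box(box: Box, m: Affine) -> Box:
    """Map the four corners through m and return their axis-aligned bounds."""
    x, y, w, h = box
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    xs = [m[0][0] * px + m[0][1] * py + m[0][2] for px, py in corners]
    ys = [m[1][0] * px + m[1][1] * py + m[1][2] for px, py in corners]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def augmented_views(instance_id: int, box: Box, n_views: int,
                    seed: int = 0) -> List[Tuple[int, Box]]:
    """Create n_views transformed copies that all share the same instance ID."""
    rng = random.Random(seed)
    return [(instance_id, transform_box(box, random_affine(rng)))
            for _ in range(n_views)]


if __name__ == "__main__":
    for view in augmented_views(7, (10.0, 20.0, 30.0, 40.0), n_views=3):
        print(view)
```

Every view carries the same ID, so a classifier over IDs sees different transformations of one object as samples of one class.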

So during pre-training on CrowdHuman or other image-level datasets such as COCO, "self-supervised" means that only the Re-ID head is trained in a self-supervised way, while the other three heads are trained with supervision. Did I get that right?

GandalfTGrey avatar Jan 25 '22 08:01 GandalfTGrey

And in this self-supervised training task, the Re-ID head is encouraged to distinguish as many instances as possible?

GandalfTGrey avatar Jan 25 '22 08:01 GandalfTGrey