trackformer

how to add reid

Open liu0623 opened this issue 4 years ago • 11 comments

Hi, I want to add a reID network, but I do not know how to do it. Could you help me or give me some advice? Thank you!

liu0623 avatar Sep 29 '21 02:09 liu0623

Hello, adding a reID network should be quite straightforward, as much of the reID code for comparing embedding distances is already implemented. You "only" need to add the forward pass of image patches through a reID network. You can find some exemplary code in this repository, which has a similar structure to this one.
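As a rough illustration of "forwarding image patches", here is a minimal sketch that is not taken from either repository. It assumes boxes in `(x1, y1, x2, y2)` pixel format and uses NumPy only. `crop_patches` and `toy_reid_embed` are hypothetical names, and `toy_reid_embed` is a placeholder standing in for a trained reID network, not a real one.

```python
import numpy as np

def crop_patches(frame, boxes, out_hw=(8, 4)):
    """Crop each (x1, y1, x2, y2) box from an HxWx3 frame and resize it
    to a fixed size with nearest-neighbour sampling."""
    h, w = frame.shape[:2]
    out_h, out_w = out_hw
    patches = []
    for x1, y1, x2, y2 in boxes:
        # Clamp to the image and guarantee at least one pixel per side.
        x1 = int(np.clip(x1, 0, w - 1))
        y1 = int(np.clip(y1, 0, h - 1))
        x2 = int(np.clip(x2, x1 + 1, w))
        y2 = int(np.clip(y2, y1 + 1, h))
        crop = frame[y1:y2, x1:x2]
        # Pick evenly spaced source rows/columns for the resized patch.
        rows = np.arange(out_h) * crop.shape[0] // out_h
        cols = np.arange(out_w) * crop.shape[1] // out_w
        patches.append(crop[rows][:, cols])
    return np.stack(patches).astype(np.float32)

def toy_reid_embed(patches):
    """Placeholder for a real reID network (assumption): flattens each
    patch and L2-normalises it. Swap in a trained model's forward here."""
    flat = patches.reshape(len(patches), -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    return flat / np.maximum(norms, 1e-12)

# Synthetic frame and two boxes, one of which extends past the image border.
frame = np.random.default_rng(0).integers(0, 256, size=(48, 64, 3), dtype=np.uint8)
boxes = [(5, 5, 25, 45), (30.5, 2.0, 70.0, 40.0)]
embeds = toy_reid_embed(crop_patches(frame, boxes))
```

The resulting `embeds` array holds one vector per box. Vectors like these would be stored per track and compared by the existing embedding-distance code in place of `hs_embeds`.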

timmeinhardt avatar Oct 04 '21 14:10 timmeinhardt

Thank you very much. I have another question: what information does the track query contain? Feature information and location information, or only location information?

liu0623 avatar Oct 09 '21 01:10 liu0623

The track query is not designed to contain any specific information. But from our understanding of how DETR works, it most likely contains mostly class and location information.

timmeinhardt avatar Oct 09 '21 09:10 timmeinhardt

OK. In /src/trackformer/models/tracker.py, line 91:

    def add_tracks(self, pos, scores, hs_embeds, masks=None, attention_maps=None):
        """Initializes new Track objects and saves them."""
        new_track_ids = []
        for i in range(len(pos)):
            self.tracks.append(Track(
                pos[i],
                scores[i],
                self.track_num + i,
                hs_embeds[i],
                None if masks is None else masks[i],
                None if attention_maps is None else attention_maps[i],
            ))
            new_track_ids.append(self.track_num + i)
        self.track_num += len(new_track_ids)

What about the hs_embed? It seems to be the decoder output before it is classified by an FFN or anything else. I don't understand it.

liu0623 avatar Oct 11 '21 08:10 liu0623

Not sure what your question is. Yes, the hs_embed is the output of the decoder before it gets forwarded to the classification and bounding box prediction FFNs. This embedding can be used for some short-term re-identification. Due to the missing appearance information in the hs_embed embeddings, we were not able to get true long-term re-identification running.
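To make the idea of matching by embedding distance concrete, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the repository's code: `match_by_embedding` and `max_dist` are hypothetical names, the distance is plain Euclidean, and the pairing is greedy. The actual distance measure and assignment strategy in TrackFormer may differ.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_by_embedding(new_embeds, inactive_embeds, max_dist):
    """Greedily pair new detections with inactive tracks.

    new_embeds: list of embedding vectors, one per new detection.
    inactive_embeds: dict mapping track id -> stored embedding.
    Returns a dict mapping detection index -> re-identified track id.
    """
    # All (distance, detection, track) pairs, closest first.
    candidates = sorted(
        (euclidean(emb, track_emb), det_idx, track_id)
        for det_idx, emb in enumerate(new_embeds)
        for track_id, track_emb in inactive_embeds.items()
    )
    matches, used_tracks = {}, set()
    for dist, det_idx, track_id in candidates:
        if dist > max_dist:
            break  # every remaining pair is at least this far apart
        if det_idx in matches or track_id in used_tracks:
            continue  # each detection and each track is used at most once
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches

# Two inactive tracks and three new detections (made-up 2-D embeddings).
inactive = {7: [0.0, 1.0], 9: [1.0, 0.0]}
detections = [[0.9, 0.1], [5.0, 5.0], [0.1, 1.1]]
matches = match_by_embedding(detections, inactive, max_dist=0.5)
```

Detections left out of `matches` would start new tracks. If the embeddings carry little appearance information, as described above, distances like these only stay reliable while position and class barely change, which fits the short-term behaviour.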

timmeinhardt avatar Oct 11 '21 12:10 timmeinhardt

First of all, thank you for your patience. I am very interested in this project. I have two guesses: either hs_embed does not get appearance information from the DETR decoder, or hs_embed is a complex embedding that does contain appearance information. Which one is right?

I have three more questions:

  1. The inactive_patience is 5 by default. Have you tried increasing it?
  2. During inference, are the track queries and the object queries decoded at the same time?
  3. After the object queries are decoded, new detections need to be matched to inactive tracks by reID. However, each inactive track still keeps its track query to retrieve the track. Therefore, I would like to know why new detections need to be matched to inactive tracks by reID.

liu0623 avatar Oct 11 '21 13:10 liu0623

The hs_embed contains any information the decoder needs to solve the detection task, i.e., class and location information. We therefore assume that it contains little or no appearance information, which is supported by our reID experiments with the embeddings.

  1. Yes, we tried increasing it, without substantial gains in reID performance.
  2. Yes, they are concatenated before going into the decoder. Please check the paper and code to get a better understanding of the inner workings of TrackFormer.
  3. I am not sure if I understand your question. The code has two types of reID implemented. One matches via embedding distance. The other continues to forward inactive track queries through the decoder.
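The bookkeeping behind points 1 to 3 can be sketched roughly as follows. This is a simplified pure-Python illustration under stated assumptions, not the TrackFormer implementation: `step`, `score_thresh`, and the `inactive_for` counter are hypothetical names, the decoder is replaced by a stub that returns one score per query, and the exact patience rule in the repository may differ.

```python
def step(tracks, object_queries, decoder, score_thresh=0.5, inactive_patience=5):
    """Process one frame. `tracks` maps id -> {"query": ..., "inactive_for": int}."""
    track_ids = list(tracks)
    # Track queries (active and inactive) and object queries are
    # concatenated and go through the decoder together.
    queries = [tracks[t]["query"] for t in track_ids] + list(object_queries)
    scores = decoder(queries)

    for track_id, score in zip(track_ids, scores):
        track = tracks[track_id]
        if score >= score_thresh:
            # Kept, or re-identified if it had been inactive.
            track["inactive_for"] = 0
        else:
            # Not found in this frame: keep the query for a while.
            track["inactive_for"] += 1
            if track["inactive_for"] > inactive_patience:
                del tracks[track_id]  # give up on this identity
    # Scores of the object queries; starting new tracks is omitted here.
    return scores[len(track_ids):]

# Hypothetical scores for track 1: visible, occluded for two frames, visible again.
tracks = {1: {"query": "q1", "inactive_for": 0}}
history = []
for s in [0.9, 0.1, 0.2, 0.8]:
    step(tracks, ["obj_a", "obj_b"],
         decoder=lambda qs, s=s: [s] + [0.0] * (len(qs) - 1))
    history.append(tracks[1]["inactive_for"])
```

In this sketch the track query itself does the re-identification: the identity survives the two low-score frames and is reactivated once its query scores above the threshold again, without any embedding comparison.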

timmeinhardt avatar Oct 11 '21 14:10 timmeinhardt

For question 3: since inactive track queries can retrieve tracks via the decoder, why add embedding distance to match new detections with inactive tracks? And of the two reID methods, which one runs first?

liu0623 avatar Oct 12 '21 01:10 liu0623

The code has two reID methods, but we only use one of them at a time. For the final results of our paper, we only use the track query reID and not the embedding distance method.

timmeinhardt avatar Oct 24 '21 16:10 timmeinhardt

OK. When I tested it, I turned on both reID methods at the same time. The result shows more ID switches than using only the track query. So if both methods are used at the same time, will they conflict?

liu0623 avatar Oct 25 '21 04:10 liu0623

They might conflict. We did not test the combination of both methods.

timmeinhardt avatar Oct 25 '21 09:10 timmeinhardt