TransTrack
Questions about implementation details
As introduced in the paper, TransTrack takes composite features (from different frames) as input to the transformer during inference. But I find that during training TransTrack takes features from the same frame (it only supports static-image training). Moreover, when training the current-frame decoder, it doesn't take the previous-frame features; it combines the same frame's features with themselves. Am I right?
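For context, here is a minimal sketch of the difference I mean, assuming for illustration that the two frames' features are simply concatenated into the composite input; the shapes and names below are mine, not the repository's:

```python
import torch

d_model, hw = 256, 100  # toy sizes

# Inference: the composite feature comes from two different frames.
feat_t      = torch.randn(1, hw, d_model)  # current frame t
feat_t_prev = torch.randn(1, hw, d_model)  # previous frame t-1
composite_infer = torch.cat([feat_t, feat_t_prev], dim=1)

# Static-image training: the "previous" features are the same frame again.
composite_train = torch.cat([feat_t, feat_t.clone()], dim=1)
```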
It seems that self.decoder_track is trained when torch.randn(1).item() > 0.0, and self.decoder is trained otherwise (see the sketch after this list). There are two questions:
- When we train self.decoder, the two parts of the composite feature map are identical, which differs from the testing stage.
- When we train self.decoder_track, we match the outputs of the modified images against the unchanged annotations to get the matching indices.
I find this hard to understand. Am I missing something important?
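Here is a toy reproduction of that branching, to make the two bullets concrete; the decoders below are placeholder modules I made up, not TransTrack's real ones:

```python
import torch
import torch.nn as nn

# Placeholder stand-ins for the two decoders discussed above.
decoder = nn.Linear(256, 256)        # detection decoder
decoder_track = nn.Linear(256, 256)  # track decoder

feat = torch.randn(2, 100, 256)      # current-frame features (toy shapes)
pre_feat = feat.clone()              # static-image training: a copy, not frame t-1

# torch.randn(1) samples a standard normal, so the condition holds with
# probability 0.5: each training step updates only one of the decoders.
if torch.randn(1).item() > 0.0:
    out = decoder_track(pre_feat)    # matched against the unchanged annotations
else:
    out = decoder(feat)              # feat and pre_feat are identical here
```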
Hi~
- We tried training self.decoder with two features from different images; the result is similar (even a little worse). A toy sketch of that variant follows below.
- We are now verifying whether changing the annotations accordingly helps. Once we have the result, we will update it here.
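A toy sketch of that variant, with features drawn from two genuinely different images; every name here is a placeholder, not the repository's API:

```python
import torch
import torch.nn as nn

backbone = nn.Linear(3, 256)      # placeholder feature extractor
decoder = nn.Linear(512, 256)     # placeholder detection decoder

cur_img = torch.randn(1, 100, 3)  # frame t (toy shapes)
pre_img = torch.randn(1, 100, 3)  # frame t-1: a different image, not a copy

feat, pre_feat = backbone(cur_img), backbone(pre_img)
# The composite input now mirrors the inference-time setting.
out = decoder(torch.cat([feat, pre_feat], dim=-1))
```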
Thank you for your reply~
- About training self.decoder with different features, I got the same result (a little worse). But I still wonder why that training strategy, which matches the testing stage, doesn't bring an improvement.
- Looking forward to your update.