joaanna
Also, when running the command, it seems like the model weights are not correctly loaded. I get `Missing keys in state-dict: ['encoder.resnet.1.num_batches_tracked', 'encoder.resnet.4.0.bn1.num_batches_tracked', 'encoder.resnet.4.0.bn2.num_batches_tracked', 'encoder.resnet.4.0.bn3.num_batches_tracked', 'encoder.resnet.4.0.downsample.1.num_batches_tracked', 'encoder.resnet.4.1.bn1.num_batches_tracked', 'encoder.resnet.4.1.bn2.num_batches_tracked',...`
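For context on the message above: `num_batches_tracked` is a step-counter buffer that BatchNorm layers gained in PyTorch 0.4.1, so checkpoints saved with older versions do not contain it, and it holds no learned weights. Here is a minimal sketch for checking whether anything besides these counters is missing. It assumes the missing keys are available as a list of strings, and the helper name `split_missing_keys` is hypothetical, not part of the repository.

```python
def split_missing_keys(missing_keys):
    """Separate BatchNorm step counters from all other missing keys."""
    suffix = "num_batches_tracked"
    counters = [k for k in missing_keys if k.endswith(suffix)]
    others = [k for k in missing_keys if not k.endswith(suffix)]
    return counters, others


# Sample keys copied from the message above (the full list is truncated there).
missing = [
    "encoder.resnet.1.num_batches_tracked",
    "encoder.resnet.4.0.bn1.num_batches_tracked",
    "encoder.resnet.4.0.downsample.1.num_batches_tracked",
]
counters, others = split_missing_keys(missing)
print(len(counters), "counter keys;", len(others), "other keys")
```

If `others` comes back empty for the full list, only the counters are absent. Whether that is the case here cannot be told from the truncated message.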
Hi, we sample frames using `frames_list` and coordinates using `coor_frame_list`, i.e. the appearance models (I3D or STRG) will use frames sampled from `frames_list` and the STIN networks will use bbox...
Hi Rui, one thing that might be causing the lower performance is that you are setting `num_frames=4`. We trained our coordinate models on 8 frames, so try that.
Hi Rui, thank you for the bug report! Feel free to send a pull request.
Hey, great work! I am also trying to understand your code better. In Video Bert, and also in the parameters used here, you take 4 HIERARCHIES and 12 clusters. The paper...
That makes sense, thank you! Another question: I am able to run the clustering with this command: `python3 -m hkmeans_minibatch -r features -p ft_hp -b 40 -s vecs_dir -c centroid_dir...`