Yes, this trick can also be used for training if you call the relevant functions. I don't see what would prevent you from doing that. Regarding memory, we haven't done...
Closing this due to inactivity. The API has changed, which hopefully makes it easier to avoid OOMs, but cost volumes are also fundamentally large, so we can't avoid all OOMs.
It's a bit difficult to define "epochs" since we sample different points on every step. Our internal dataset is more like 100K videos, and the batch size is 8 per...
Yes, this is intentional. In the original TAP-Vid, prediction at the query frame is trivial, since the model can simply output the query point. However, for most TAPVid-3D metrics,...
Yes, this is the same as with the original TAP-Vid. For methods like CoTracker it should be straightforward to reverse the direction of the video to track backward in time.
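In case it helps, here's a rough sketch of what I mean by reversing the direction. The `model_fn` wrapper, the array shapes, and the (t, y, x) query layout are assumptions about a generic forward-only tracker, not any particular API:

```python
def track_backward(model_fn, video, query_points):
  """Track backward in time by running a forward-only tracker on a reversed video.

  Assumptions (not part of any official API): `model_fn(video, query_points)`
  takes video [num_frames, height, width, 3] and query_points [num_queries, 3]
  with rows (t, y, x), and returns tracks [num_queries, num_frames, 2].
  """
  num_frames = video.shape[0]
  # Reverse the frame order so "backward" becomes "forward".
  reversed_video = video[::-1]
  # Remap each query's frame index into the reversed timeline.
  reversed_queries = query_points.copy()
  reversed_queries[:, 0] = (num_frames - 1) - query_points[:, 0]
  # Run the forward-only tracker on the reversed video...
  tracks = model_fn(reversed_video, reversed_queries)
  # ...and flip the predicted tracks back to the original time order.
  return tracks[:, ::-1]
```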
Yes, what's stored in the pickle file is normalized. As usual for image coordinates, (0, 0) is the upper-left corner. The reader re-normalizes these to actual pixel coordinates, i.e., multiplies by...
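For illustration, the conversion amounts to something like the sketch below. The (x, y) ordering and multiplying by the frame width/height are assumptions based on the description above, not a quote of the reader code:

```python
import numpy as np

def denormalize_points(points, height, width):
  """Convert normalized points to pixel coordinates.

  Assumes `points` has shape [..., 2] in (x, y) order, with both coordinates
  in [0, 1] and (0, 0) at the upper-left corner of the image.
  """
  return points * np.array([width, height])

# Example: a point at the center of a 480x256 (width x height) frame.
print(denormalize_points(np.array([0.5, 0.5]), height=256, width=480))
```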
Sorry, not sure I'm following. What hardware are you using? What script are you using to load the data? What loading speed are you seeing, and what speed would you expect?
I need some more context in order to help you. Can you provide the script you're using? Have you tried plotting the query points on the query frame?
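Something along these lines is usually enough to sanity-check the queries (assuming frames of shape [num_frames, height, width, 3] and query points as (t, y, x) in pixel coordinates; adjust if your data uses a different layout):

```python
import matplotlib.pyplot as plt

def show_query_points(video, query_points):
  """Scatter-plot each query point on top of its query frame."""
  for t, y, x in query_points:
    plt.imshow(video[int(t)])
    plt.scatter([x], [y], c='red', s=40, marker='x')
    plt.title(f'query frame {int(t)}')
    plt.show()
```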
Sorry, we haven't released much in the way of training loss curves because they're difficult to maintain. BootsTAPIR took about 2 weeks to train on YouTube data on 256 A100...
This is a legacy parameter used to compute TAP-Net losses. If you want to get query features from TAPIR, you should directly call the `get_query_features()` method at https://github.com/google-deepmind/tapnet/blob/main/tapnet/torch/tapir_model.py#L216 . This...
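Roughly something like the following (an untested sketch: the constructor arguments, tensor layouts, and keyword names are assumptions, so check the linked tapir_model.py for the exact signature):

```python
import torch
from tapnet.torch import tapir_model

# NOTE: constructor arguments, shapes, and keyword names below are assumptions;
# see the linked tapir_model.py for the current interface.
model = tapir_model.TAPIR()
model.eval()

# Assumed layouts: video [batch, frames, height, width, 3] scaled to [-1, 1],
# query_points [batch, num_queries, 3] with rows (t, y, x).
video = torch.zeros(1, 8, 256, 256, 3)
query_points = torch.zeros(1, 4, 3)

with torch.no_grad():
  # Build the feature grids, then extract features at the query locations.
  feature_grids = model.get_feature_grids(video, is_training=False)
  query_features = model.get_query_features(
      video,
      is_training=False,
      query_points=query_points,
      feature_grids=feature_grids,
  )
```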