Jianyuan Wang
It seems the API used in https://github.com/VladimirYugay/vggt_inference/blob/main/vggt_eval_scannet.py already accounts for this. It comes from https://github.com/MichaelGrupp/evo/blob/86f52ade6da8cc4749c6170b1d2771ea1e0f1c66/evo/main_ape.py#L42:
```
def ape(traj_ref: PosePath3D, traj_est: PosePath3D, pose_relation: metrics.PoseRelation, align: bool = False, correct_scale:...
```
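For intuition, here is a minimal sketch of what evo's `align=True, correct_scale=True` option does under the hood: a similarity (Sim(3)) alignment of the estimated trajectory onto the reference via the Umeyama method, followed by the APE RMSE on positions. This is an illustrative NumPy reimplementation, not evo's actual code, and the function names are my own.

```python
import numpy as np

def umeyama_align(src, dst):
    """Find s, R, t minimizing ||dst - (s * R @ src + t)||^2 (Umeyama, 1991).

    src, dst: (N, 3) arrays of corresponding 3D points (one per row).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)          # cross-covariance of the pairs
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                          # avoid a reflection
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

def ape_rmse(ref, est):
    """RMSE of absolute position error after Sim(3)-aligning est to ref."""
    s, R, t = umeyama_align(est, ref)
    est_aligned = s * est @ R.T + t           # row-vector form of s*R@x + t
    return np.sqrt(((ref - est_aligned) ** 2).sum(axis=1).mean())
```

Because VGGT's output lives in a normalized coordinate space, the `correct_scale` (i.e. the `s` factor above) part is essential when comparing against a metric ground-truth trajectory.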
Hi @y6216886 , We did not implement this ourselves, but you can (1) run VGGT over different image subsets and then align all of them, or (2) distribute the inference over...
Hi @bidbest , The predictions from VGGT exist within normalized coordinate spaces, but each subset resides in its own separate normalized space. Therefore, while you can assume these spaces are...
Hi, yes, I think it is also possible to use the camera poses to find the scale, although I have not tried it myself. If we can assume different scenes...
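A minimal sketch of this idea, assuming two VGGT runs share some frames: since each subset lives in its own normalized space related by a similarity transform, the ratio of distances between the shared camera centers gives the relative scale. The function name and setup below are illustrative, not part of the VGGT codebase.

```python
import numpy as np
from itertools import combinations

def relative_scale(centers_a, centers_b):
    """Estimate scale s such that subset B's coordinates ~ s * subset A's.

    centers_a, centers_b: (N, 3) camera centers of the SAME frames as
    reconstructed in subset A and subset B, respectively.
    """
    ratios = []
    for i, j in combinations(range(len(centers_a)), 2):
        da = np.linalg.norm(centers_a[i] - centers_a[j])
        db = np.linalg.norm(centers_b[i] - centers_b[j])
        if da > 1e-9:                     # skip (near-)coincident cameras
            ratios.append(db / da)
    return float(np.median(ratios))       # median is robust to outlier pairs
```

Pairwise distances are invariant to the unknown rotation and translation between the two normalized spaces, so only the scale remains; a full Sim(3) alignment would additionally recover R and t.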
Hi, we haven't tried it ourselves.
Hi, Starting from PyTorch 2.2, its built-in `scaled_dot_product_attention` function has integrated support for FlashAttention v2, so there’s no need to install FlashAttention v2 manually. If you’re looking for faster inference,...
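Numerically, the fused `scaled_dot_product_attention` computes softmax(QKᵀ/√d)V; FlashAttention v2 changes how (and how fast) that is evaluated, not the result. Below is an unfused NumPy reference of the default case (no mask, no dropout), purely for illustration:

```python
import numpy as np

def sdpa_reference(q, k, v):
    """Unfused reference for softmax(q @ k^T / sqrt(d)) @ v.

    q: (..., Lq, d), k: (..., Lk, d), v: (..., Lk, dv).
    The fused kernel returns the same values without ever materializing
    the full (Lq, Lk) attention matrix.
    """
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)   # (..., Lq, Lk)
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v                                # (..., Lq, dv)
```

In PyTorch itself you would simply call `torch.nn.functional.scaled_dot_product_attention(q, k, v)` and let the dispatcher pick the fastest available backend.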
Hi @ZhangGongjie , Thanks for the interest! Yes we are testing this feature, and should be included in our next version. Basically it requires to finetune a model to accommodate...
It was trained on 64 GPUs for 1-2 days, so roughly 1.5k-3k GPU hours (64 GPUs × 24 h ≈ 1.5k per day).
Hi, If you only want intrinsic and extrinsic parameters, you can just do:
```
with torch.no_grad():
    with torch.cuda.amp.autocast(dtype=dtype):
        images = images[None]  # add batch dimension
        aggregated_tokens_list, ps_idx = model.aggregator(images)
        # ...
```
Hi, The vggt_to_colmap.py script was merged from a pull request, and I personally did not check its memory usage. I plan to refactor it in the near future (hope near...