Jianyuan Wang
It seems the API used in https://github.com/VladimirYugay/vggt_inference/blob/main/vggt_eval_scannet.py already accounts for this. It comes from https://github.com/MichaelGrupp/evo/blob/86f52ade6da8cc4749c6170b1d2771ea1e0f1c66/evo/main_ape.py#L42:
```
def ape(traj_ref: PosePath3D, traj_est: PosePath3D, pose_relation: metrics.PoseRelation, align: bool = False, correct_scale:...
```
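For intuition, here is a minimal sketch of what evo's `align=True, correct_scale=True` option does under the hood: a similarity (Sim(3)) alignment of the estimated trajectory onto the reference via the Umeyama method, followed by the APE RMSE on positions. This is an illustrative NumPy reimplementation, not evo's actual code, and the function names are my own.

```python
import numpy as np

def umeyama_align(src, dst):
    """Find s, R, t minimizing ||dst - (s * R @ src + t)||^2 (Umeyama, 1991).

    src, dst: (N, 3) arrays of corresponding 3D points (one per row).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)          # cross-covariance of the pairs
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                          # avoid a reflection
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

def ape_rmse(ref, est):
    """RMSE of absolute position error after Sim(3)-aligning est to ref."""
    s, R, t = umeyama_align(est, ref)
    est_aligned = s * est @ R.T + t           # row-vector form of s*R@x + t
    return np.sqrt(((ref - est_aligned) ** 2).sum(axis=1).mean())
```

Because VGGT's output lives in a normalized coordinate space, the `correct_scale` (i.e. the `s` factor above) part is essential when comparing against a metric ground-truth trajectory.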
Hi @y6216886 , We did not implement this ourselves, but you can (1) run VGGT over different image subsets and then align all of them, or (2) distribute the inference over...
Hi @bidbest , The predictions from VGGT exist within normalized coordinate spaces, but each subset resides in its own separate normalized space. Therefore, while you can assume these spaces are...
Hi, yes, I think it is also possible to use the camera poses to find the scale, although I have not tried it myself. If we can assume different scenes...
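A minimal sketch of this idea, assuming two VGGT runs share some frames: since each subset lives in its own normalized space related by a similarity transform, the ratio of distances between the shared camera centers gives the relative scale. The function name and setup below are illustrative, not part of the VGGT codebase.

```python
import numpy as np
from itertools import combinations

def relative_scale(centers_a, centers_b):
    """Estimate scale s such that subset B's coordinates ~ s * subset A's.

    centers_a, centers_b: (N, 3) camera centers of the SAME frames as
    reconstructed in subset A and subset B, respectively.
    """
    ratios = []
    for i, j in combinations(range(len(centers_a)), 2):
        da = np.linalg.norm(centers_a[i] - centers_a[j])
        db = np.linalg.norm(centers_b[i] - centers_b[j])
        if da > 1e-9:                     # skip (near-)coincident cameras
            ratios.append(db / da)
    return float(np.median(ratios))       # median is robust to outlier pairs
```

Pairwise distances are invariant to the unknown rotation and translation between the two normalized spaces, so only the scale remains; a full Sim(3) alignment would additionally recover R and t.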
Hi, we haven't tried it ourselves.
Hi, Starting from PyTorch 2.2, its built-in `scaled_dot_product_attention` function has integrated support for FlashAttention v2, so there’s no need to install FlashAttention v2 manually. If you’re looking for faster inference,...
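Numerically, the fused `scaled_dot_product_attention` computes softmax(QKᵀ/√d)V; FlashAttention v2 changes how (and how fast) that is evaluated, not the result. Below is an unfused NumPy reference of the default case (no mask, no dropout), purely for illustration:

```python
import numpy as np

def sdpa_reference(q, k, v):
    """Unfused reference for softmax(q @ k^T / sqrt(d)) @ v.

    q: (..., Lq, d), k: (..., Lk, d), v: (..., Lk, dv).
    The fused kernel returns the same values without ever materializing
    the full (Lq, Lk) attention matrix.
    """
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)   # (..., Lq, Lk)
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v                                # (..., Lq, dv)
```

In PyTorch itself you would simply call `torch.nn.functional.scaled_dot_product_attention(q, k, v)` and let the dispatcher pick the fastest available backend.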
Hi @ZhangGongjie , Thanks for the interest! Yes we are testing this feature, and should be included in our next version. Basically it requires to finetune a model to accommodate...
It was trained on 64 GPUs for 1-2 days, so roughly 1.5k-3k GPU hours (64 GPUs × 24 h ≈ 1.5k per day).
Hi, If you only want intrinsic and extrinsic parameters, you can just do:
```
with torch.no_grad():
    with torch.cuda.amp.autocast(dtype=dtype):
        images = images[None]  # add batch dimension
        aggregated_tokens_list, ps_idx = model.aggregator(images)
        # ...
```
Hi, The vggt_to_colmap.py script was merged from a pull request, and I personally did not check its memory usage. I plan to refactor it in the near future (hope near...