
Question About Image Size in Camera File After Using vggt_to_colmap

Open engrmusawarali71 opened this issue 7 months ago • 12 comments

Hello, thank you for your outstanding work!

I'm currently working on NeRF using estimated poses, which I obtained using the vggt_to_colmap script. However, I noticed that the image size recorded in the camera file differs from the original image resolution.

Please provide some insights into why this is happening.

engrmusawarali71 avatar May 05 '25 14:05 engrmusawarali71

Hi,

I think this is because the script resizes the images to a width of 518 pixels. You need to scale the image size and focal length back to the original resolution to recover the intrinsics at the original resolution.
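For a pure resize (no crop), the mapping back is just a per-axis scaling. A minimal sketch, assuming the camera file stores fx, fy, cx, cy for the resized images:

```python
# Sketch: map intrinsics estimated on resized images back to the
# original resolution. Assumes a plain resize with no cropping.
def rescale_intrinsics(fx, fy, cx, cy, w_resized, h_resized, w_orig, h_orig):
    sx = w_orig / w_resized   # horizontal scale factor
    sy = h_orig / h_resized   # vertical scale factor
    return fx * sx, fy * sy, cx * sx, cy * sy
```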

jytime avatar May 05 '25 22:05 jytime

Hi,

Thank you very much for your kind response. I performed camera calibration using a chessboard pattern and obtained intrinsics with OpenCV. I was wondering — should I use these intrinsics for training NeRF, or the ones you provided? I'd really appreciate it if you could share your intuition or recommendation on this.

engrmusawarali71 avatar May 06 '25 10:05 engrmusawarali71

I noticed that in load_and_preprocess_images, the images are cropped to 518x518. As a result, when training NeRF or 3D Gaussian models, you cannot directly use the original images as reference.
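So to map the 518x518 intrinsics back to an original frame, the crop would also have to be undone before any rescaling. A rough sketch (crop_left and crop_top are hypothetical offsets that would have to be read out of load_and_preprocess_images):

```python
# Sketch: undo a crop before rescaling intrinsics. A crop shifts only
# the principal point; the focal lengths are unchanged.
# crop_left / crop_top are hypothetical names for the preprocessing
# crop offsets.
def uncrop_principal_point(cx, cy, crop_left, crop_top):
    return cx + crop_left, cy + crop_top
```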

ZCWzy avatar May 10 '25 12:05 ZCWzy

@ZCWzy Thanks for your reply! I was referring to the intrinsic parameters—since the model provides different intrinsics for each image, I’m unsure how to train Nerfacto, which expects a single intrinsic matrix. That’s why I’m leaning toward using the intrinsics estimated by OpenCV, as they’re consistent across images. Just wanted to double-check if you think that’s a good approach, or if there’s a better way to handle this with Nerfacto.

@jytime kindly provide your intuition on this.

engrmusawarali71 avatar May 11 '25 10:05 engrmusawarali71

I will write up instructions on how to use nerfstudio on the outputs.

jytime avatar May 11 '25 22:05 jytime

Dear @jytime, thank you so much for your kindness. But is it okay if I keep using the intrinsics estimated with a ChArUco board?

engrmusawarali71 avatar May 12 '25 10:05 engrmusawarali71

Dear @jytime, thank you so much. Could you also provide a tutorial on how to train Gaussian Splats using Nerfstudio?

engrmusawarali71 avatar May 12 '25 11:05 engrmusawarali71

Hi all,

I just uploaded a new file: https://github.com/facebookresearch/vggt/blob/main/demo_colmap.py

The saved output is a standard COLMAP sparse reconstruction and can be fed directly into nerfstudio for NeRF/Gaussian training. For example:

python demo_colmap.py --scene_dir=/YOUR/SCENE_DIR/ (you can specify use_ba or not)
cd gsplat
python examples/simple_trainer.py default --data_factor 1 --data_dir /YOUR/SCENE_DIR/ --result_dir /YOUR/RESULT_DIR/

jytime avatar May 22 '25 22:05 jytime

thank u very much! :D

ZCWzy avatar May 23 '25 06:05 ZCWzy

@jytime Thanks for demo_colmap.py. I ran it and used its output directly for 3DGS, and that worked. But when I tried to render in 4DGS, it failed.

  1. Before training the 4DGS model, I processed the poses as follows: ① w2c --> c2w; ② OpenCV coordinates --> OpenGL coordinates, via poses = np.concatenate([poses[:, 1:2, :], poses[:, 0:1, :], -poses[:, 2:3, :], poses[:, 3:4, :], poses[:, 4:5, :]], 1) (see the pose-conversion sketch below); ③ merge image height, width, and focal length with depth near / depth far.
  2. I compared the COLMAP poses predicted by VGGT against the GT poses provided by DyNeRF (the 4DGS benchmark), and found that the rotations are very similar but the translations are very different. Is this still a scale problem?

Image

  3. I also compared the point cloud files and found that VGGT's point cloud is much more dispersed. Which parameters can I adjust to tighten it? (See the filtering sketch below.)

Image
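For reference, here is roughly what I mean by steps ① and ②, as a generic sketch (assuming a standard OpenCV world-to-camera [R|t]; the exact row-swapping concatenate above depends on how the pose rows are laid out):

```python
import numpy as np

def opencv_w2c_to_opengl_c2w(R, t):
    """Invert a world-to-camera pose and flip OpenCV camera axes to OpenGL."""
    c2w = np.eye(4)
    c2w[:3, :3] = R.T        # w2c -> c2w rotation
    c2w[:3, 3] = -R.T @ t    # w2c -> c2w translation
    # OpenCV camera axes: x right, y down, z forward.
    # OpenGL camera axes: x right, y up,   z backward.
    c2w[:3, 1:3] *= -1       # flip the y and z camera axes
    return c2w
```

(My understanding is that VGGT poses are only defined up to a global scale, so translations would only match GT after a similarity (Sim(3)) alignment, e.g. Umeyama on the camera centers; is that what is going on here?)

And for question 3, the kind of adjustment I mean would be thresholding points by confidence. A rough sketch, where points and conf are hypothetical (N, 3) and (N,) numpy arrays, since the actual names depend on how demo_colmap.py exposes them:

```python
# Sketch: drop low-confidence points before exporting the sparse cloud.
def filter_by_confidence(points, conf, threshold=0.5):
    mask = conf > threshold   # keep points the model is confident about
    return points[mask]
```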

Looking forward to your reply, thank you!

kk6398 avatar Jun 02 '25 03:06 kk6398


Is it possible to directly output point clouds and use them for 3DGS? From the rendered result, it looks like the point cloud still needs a rotation, translation, and scaling, as shown in the picture. Can you tell me how to convert it? (A rough sketch of what I mean is below.)

Image
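To be concrete, the transform I mean would look something like this (a sketch; s, R, t are a hypothetical scale, rotation, and translation that would still have to be estimated, e.g. by Umeyama alignment on corresponding camera centers):

```python
# Sketch: apply a similarity transform (scale, rotation, translation)
# to an (N, 3) numpy point cloud. s (scalar), R (3x3), t (3,) are
# assumed to come from some alignment step; they are not produced by
# demo_colmap.py itself.
def apply_sim3(points, s, R, t):
    return s * (points @ R.T) + t
```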

njfan avatar Aug 27 '25 07:08 njfan

Doesn't 3DGS training require undistorted images? Are the images we get here already undistorted?

kuaiqushangzixiba avatar Sep 10 '25 10:09 kuaiqushangzixiba