Jianyuan Wang

Results 238 comments of Jianyuan Wang

Hi @yaseryacoob, as discussed above, the recommendation of a 1x1 aspect ratio isn't necessarily a general rule and might be optimal only in specific scenarios. From my personal experiments,...

I think I see the problem. During training, all images are resized to a width of 518. So the model will predict the field of view based on the ratio between...

Hi @yaseryacoob, thanks for the detailed discussion; the example looks great! If you’re aiming for multi-view consistency while preserving high resolution, one possible solution is to use our predicted depth...

Hi @ChenYutongTHU, are you using the undistorted or the original version of ETH3D? The original version of ETH3D was captured by a non-pinhole camera and has noticeable distortion. You need to...

Hi, yes, the depth prediction is relative depth. The cameras are cam_from_world, i.e. world_to_cam. For aligning to the ground-truth scale, the github issue below contains the code we use for...
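To make the cam_from_world convention concrete, here is a minimal NumPy sketch. It assumes the extrinsic is given as a 3x3 rotation `R` and a translation `t`; the function names are hypothetical and are not taken from the codebase. It only illustrates the convention and does not cover the scale alignment mentioned above.

```python
import numpy as np

def world_to_camera(points_world, R, t):
    """Apply a cam_from_world extrinsic: X_cam = R @ X_world + t.

    points_world: (N, 3) array of points in world coordinates.
    R: (3, 3) rotation, t: (3,) translation (hypothetical layout).
    """
    return points_world @ R.T + t

def camera_center_in_world(R, t):
    """Camera center C satisfies R @ C + t = 0, so C = -R^T @ t."""
    return -R.T @ t
```

Under this convention, the camera center maps to the origin of the camera frame, which is a quick sanity check when you are unsure which direction a pose matrix goes.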

Hi, please check [here](https://github.com/facebookresearch/vggt/blob/c4b5da2d8592a33d52fb6c93af333ddf35b5bcb9/demo_gradio.py#L212), you can simply save the prediction dictionary as:

    with torch.no_grad():
        predictions = run_model(target_dir, model)

    # Save predictions
    prediction_save_path = os.path.join(target_dir, "predictions.npz")
    np.savez(prediction_save_path, **predictions)

    # Handle None...
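For reading such a file back, a small round-trip sketch may help. The helper names `save_predictions` and `load_predictions` are hypothetical, and dropping `None` entries before saving is an assumption made here so that the archive loads without pickling; it is not necessarily how the demo handles them.

```python
import numpy as np

def save_predictions(predictions, path):
    """Save a dict of array-like values to an .npz file.

    Entries whose value is None are skipped (an assumption of this sketch),
    because np.savez would otherwise store them as pickled object arrays.
    Returns the sorted list of keys that were written.
    """
    arrays = {k: np.asarray(v) for k, v in predictions.items() if v is not None}
    np.savez(path, **arrays)
    return sorted(arrays)

def load_predictions(path):
    """Load every array from an .npz file into a plain dict."""
    with np.load(path) as data:
        return {k: data[k] for k in data.files}
```

Usage would look like `save_predictions(predictions, prediction_save_path)` followed later by `load_predictions(prediction_save_path)`; note that tensors on a GPU would need to be moved to CPU NumPy arrays first.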

Yeah it is not prepared, though you could try to implement it based on our dirty code.

Hi @buaacyw @abidlabs I am making a demo which is related to this. May I ask if there is any update on this? What I am doing now is using...

Hi, for the depth prediction, it is okay to use relu, although we found exp generally works better. For the point cloud prediction, since a point can lie anywhere in (-inf, inf), relu...
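As a rough illustration of the two depth activations mentioned above, the sketch below applies relu and exp to raw network outputs. The function names are hypothetical and this is not the actual prediction head; it only shows how each choice maps a real-valued output to a depth value.

```python
import numpy as np

def depth_from_relu(raw):
    """relu: negative raw outputs are clamped to exactly zero depth."""
    return np.maximum(raw, 0.0)

def depth_from_exp(raw):
    """exp: every raw output maps to a strictly positive depth."""
    return np.exp(raw)
```

Both keep depth non-negative; the difference is that relu collapses all negative raw values to zero, while exp keeps them distinct and positive.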

Hi, I am not sure which codebase you are using, but ```align=True/False``` sounds like it controls whether alignment is applied to the predicted poses. Our predicted camera poses are in a normalised unit...