G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
InternRobotics
Visual Spatial Tuning
Yangr116