vggt
Scale Issues in Model Normalised Space
I would like to ask whether the scale of the normalised space predicted by the model is consistent across both the height of the camera pose and its translation. Specifically, can a single scale factor be applied to restore the entire normalised space to its real-world scale?
Could the camera pose height and the translation predicted by the model differ in scale? And if multiple cameras are predicted in a sequence, is the scale the same for every prediction?
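
To make the question concrete, here is a minimal sketch of what I mean by a single scale factor. It assumes the predicted space differs from the real world only by one uniform scale, and that one real-world distance (here, the baseline between two cameras) is known. The helper names `estimate_scale` and `rescale` and all numeric values are hypothetical illustrations, not part of the vggt API or real model output.

```python
import math

def estimate_scale(pred_a, pred_b, real_distance):
    """Ratio of a known real-world distance to the same distance in predicted units."""
    pred_distance = math.dist(pred_a, pred_b)
    if pred_distance == 0:
        raise ValueError("predicted points coincide; scale is undefined")
    return real_distance / pred_distance

def rescale(vectors, scale):
    """Multiply every 3D vector by the same scalar (rotations are unaffected by scale)."""
    return [[scale * c for c in v] for v in vectors]

# Hypothetical camera translations in normalised units (made-up values).
pred_translations = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.1, 0.0]]
# Hypothetical 3D points in the same normalised frame (made-up values).
pred_points = [[0.2, 0.3, 2.0], [0.4, -0.1, 1.5]]

# Assumption: cameras 0 and 1 are known to be 1.2 m apart in the real world.
scale = estimate_scale(pred_translations[0], pred_translations[1], 1.2)

# If the scale is globally consistent, the same factor applies to everything.
metric_translations = rescale(pred_translations, scale)
metric_points = rescale(pred_points, scale)
```

If the scale is consistent, computing `estimate_scale` from a different pair of cameras, or from a known camera height, should give approximately the same factor. A large difference between those estimates would be the discrepancy I am asking about.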