packnet-sfm
How to interpret the reconstructed ray?
https://github.com/TRI-ML/packnet-sfm/blob/2698f1fb27785275ef847f3dbbd550cf8fff1799/packnet_sfm/geometry/camera.py#L132-L138
How should I interpret the output of the reconstruct function, which lifts the depth map into 3D using the inverse intrinsic matrix? I see that it outputs a ray of size [Bx3xwxh]. I am thinking that these are X,Y,Z coordinates, and I see that Z is the same as the depth map, as it is not affected by the matrix multiplication with K^{-1}. But why does the camera intrinsic matrix have an effect only on X,Y and not on Z? I find it difficult to interpret this output. It would be great if anyone could give some insight. Thanks.
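You are right, the output is Bx3xHxW, containing 3D coordinates for each pixel. Multiplying the homogeneous pixel coordinates (u, v, 1) by K^{-1} gives a viewing ray whose Z component is still 1, because the last row of a standard pinhole K (and of its inverse) is [0, 0, 1]; only X and Y depend on the focal lengths and principal point. The depth map then scales each ray so that the Z coordinate is equal to the depth value, and in doing so it also scales the X and Y coordinates (essentially it selects the point along the ray that corresponds to that depth value).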
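To make the scaling concrete, here is a minimal single-pixel sketch in plain Python. It is not the repository's implementation: the helper names `lift_pixel` and `project_point` and the intrinsics values are made up for illustration, and it assumes a pinhole K with no skew, so that K^{-1} can be written out by hand.

```python
def lift_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a given depth into the camera frame.

    Hypothetical helper: assumes K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    """
    # K^{-1} @ [u, v, 1]^T written out by hand: a viewing ray with Z = 1.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Scaling the ray by depth sets Z = depth and scales X and Y with it.
    return tuple(depth * r for r in ray)


def project_point(X, Y, Z, fx, fy, cx, cy):
    """Project a camera-frame 3D point back to pixel coordinates (K @ P, then divide by Z)."""
    return (fx * X / Z + cx, fy * Y / Z + cy)


# Illustrative intrinsics, not taken from any real dataset.
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

point = lift_pixel(420.0, 140.0, 2.0, fx, fy, cx, cy)
pixel = project_point(*point, fx, fy, cx, cy)

print(point)  # Z component equals the input depth (2.0)
print(pixel)  # approximately the original pixel (420, 140)
```

Under these assumptions, X and Y are metric offsets from the optical axis at the given depth, and projecting the lifted point with K lands back on the pixel it came from, so entry (h, w) of the output still belongs to pixel (h, w) of the image.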
But since the grid is being scaled, are the X,Y coordinates of the output (after depth scaling) the mapping functions that describe how the image grid changes as a result of the camera intrinsics? If yes, I could calculate an inverse map and use it to map the depth values onto the input image. Sorry if I couldn't explain it properly. In the end, I don't want to shift the X,Y of my image to match the correct depth; I want to shift the depth map to match my image's X,Y.