project_points() usage

Open David-Yan1 opened this issue 1 year ago • 5 comments

Hi! I was trying to use project_points() to get a depth map back from a modified point cloud. I noticed a black grid pattern on the output depth map, and I'm not sure whether I did something wrong or whether it's intended. Below are a depth prediction (predictions["depth"]) for an image and the depth map produced by project_points() from predictions["points"].

[Images: depth prediction (predictions["depth"]) and the project_points() depth map]

David-Yan1 avatar May 02 '24 07:05 David-Yan1

Are you resizing the pointmap? The black grid appears where no points project into those pixels, which could be caused by nearest-neighbor interpolation.
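To illustrate the kind of hole pattern described here, the following is a minimal one-dimensional sketch in plain Python. It is not UniDepth code and assumes a hypothetical setup: a point map with one point per low-resolution pixel center is upsampled 2x by nearest-neighbor duplication and then projected into the higher-resolution image. The helper names `upsample_nearest` and `project_columns` are illustrative only.

```python
def upsample_nearest(values, factor):
    """Nearest-neighbor upsampling: repeat each element `factor` times."""
    return [v for v in values for _ in range(factor)]


def project_columns(x_coords, width):
    """Count how many points land in each pixel column (truncating to an index)."""
    hits = [0] * width
    for x in x_coords:
        col = int(x)
        if 0 <= col < width:
            hits[col] += 1
    return hits


low_res_width, factor = 4, 2
# Image-plane x coordinate of each low-res pixel center, in high-res pixel units.
centers = [(u + 0.5) * factor for u in range(low_res_width)]
# Duplicating points adds no new positions: 8 points, only 4 distinct locations.
points = upsample_nearest(centers, factor)

hits = project_columns(points, low_res_width * factor)
empty_columns = [col for col, n in enumerate(hits) if n == 0]
print(hits)
print(empty_columns)
```

In this toy case every other column receives no point at all, which in two dimensions would show up as a regular grid of empty (black) pixels.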

lpiccinelli-eth avatar May 02 '24 08:05 lpiccinelli-eth

Ah, I believe I was using incorrect intrinsics in the original. However, the depth is still noisy. Here is a minimal script to reproduce:

from PIL import Image
import numpy as np
import torch
from unidepth.utils import colorize
from unidepth.models import UniDepthV1
from unidepth.utils.geometric import project_points


model = UniDepthV1.from_pretrained("lpiccinelli/unidepth-v1-vitl14")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

rgb = np.array(Image.open('assets/demo/rgb.png'))
rgb_torch = torch.from_numpy(rgb).permute(2, 0, 1)
predictions = model.infer(rgb_torch)
depth1 = predictions["depth"].squeeze().cpu().numpy()

depth_pred_col = colorize(depth1, vmin=0.01, vmax=10.0, cmap="magma_r")
Image.fromarray(depth_pred_col).save("original_depth.png")

H, W = rgb.shape[:2]
torch_intrinsic1 = predictions["intrinsics"]
# Flatten the point map: (B, 3, H, W) -> (B, 3, H*W) -> (B, H*W, 3)
pcd = predictions["points"].view(1, 3, -1)
pcd = pcd.transpose(1, 2)

# Re-project the 3D points into an (H, W) depth map
depth = project_points(pcd, torch_intrinsic1, (H, W))
depth_pred_col = colorize(depth.squeeze().cpu().numpy(), vmin=0.01, vmax=10.0, cmap="magma_r")
Image.fromarray(depth_pred_col).save("projected_depth.png")

David-Yan1 avatar May 02 '24 16:05 David-Yan1

Thank you for diving deeper! We have never tried this sanity check :sweat_smile:. There are several possible explanations:

  1. project_points may have a problem: some points are re-projected onto other pixels, creating that pepper noise. It may be related to rounding.
  2. generate_rays slightly shifts the points (hence a rounding-like effect).
  3. spherical_zbuffer_to_euclidean introduces some numerical error; we will try with float64.
  4. The interpolation used for depth creates the inconsistency. However, the quasi-random nature of the noise makes me lean toward the first or second option.

We will investigate further and try to solve, or at least explain, the source of the problem.
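One way to examine the first two hypotheses in isolation is a synthetic round trip with known inputs. The sketch below is plain Python, not UniDepth's implementation. It assumes a pinhole camera, a constant depth, rays through pixel centers at (u + 0.5, v + 0.5), and intrinsics chosen as powers of two so that the floating-point arithmetic is exact. `roundtrip_hit_counts` and `count_holes` are hypothetical helpers, and `quantize` is whatever function turns a projected coordinate into a pixel index.

```python
def roundtrip_hit_counts(height, width, fx, fy, cx, cy, depth, quantize):
    """Unproject every pixel center to 3D, project it back, and count hits per pixel."""
    hits = [[0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            # Unproject the pixel center (assumed convention: u + 0.5, v + 0.5).
            x = (u + 0.5 - cx) / fx * depth
            y = (v + 0.5 - cy) / fy * depth
            # Project back with the pinhole model, then quantize to a pixel index.
            u2 = quantize(fx * x / depth + cx)
            v2 = quantize(fy * y / depth + cy)
            if 0 <= u2 < width and 0 <= v2 < height:
                hits[v2][u2] += 1
    return hits


def count_holes(hits):
    """Number of pixels that received no point."""
    return sum(row.count(0) for row in hits)


# Powers of two keep every intermediate value exactly representable.
params = dict(height=8, width=8, fx=8.0, fy=8.0, cx=4.0, cy=4.0, depth=2.0)
holes_round = count_holes(roundtrip_hit_counts(quantize=round, **params))
holes_trunc = count_holes(roundtrip_hit_counts(quantize=int, **params))
print(holes_round, holes_trunc)
```

Under these assumptions, round-to-nearest leaves 48 of the 64 pixels empty, while truncation leaves none. With real intrinsics and lower precision, the projected coordinates would presumably fall slightly above or below the .5 tie, which would make the empty pixels look less regular.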

lpiccinelli-eth avatar May 02 '24 17:05 lpiccinelli-eth

Figured it out! I was rewriting my own projection script, and the noise appeared when I used round() instead of int(). So I believe:

    # To pixels (rounding!!!), no int as it breaks gradient
    points_2d = points_2d.round()

should be

    # To pixels (rounding!!!), no int as it breaks gradient
    points_2d = points_2d.int()

This fixes the noise as seen below (the comment seems to be wrong).

[Images: projected depth map after the fix]
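A plausible reason for the difference, assuming the projected coordinates land on or near pixel centers (u + 0.5): Python's built-in round(), like torch.round, resolves exact .5 ties to the nearest even integer, so neighboring pixel centers can collapse onto the same index. The sketch below uses plain Python floats rather than tensors and is only an illustration of the tie-breaking rule.

```python
# Pixel-center coordinates for a 6-pixel row
# (assumed convention: the center of pixel u is at u + 0.5).
centers = [u + 0.5 for u in range(6)]

rounded = [round(c) for c in centers]    # ties go to the nearest even integer
truncated = [int(c) for c in centers]    # truncation drops the fractional part

print(rounded)    # [0, 2, 2, 4, 4, 6]
print(truncated)  # [0, 1, 2, 3, 4, 5]

# Indices inside the row that no point maps to after rounding.
missing = sorted(set(range(6)) - set(rounded))
print(missing)    # [1, 3, 5]
```

Truncation maps each center back to its own pixel, which is consistent with the fix reported above. Note that truncation and floor only agree for non-negative coordinates.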

David-Yan1 avatar May 02 '24 19:05 David-Yan1

Thank you for your comment. Your suggestion has been included in PR #38, which also includes the V2 release.

lpiccinelli-eth avatar May 04 '24 14:05 lpiccinelli-eth