Question about binning for depth prediction (DINOv2 + DPT Head)
Hi,
Thanks for your work, it's truly remarkable. I noticed that while training/running inference using the DPT head, binning is not supported by default.
At https://github.com/facebookresearch/dinov2/blob/6a6261546c3357f2c243a60cfafa6607f84efcb7/dinov2/eval/depth/models/decode_heads/decode_head.py#L157, self.classify is set to False. With that default setting I do get reasonable results.
I want to get a probability distribution for depth over 256 bins. To do this, I set:
self.classify = True
self.n_bins = 256
With these settings, all my logits become 1, and I get a black image for the rendered depth.
Since logit has only 1 channel, the code at https://github.com/facebookresearch/dinov2/blob/6a6261546c3357f2c243a60cfafa6607f84efcb7/dinov2/eval/depth/models/decode_heads/decode_head.py#L170 divides every spatial value in logit by itself.
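To make the normalization issue concrete, here is a minimal sketch. It assumes the linked line divides logit by its sum over the channel dimension (`dim=1`); the helper name `normalize_over_channels` and the tensor shapes are made up for illustration and are not taken from the repository.

```python
import torch

torch.manual_seed(0)


def normalize_over_channels(logit: torch.Tensor) -> torch.Tensor:
    """Divide each pixel's values by their sum over the channel dimension."""
    return logit / logit.sum(dim=1, keepdim=True)


# Strictly positive values, so the channel sum is never zero.
one_channel = torch.rand(1, 1, 4, 4) + 0.1       # (batch, channels=1, H, W)
many_channels = torch.rand(1, 256, 4, 4) + 0.1   # (batch, channels=256, H, W)

probs_one = normalize_over_channels(one_channel)
probs_many = normalize_over_channels(many_channels)

# With a single channel, each value is divided by itself, so everything is 1.
print(torch.allclose(probs_one, torch.ones_like(probs_one)))

# With 256 channels, each pixel holds weights that sum to 1 over the bins.
print(torch.allclose(probs_many.sum(dim=1), torch.ones(1, 4, 4), atol=1e-5))
```

In this sketch the single-channel case carries no information after normalization, which would match the all-ones logits I am seeing.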
As a result, in the contraction (sum of products) step at https://github.com/facebookresearch/dinov2/blob/6a6261546c3357f2c243a60cfafa6607f84efcb7/dinov2/eval/depth/models/decode_heads/decode_head.py#L177, all values in output are the same.
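The contraction can be sketched in the same way. This assumes the step is an einsum that sums, for each pixel, the per-bin weight times the bin value, and that a size-1 channel dimension is broadcast against the 256 bins (as `torch.einsum` permits). The depth range, the helper name `expected_depth`, and the shapes are hypothetical, chosen only for illustration.

```python
import torch

torch.manual_seed(0)

n_bins = 256
# Hypothetical depth range, used only for this illustration.
bins = torch.linspace(0.001, 10.0, n_bins)


def expected_depth(weights: torch.Tensor, bins: torch.Tensor) -> torch.Tensor:
    """Per-pixel sum over bins of (weight * bin value); returns (batch, H, W)."""
    return torch.einsum("ikmn,k->imn", weights, bins)


# Degenerate case: one channel of ones, broadcast against all 256 bins.
degenerate = torch.ones(1, 1, 4, 4)
depth_flat = expected_depth(degenerate, bins)

# Proper case: a per-pixel probability distribution over all 256 bins.
probs = torch.softmax(torch.randn(1, n_bins, 4, 4), dim=1)
depth_varied = expected_depth(probs, bins)

# Every pixel of depth_flat is approximately bins.sum(), so the map is constant.
print(depth_flat.min().item(), depth_flat.max().item())

# depth_varied changes from pixel to pixel.
print(depth_varied.min().item(), depth_varied.max().item())
```

If this sketch reflects what the head does, a constant map is exactly what a single all-ones channel would produce, whereas a 256-channel distribution yields a depth value that varies per pixel.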
I am using the render_depth function from the depth estimation notebook (https://github.com/facebookresearch/dinov2/blob/main/notebooks/depth_estimation.ipynb) to visualize the depth maps.
Am I doing something wrong here, or are there other steps I need to take to fix this? Is it reasonable to expect a similar depth map when self.classify is True? Thank you!