MiDaS

[question] Any suggestions on normalizing the outputs better?

Open wes-kay opened this issue 1 year ago • 9 comments

Original: [image]

Output: [image]

This is an image I took of a porthole, converted to a .ply using the .pfm output (normalized before converting), and viewed in Blender.

The issue I'm having is that, realistically, the edges should be flatter rather than stretching toward infinity as they currently do. Are there any tricks to make the .pfm a little easier to work with?

Any suggestions are appreciated.

wes-kay avatar Nov 15 '23 10:11 wes-kay

Converting disp to depth with a min and max depth using this code may be helpful:

```python
def disp_to_depth(disp, min_depth, max_depth):
    """Convert network's sigmoid output into depth prediction.

    The formula for this conversion is given in the 'additional considerations'
    section of the paper.
    """
    min_disp = 1 / max_depth
    max_disp = 1 / min_depth
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1 / scaled_disp
    return scaled_disp, depth
```
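For example, assuming `disp` is a sigmoid output in [0, 1] (this helper appears to originate in Monodepth2, whose decoders end in a sigmoid), `disp = 0` maps to `max_depth` and `disp = 1` maps to `min_depth`. A small self-contained sketch with hypothetical depth bounds:

```python
import numpy as np

def disp_to_depth(disp, min_depth, max_depth):
    """Map sigmoid-style disparities in [0, 1] to depths in [min_depth, max_depth]."""
    min_disp = 1 / max_depth
    max_disp = 1 / min_depth
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1 / scaled_disp
    return scaled_disp, depth

# Hypothetical disparities; min_depth/max_depth values are illustrative only
disp = np.array([0.0, 0.5, 1.0])
scaled_disp, depth = disp_to_depth(disp, min_depth=0.1, max_depth=100.0)
# disp == 0 -> depth == max_depth (100.0); disp == 1 -> depth == min_depth (0.1)
```

Note that this only makes sense for outputs already in [0, 1]; it does not directly apply to raw MiDaS outputs.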

northagain avatar Nov 18 '23 03:11 northagain

> Converting disp to depth with a min and max depth using this code may be helpful:
>
> ```python
> def disp_to_depth(disp, min_depth, max_depth):
>     """Convert network's sigmoid output into depth prediction.
>
>     The formula for this conversion is given in the 'additional considerations'
>     section of the paper.
>     """
>     min_disp = 1 / max_depth
>     max_disp = 1 / min_depth
>     scaled_disp = min_disp + (max_disp - min_disp) * disp
>     depth = 1 / scaled_disp
>     return scaled_disp, depth
> ```

There is no 'additional considerations' section in the paper I downloaded from the IEEE website. Can you point me to the right version? Thank you!

Also, I'm confused by the 'sigmoid output' mentioned in your code. I can't find a sigmoid layer in the dpt_beit_large_512 network, and the output of my network is between 900 and 10000. In that case, how can I transform the output to depth?

Thank you very much!

isJHan avatar Jan 05 '24 15:01 isJHan

I have the same question. The inverse-depth output is between 900 and 10000. Has this been resolved?

thucz avatar Jan 06 '24 08:01 thucz

> I have the same question. The inverse-depth output is between 900 and 10000. Has this been resolved?

Hi. I convert the output to a depth map this way: first I invert the output directly with `depth = 1/output`, then apply min-max normalization, `depth = (depth - depth.min()) / (depth.max() - depth.min())`, to get a valid depth map.
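As a sketch, that recipe looks like the following (the sample values are hypothetical, standing in for the 900–10000 range mentioned above; the result is relative depth in [0, 1], not metric depth):

```python
import numpy as np

# Hypothetical MiDaS-style inverse-depth output
output = np.array([900.0, 2500.0, 10000.0])

# Invert to get (relative) depth, then min-max normalize to [0, 1]
depth = 1.0 / output
depth = (depth - depth.min()) / (depth.max() - depth.min())
# The largest inverse-depth value maps to 0 (nearest), the smallest to 1 (farthest)
```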

isJHan avatar Jan 06 '24 09:01 isJHan

@isJHan Thanks!

thucz avatar Jan 06 '24 09:01 thucz

> @isJHan Thanks!

Today I found a bug in this procedure: when I run inference on another dataset, the output can be negative or zero, so the inversion fails. A bias then has to be added, e.g. `depth = 1/(output + bias)`. Do you have a better way? Thanks!

isJHan avatar Jan 08 '24 04:01 isJHan

Now I also apply a bias and a scale to the output, like this (though not normalizing to [0, 1]):

https://github.com/KU-CVLAB/DaRF/blob/47b2d1a23d13f0d149e55cf8fd2195ec42093d1e/plenoxels/models/dpt_depth.py#L87C18-L87C18

thucz avatar Jan 08 '24 06:01 thucz

> Now I also apply a bias and a scale to the output, like this (though not normalizing to [0, 1]):
>
> https://github.com/KU-CVLAB/DaRF/blob/47b2d1a23d13f0d149e55cf8fd2195ec42093d1e/plenoxels/models/dpt_depth.py#L87C18-L87C18

Thanks. But how do we get alpha and beta for another dataset?

isJHan avatar Jan 08 '24 13:01 isJHan

@isJHan See the supplementary material of RichDreamer, pages 13-14 (Sec. A.2). It describes the general normalization methods.
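For reference, when some ground-truth depth is available, a common way to recover a per-image scale and shift (alpha and beta) is the least-squares alignment used in the MiDaS paper itself: solve `min over (s, t) of || s * pred + t - gt ||^2` over valid pixels. A minimal sketch (the arrays and mask here are hypothetical; `pred` and `gt` should be in the same domain, e.g. both inverse depth):

```python
import numpy as np

def align_scale_shift(pred, gt, mask):
    """Least-squares fit of scale s and shift t so that s * pred + t matches gt
    on the masked (valid) pixels; returns the aligned prediction."""
    x = pred[mask]
    y = gt[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)  # columns: [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * pred + t

# Hypothetical example where gt is exactly an affine transform of pred
pred = np.array([1.0, 2.0, 3.0, 4.0])
gt = 2.0 * pred + 3.0
mask = np.array([True, True, True, False])  # last pixel treated as invalid
aligned = align_scale_shift(pred, gt, mask)
# aligned recovers gt, including at the masked-out pixel
```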

thucz avatar Jan 08 '24 13:01 thucz