
Exact distance of image

Open Shubhamkumarroy opened this issue 6 months ago • 9 comments

I have the MiDaS depth output, but I need to convert it into a distance in meters or centimeters. Can anyone help me?

Shubhamkumarroy avatar Feb 15 '24 11:02 Shubhamkumarroy

The best approach may be to try using the ZoeDepth models which are built to give metric distances as an output.

Otherwise, if you know the depth range of the image, you can convert the MiDaS output (normalized to the range 0 to 1) into true depth using the formula: True Depth = 1 / (A * normalized_midas_depth + B)

Where the variables A and B are given by:

A = (1 / min_depth) - (1/ max_depth)
B = 1 / max_depth

Here, min_depth and max_depth refer to the minimum and maximum depth values in the image. That is, you'd need to know something like 'the closest point is 2 meters away, the farthest is 17 meters', then invert those numbers to calculate A and B. This approach is sensitive to errors in the min/max depth values as well as in the MiDaS output, so again, it is probably better to use the ZoeDepth models.
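As a rough illustration of the formula above, here is a minimal sketch. It assumes the prediction is relative inverse depth (larger values are closer) and that the smallest and largest values in the prediction correspond exactly to max_depth and min_depth. The function name `midas_to_metric` and the example numbers are made up for illustration and are not part of MiDaS.

```python
import numpy as np

def midas_to_metric(midas_prediction, min_depth, max_depth):
    """Map a relative inverse-depth map to metric depth from a known depth range."""
    pred = np.asarray(midas_prediction, dtype=np.float64)
    lo, hi = pred.min(), pred.max()
    if hi == lo:
        raise ValueError("prediction is constant, so it cannot be normalized")

    # Normalize to 0..1 (assumption: 1 = closest point, 0 = farthest point)
    normalized = (pred - lo) / (hi - lo)

    # Same A and B as in the formula above
    A = (1.0 / min_depth) - (1.0 / max_depth)
    B = 1.0 / max_depth
    return 1.0 / (A * normalized + B)

# Made-up 2x2 "prediction" with a known range of 2 m to 17 m
example = np.array([[0.0, 5.0], [10.0, 20.0]])
metric = midas_to_metric(example, min_depth=2.0, max_depth=17.0)
print(metric)
```

The largest raw value maps to min_depth and the smallest to max_depth; everything in between is interpolated linearly in inverse depth, not in depth.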

heyoeyo avatar Feb 15 '24 15:02 heyoeyo

Thank you for the formulas, they work. Combined with several points at known distances, they give proper results.

import numpy as np

def depth_to_real(midas_prediction, known_points):

    '''
        Convert relative MiDaS depths to real depths using known points
        Args:
        midas_prediction: output from MiDaS (2D array)
        known_points: points on the image with known distances (x, y, distance)
    '''

    # scale midas prediction so its maximum is 1
    # (any remaining offset is absorbed by the least-squares fit below)
    midas_depth_array = midas_prediction/np.max(midas_prediction)


    if len(known_points)>=2:
        # get pairs of normalized relative and real depths
        points = np.array([(midas_depth_array[int(y), int(x)], distance) for x,y,distance in known_points])

        # solve the system of equations (least squares if more than 2 points):
        # relative_depth*(1/min_depth) + (1-relative_depth)*(1/max_depth) = 1/real_depth
        x = points[:,0]  # normalized relative depth
        y = 1/points[:,1]  # inverse of real depth
        A = np.vstack([x, 1-x]).T

        s, t = np.linalg.lstsq(A, y, rcond=None)[0]

        min_depth = 1/s
        max_depth = 1/t

    else:
        print('Not enough known points to make real depth estimation')
        return None
    
    # align relative depth to real depth
    A = (1 / min_depth) - (1/ max_depth)
    B = 1 / max_depth
    midas_depth_aligned = 1 / (A * midas_depth_array + B)

    return midas_depth_aligned
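
One way to sanity-check a fit like this is with synthetic data where the answer is known. The sketch below is an assumption-laden toy, not real MiDaS output: it builds a fake relative map as an arbitrary affine transform of inverse depth (the constants 350 and 12 are made up), then fits 1/distance = A * normalized + B directly. That is an equivalent parameterization of the min_depth/max_depth form used above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ground truth: metric depths between 2 m and 17 m
true_depth = rng.uniform(2.0, 17.0, size=(48, 64))

# Fake "MiDaS-like" output: arbitrary scale and shift applied to inverse depth
relative = 350.0 * (1.0 / true_depth) + 12.0
normalized = relative / relative.max()

# Pretend the true distance is known at a few pixels: (x, y, distance)
pixels = [(3, 5), (40, 10), (20, 30), (60, 45)]
known_points = [(x, y, true_depth[y, x]) for x, y in pixels]

# Least-squares fit of 1/distance = A * normalized + B
m = np.array([normalized[y, x] for x, y, _ in known_points])
inv_d = np.array([1.0 / d for _, _, d in known_points])
design = np.column_stack([m, np.ones_like(m)])
(A, B), *_ = np.linalg.lstsq(design, inv_d, rcond=None)

# Apply the fit to every pixel and compare with the ground truth
recovered = 1.0 / (A * normalized + B)
max_error = np.abs(recovered - true_depth).max()
print(max_error)
```

Because the synthetic map is an exact affine function of inverse depth, the error here is only floating-point noise. With real predictions the error depends on how well the model's output follows that assumption and on how accurate the known distances are.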

ximader avatar Feb 16 '24 16:02 ximader

I am confused. Is there any way to extract the exact distance (in meters) of any pixel in the image? Assume I don't know any points other than the predicted values. Can I still get the exact distance out of the image?

Rafid00 avatar Feb 26 '24 07:02 Rafid00

Is there any way to extract the exact distance (in meters) of any pixel in the image?

Metric depth models (like ZoeDepth) attempt to do this. With relative depth models (like MiDaS) you need additional information to convert the relative mapping to an absolute one.

heyoeyo avatar Feb 26 '24 14:02 heyoeyo

Could someone share complete end-to-end code for estimating depth from a webcam and converting it to a distance in meters and centimeters?

jvishwa06 avatar Mar 05 '24 19:03 jvishwa06

Is there any way to extract the exact distance (in meters) of any pixel in the image?

Metric depth models (like ZoeDepth) attempt to do this. With relative depth models (like MiDaS) you need additional information to convert the relative mapping to an absolute one.

If you know the real depth (meters) for 1 pixel, would it be enough to convert the rest of the depths to real distance too?

RoyAmoyal avatar Apr 04 '24 18:04 RoyAmoyal

If you know the real depth (meters) for 1 pixel, would it be enough to convert the rest of the depths to real distance too?

Not quite, it's sort of a '2 knowns to figure out 2 unknowns' situation. You'd need to know the true depth for at least 2 pixels to be able to solve for A and B in the equation. In general though, you'd want to use many more than 2 points, since any error on those 2 pixels will lead to errors in estimating A and B. You might want to check out issue #171, where this was discussed in more detail (or check out the code from @ximader above).

That being said, if you want to try to fit using only two pixels, you can set up a system of 2 equations using the known pixels (and the equation from before) and solve it to figure out A and B. If your 2 known true depths are d1 and d2, and they correspond to pixels with relative MiDaS depths of m1 and m2 (respectively), then as far as I can tell, the parameters are given by:

Let:
  inv_d1 = 1 / d1
  inv_d2 = 1 / d2

then:

A = (inv_d2 - inv_d1) / (m2 - m1)
B = inv_d1 - m1 * A

And for clarity, I'm just getting this by re-arranging the equations:

d1 = 1 / (A * m1 + B)
d2 = 1 / (A * m2 + B)
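
For reference, the two-point solution can be written as a few lines of Python. This is only a sketch of the algebra above; the function name `fit_two_points` and the example values (m1 = 0.9 at 2 m, m2 = 0.1 at 10 m) are made up for illustration.

```python
def fit_two_points(m1, d1, m2, d2):
    """Solve d = 1 / (A * m + B) for A and B from two (relative, true depth) pairs."""
    if m1 == m2:
        raise ValueError("the two pixels need different relative depths")
    inv_d1 = 1.0 / d1
    inv_d2 = 1.0 / d2
    A = (inv_d2 - inv_d1) / (m2 - m1)
    B = inv_d1 - m1 * A
    return A, B

# Hypothetical known pixels: relative depth 0.9 is 2 m away, 0.1 is 10 m away
A, B = fit_two_points(m1=0.9, d1=2.0, m2=0.1, d2=10.0)

# Plugging the known pixels back in should reproduce their true depths
depth_at_m1 = 1.0 / (A * 0.9 + B)
depth_at_m2 = 1.0 / (A * 0.1 + B)
print(A, B, depth_at_m1, depth_at_m2)
```

With these numbers A comes out to 0.5 and B to 0.05, and the two known pixels map back to 2 m and 10 m. Any measurement error in d1 or d2 goes straight into A and B, which is why fitting with more points is preferable.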

heyoeyo avatar Apr 04 '24 22:04 heyoeyo