Problem solved.
Hello, I am an undergraduate who recently became interested in depth estimation and has started studying it.
When I ran my image through the metric depth estimation model, the results came out as shown in the pictures.
As the author of this code suggested, I tried to initialize the encoder through evaluation, but I don't know the programming language well and have only one GPU, so I ran into problems. For now I'm using the pre-trained model without evaluating it separately.
Does the color bar on the right represent the actual depth? I took the picture myself from about 3 m away, but the result shows almost 30 m, so I'm curious.
Also, I obtained the ROI coordinates with YOLO. How can I extract the actual depth from this map using them?
Thank you for reading. Have a nice day.
Oh, and the object I want to detect is the car. Thank you for your help.
I solved it. Thank you for your help.
Hi @BeginerYJH, may I know how you actually solved the distance problem you mentioned? Thanks.
Dear @Kaiz0506, as a beginner in coding, I can't guarantee that my code is the right answer, so please keep that in mind.
I use the metric depth code written by the author. I added the lines below to convert metres to millimetres and to apply a scale factor that I obtained by comparing ground-truth data with the model's output. In my case, a scale factor of 0.3395 makes the result accurate.

scale_factor = 0.3395
depth_map_mm_real = (depth_map * scale_factor * 1000).astype(np.uint16)
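For anyone wondering how such a scale factor could be obtained, here is a minimal sketch. It is not necessarily how the 0.3395 above was computed: the function name `estimate_scale_factor`, the use of the median ratio, and the rule that non-positive pixels are invalid are all assumptions made for illustration.

```python
import numpy as np

def estimate_scale_factor(pred_depth, gt_depth):
    """Hypothetical helper: median of gt / pred over pixels valid in both maps."""
    pred = np.asarray(pred_depth, dtype=np.float64)
    gt = np.asarray(gt_depth, dtype=np.float64)
    # Assumption: zero, negative, or non-finite values mean "no measurement".
    valid = (gt > 0) & (pred > 0) & np.isfinite(gt) & np.isfinite(pred)
    if not valid.any():
        raise ValueError("no valid pixels to compare")
    # The median is less sensitive to a few bad pixels than the mean.
    return float(np.median(gt[valid] / pred[valid]))

# Synthetic data: ground truth is the prediction shrunk by a known factor.
pred = np.array([[10.0, 20.0], [30.0, 0.0]])  # last pixel is invalid
gt = pred * 0.3395
scale = estimate_scale_factor(pred, gt)
print(scale)
```

A single global factor only helps if the model's error is roughly proportional to distance. If the ratio varies a lot between scenes or cameras, one number will not fix every image.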
@Kaiz0506 Also, I read the depth inside each ROI using this code.
import numpy as np
from PIL import Image
import xml.etree.ElementTree as ET

depth_map_path = "your depth map path"
roi_xml_path = "your xml path"

# Read every bounding box (xmin, ymin, xmax, ymax) from the annotation XML.
tree = ET.parse(roi_xml_path)
root = tree.getroot()
roi_coords = []
for obj in root.iter('object'):
    bndbox = obj.find('bndbox')
    xmin = int(bndbox.find('xmin').text)
    ymin = int(bndbox.find('ymin').text)
    xmax = int(bndbox.find('xmax').text)
    ymax = int(bndbox.find('ymax').text)
    roi_coords.append((xmin, ymin, xmax, ymax))

# Load the saved depth map as an array.
depth_map_image = Image.open(depth_map_path)
depth_map_array = np.array(depth_map_image)

# Print depth statistics for each ROI (rows are y, columns are x).
for idx, (xmin, ymin, xmax, ymax) in enumerate(roi_coords):
    roi_depth = depth_map_array[ymin:ymax, xmin:xmax]
    mean_depth = np.mean(roi_depth)
    min_depth = np.min(roi_depth)
    max_depth = np.max(roi_depth)
    print(f"ROI {idx} depth statistics:")
    print(f"Mean depth value: {mean_depth}")
    print(f"Min depth value: {min_depth}")
    print(f"Max depth value: {max_depth}")
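A detection box around a car usually also contains some road and background, so the mean can be pulled away from the car's real distance. Below is a hedged variant that takes the median instead and clips the box to the image. The helper name `roi_depth_median` and the rule that a value of 0 means "no depth" are assumptions, not part of the original code.

```python
import numpy as np

def roi_depth_median(depth_map, box):
    """Hypothetical helper: median depth inside box, ignoring zero pixels."""
    h, w = depth_map.shape[:2]
    xmin, ymin, xmax, ymax = box
    # Clip the box so coordinates outside the image do not cause surprises.
    xmin, xmax = max(0, xmin), min(w, xmax)
    ymin, ymax = max(0, ymin), min(h, ymax)
    roi = depth_map[ymin:ymax, xmin:xmax]
    valid = roi[roi > 0]  # assumption: 0 marks a missing measurement
    if valid.size == 0:
        return None
    return float(np.median(valid))

# Synthetic depth map in mm: background at 9 m, a "car" region at 3 m.
depth = np.full((4, 6), 9000, dtype=np.uint16)
depth[1:3, 2:5] = 3000
depth[1, 2] = 0  # one missing pixel inside the box
car_depth = roi_depth_median(depth, (2, 1, 5, 3))
print(car_depth)
```

If the depth map was saved in millimetres as above, divide the result by 1000 to get metres.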
Hi @BeginerYJH,
Thank you very much for your help. I do have a couple of things that need clarification, though.
May I ask why it needs to be in mm, and where I can find the code you mentioned as being the author's?
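One likely reason for the millimetre conversion (an assumption, not confirmed by @BeginerYJH) is the cast to `np.uint16` in the snippet above. Unsigned 16-bit integers cannot hold fractions, so casting metres directly throws away everything after the decimal point, while millimetres keep that detail up to 65535 mm (about 65.5 m). A small sketch with made-up depth values:

```python
import numpy as np

# Made-up depths in metres, chosen to be exactly representable as floats.
depth_m = np.array([0.5, 2.75, 3.0, 29.875])

# Casting metres straight to uint16 truncates the fractional part.
as_metres = depth_m.astype(np.uint16)

# Scaling to millimetres first keeps sub-metre detail in the integers.
as_mm = np.round(depth_m * 1000).astype(np.uint16)

print(as_metres.tolist())  # [0, 2, 3, 29]
print(as_mm.tolist())      # [500, 2750, 3000, 29875]
print(np.iinfo(np.uint16).max)  # 65535, i.e. about 65.5 m when stored as mm
```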
Hello, I would like to ask how this metric depth estimation is implemented. I always fail when following the tutorial.