Metric3D
Fine tune on custom dataset
Hi,
How can I fine-tune Metric3D on a custom dataset? I want to check the improvement in generalization of this model. For the custom dataset, the following information is available: the GT depth maps, intrinsics, and corresponding camera images. Also, what is the expected GPU memory required for fine-tuning the model?
Thank you,
Hi, the provided GT depths, intrinsics, and RGB images are sufficient for fine-tuning. You can use 8x 4090 for fine-tuning; more GPUs means faster training.
So I can follow the same steps as shown in the training README to fine-tune the model, i.e. set up the dataset as a dict and run the training script with the parameter "--load-from" set to the path of the pretrained model. Are there additional steps?
Considering I have a dataset similar to KITTI, approximately how much training data is required for fine-tuning?
And by "8*4090", do you mean 8 x RTX 4090 GPUs?
What is "depth scale" in the json labels ?
8x 4090 is enough for fine-tuning. 'depth scale' in the JSON is used to recover the real metric depth. For example, the KITTI dataset saves its depth maps as 16-bit PNG files and scales the real depth by 256 to preserve numerical precision. Thus, in our JSON, we include this field so that the loaded depth file can be rescaled to recover metric depth.
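To make the depth-scale convention concrete, here is a minimal round-trip sketch (the values are illustrative, not from any real dataset) showing how a metric depth map is encoded into a 16-bit PNG-style array with a depth scale of 256, and how dividing by the same scale recovers the metric values to within ~1/256 m:

```python
import numpy as np

# Hypothetical metric depths in meters (illustrative values only)
depth_m = np.array([[12.345, 80.0],
                    [0.004, 25.5]], dtype=np.float32)

# Encode: scale by 256 and truncate to uint16 (what the PNG stores).
# uint16 caps the representable depth at 65535 / 256 ≈ 256 m.
depth_uint16 = (depth_m * 256.0).astype(np.uint16)

# Decode: divide by the depth scale from the JSON to recover meters
recovered = depth_uint16.astype(np.float32) / 256.0

# Quantization error is bounded by 1/256 m (about 4 mm)
assert np.max(np.abs(recovered - depth_m)) < 1.0 / 256.0
```

This is why the same depth-scale value must appear both when the PNGs are written and in the JSON labels used at training time.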
Hey @testingshanu, may I ask what the configuration file for your custom dataset looks like? I'm stuck in a similar situation.
Not yet. If you are able to run it successfully, please let me know the config and depth scale that you used.
Hi, I got it to work. I used the same depth scale as KITTI (seen in the config and in the issue below). Before training, I saved my data the same way KITTI does:
import numpy as np
from PIL import Image

def save_as_uint16(depth, filename):
    """
    depth is a 2D numpy array containing metric depth as float in meters
    """
    # Zero out invalid values (0 is treated as "no depth")
    depth[np.isnan(depth) | np.isinf(depth)] = 0
    # Scale by 256 and convert to 16-bit unsigned integers (KITTI convention)
    depth_uint16 = (depth * 256.).astype(np.uint16)
    # Save depth as a 16-bit PNG
    Image.fromarray(depth_uint16).save(filename)
Here is more info, and the config as well: #105
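For completeness, a sketch of the inverse operation (loading a 16-bit depth PNG saved as above and recovering metric depth); `load_uint16_depth` is a hypothetical helper name, not part of the Metric3D codebase, and `depth_scale` must match the value used at save time (256 for KITTI-style files):

```python
import numpy as np
from PIL import Image

def load_uint16_depth(filename, depth_scale=256.0):
    """Load a 16-bit depth PNG and recover metric depth in meters.

    Inverse of save_as_uint16: divide the stored integers by
    depth_scale. Pixels equal to 0 are invalid, since NaN/inf
    values were zeroed out before saving.
    """
    depth_uint16 = np.array(Image.open(filename), dtype=np.uint16)
    depth = depth_uint16.astype(np.float32) / depth_scale
    valid_mask = depth_uint16 > 0
    return depth, valid_mask
```

Masking out the zero pixels matters at training time, since they carry no supervision signal.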
Did you also manage to fine-tune on both KITTI and your own dataset? And did it perform well on your data?