
About finetuning on UniDepthV2

wuqun-tju opened this issue 8 months ago · 2 comments

Hello,

Thank you for your great work!

I want to finetune on my own data. I have some questions:

  1. I found there is no get_pretrain() function in the decoder module.
  2. I found that during the warmup step the lr is constant, because base_value is equal to init_value. Is that right in your own training?

[screenshot]

  3. Then I found that some losses go up while others stay constant. Do you have any advice?

[screenshot]

I found the regression loss is constant. Is that because it uses rays_gt rather than rays_pred? Is that right?

[screenshot]

  4. Why is logdepth plus 2.0 in the decoder?

[screenshot]

  5. I haven't found the relative translation t ~ U[-0.1, 0.1] in the code, as described here:

[screenshot]

I am looking forward to your reply. Thank you very much!

wuqun-tju · Apr 08 '25

Hey, thanks a lot for your questions!

  1. If you're looking for something like get_pretrain, there isn't one; the decoder's parameters are obtained via the get_params function here.
  2. We actually never used a warmup. The relevant code is a leftover from earlier experiments; it's more of "legacy dirtiness" from when I tried warmup, but I didn't observe any consistent benefit (a minimal repro of why the lr stays constant is below this list).
  3. I'm a bit confused by the losses going to zero while the SiLog increases significantly. Regarding the camera regression, what you posted is correct. During training we used the snippet under the comment LEGACY CODE FOR TRAINING, where we randomly sample between GT and predicted rays using a sort of curriculum learning with prob = 0.8 * (1 - tanh(...)) + 0.2 (see the sketch below).
  4. The +2 offset is just to improve the initialization at the beginning of training. Initially, the decoder outputs log-depth values close to zero, so adding +2 shifts the depth to around exp(2), which is a typical average depth value across both indoor and outdoor scenes. This helps avoid wasting capacity or causing large gradients on the final layer's bias just to reach a reasonable average depth (a quick numeric check is below).
  5. You can find the random translation here. You're right that this class should be updated to use the new camera abstraction instead of hardcoded intrinsics; I'll fix that! (A sketch of the sampling is below.)
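
On point 2, a minimal repro of the observation, assuming the scheduler linearly interpolates from init_value to base_value (the values below are hypothetical, not the released config):

```python
import numpy as np

# Linear warmup from init_value to base_value. When the two are equal,
# as noted in the question, the interpolation degenerates to a constant lr.
init_value, base_value, warmup_steps = 1e-4, 1e-4, 5  # hypothetical values
warmup_lr = np.linspace(init_value, base_value, warmup_steps)
print(warmup_lr)  # all entries equal 1e-4 -> constant lr during warmup
```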
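
On point 3, a minimal sketch of the GT-vs-predicted ray sampling, assuming the tanh argument is the normalized training progress; the actual argument in the LEGACY CODE FOR TRAINING snippet may be parameterized differently:

```python
import math
import random

import torch


def sample_rays(rays_gt: torch.Tensor, rays_pred: torch.Tensor,
                step: int, total_steps: int) -> torch.Tensor:
    # prob starts at 1.0 (always GT rays) and decays towards 0.2 as the
    # tanh argument grows, so the model gradually conditions on its own
    # ray predictions: a simple curriculum.
    prob = 0.8 * (1.0 - math.tanh(step / total_steps)) + 0.2
    return rays_gt if random.random() < prob else rays_pred
```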
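
On point 4, the effect of the offset in numbers: with a log-depth head outputting values near zero at initialization, exp(0) = 1 m, while exp(0 + 2) ≈ 7.39 m, a much more plausible average scene depth:

```python
import torch

logdepth = torch.zeros(1, 1, 4, 4)  # stand-in for the decoder output at init

depth_no_offset = torch.exp(logdepth)          # ~1.00 m everywhere
depth_with_offset = torch.exp(logdepth + 2.0)  # ~7.39 m everywhere

print(depth_no_offset.mean().item(), depth_with_offset.mean().item())
```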
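
On point 5, a sketch of applying the relative translation t ~ U[-0.1, 0.1] described in the paper; random_translation is a hypothetical helper for illustration only, not the augmentation class linked above, which operates on UniDepth's camera abstraction:

```python
import torch


def random_translation(extrinsics: torch.Tensor, scale: float = 0.1) -> torch.Tensor:
    # Perturb the camera pose with a translation sampled uniformly from
    # [-scale, scale]^3, added to the last column of the 4x4 extrinsics.
    t = torch.empty(3).uniform_(-scale, scale)
    out = extrinsics.clone()
    out[:3, 3] += t
    return out
```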

Let me know if you have any more questions!

lpiccinelli-eth · Apr 21 '25

Could you please share how to use a custom dataset for fine-tuning and the methods for generating the data? Thank you very much.

dxw2000 · Oct 24 '25