Is there a mismatch between the provided training configuration / train.py script and the 'Implementation Details' section of the UniDepthV2 paper?
Hi,
There seems to be a mismatch between the scheduler configuration in the config files and the 'Implementation Details' section of the UniDepthV2 paper.
As per the paper's 'Implementation Details' section, training uses a learning rate of 5e-5 with a warmup covering 30% of the total iterations, and the weight decay is scheduled over training.
However, in the provided training configuration (for example, config_v2_vits14.json), the parameters are set as follows (listing only the ones relevant to this issue):
"training": {
"n_iters": 300000,
"lr": 1e-4,
"lr_final": 1e-6,
"lr_warmup": 1.0,
"wd": 0.1,
"wd_final": 0.1,
"warmup_iters": 75000
}
Should the parameters be changed as follows to match the implementation details mentioned in the paper?
"training": {
"n_iters": 300000,
"lr": 5e-5, # considering the learning rate in the paper
"lr_final": 5e-6,
"lr_warmup": 1.0,
"wd": 0.1,
"wd_final": 0.01, # assuming weight decay scheduling to one-tenth
"warmup_iters": 90000 # 30% of 300k total iterations
}
Additionally, scheduler_wd in train.py would also need to change to reflect this:
scheduler_wd = CosineScheduler(
    optimizer,
    key="weight_decay",
    init_value=config["training"]["wd"],
    base_value=config["training"]["wd"],
    final_value=config["training"]["wd_final"],
    warmup_iters=config["training"]["warmup_iters"],  # instead of 0
    total_iters=config["training"]["n_iters"],
    step_init=step - 1,
)
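To make the effect of warmup_iters concrete, below is a minimal sketch of a cosine schedule with linear warmup. This is only my assumption of how CosineScheduler interpolates between init_value, base_value, and final_value; the repo's actual implementation may differ.

import math

def cosine_with_warmup(step, init_value, base_value, final_value,
                       warmup_iters, total_iters):
    # Linear warmup from init_value to base_value, then cosine decay
    # from base_value down to final_value.
    if step < warmup_iters:
        return init_value + (base_value - init_value) * step / max(warmup_iters, 1)
    progress = (step - warmup_iters) / max(total_iters - warmup_iters, 1)
    return final_value + 0.5 * (base_value - final_value) * (1.0 + math.cos(math.pi * progress))

# With init_value == base_value the weight decay stays flat through warmup,
# so moving warmup_iters from 0 to 90_000 delays the cosine decay by 30% of training.
for step in (0, 90_000, 195_000, 300_000):
    print(step, round(cosine_with_warmup(step, 0.1, 0.1, 0.01, 90_000, 300_000), 5))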
Just curious: were the released models trained using the configuration mentioned in the paper, or would you advise following the configuration provided in the repo as a starting point and then tuning further for individual use cases?
Thank you!