FocusOnDepth changes to the patch embedding layer

Hi, I am trying to use the DPT model, but I would like to make some changes to the patch embedding layer. You created the model using timm.create_model(), but there is also some commented-out code. Did you use that code instead of timm.create_model(), and did it work the same as the original model? Do you have any tips on how I can modify the model code?

Apr 16 '22 09:04 lidiatekeste2312

Hi. Yes, initially we tried to build our own encoder, and it worked pretty well. The reason we use timm is that the ViT takes a very long time to train, so we decided to use a pretrained model (the architecture is not 100% the same). You can change the patch embedding layer by uncommenting our code; if you do that, you will have to train from scratch, and you should use the same learning rate for the whole model (3e-4?). Another possibility is to keep the current model (timm + pretrained weights) and manually replace the layers/modules.
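
A minimal sketch of the second option (replacing a module in place), in case it helps. Everything named here is hypothetical: `TinyViT` is only a stand-in encoder so the snippet runs with PyTorch alone, and `PatchEmbed`, `ConvStemPatchEmbed` and `swap_patch_embed` are illustrative names, not code from this repository. timm ViT models expose their patch embedding as `model.patch_embed`, so the same assignment should apply there, but this has not been checked against the FocusOnDepth code. The 1e-5 learning rate for the pretrained part is an assumed value; 3e-4 is the one mentioned above.

```python
import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Standard ViT patch embedding: one strided convolution, output (B, N, D)."""

    def __init__(self, img_size, patch_size, in_chans, embed_dim):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        return self.proj(x).flatten(2).transpose(1, 2)


class ConvStemPatchEmbed(PatchEmbed):
    """Hypothetical replacement: two stacked convolutions, same output shape."""

    def __init__(self, img_size, patch_size, in_chans, embed_dim):
        super().__init__(img_size, patch_size, in_chans, embed_dim)
        assert patch_size % 2 == 0, "this stem needs an even patch size"
        half = patch_size // 2
        self.proj = nn.Sequential(
            nn.Conv2d(in_chans, embed_dim // 2, kernel_size=2, stride=2),
            nn.GELU(),
            nn.Conv2d(embed_dim // 2, embed_dim, kernel_size=half, stride=half),
        )


class TinyViT(nn.Module):
    """Stand-in encoder (NOT the timm model), only to make the sketch runnable."""

    def __init__(self, img_size=64, patch_size=16, in_chans=3, embed_dim=32, depth=2, heads=4):
        super().__init__()
        self.patch_embed = PatchEmbed(img_size, patch_size, in_chans, embed_dim)
        self.pos_embed = nn.Parameter(torch.zeros(1, self.patch_embed.num_patches, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, heads, dim_feedforward=4 * embed_dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        return self.blocks(self.patch_embed(x) + self.pos_embed)


def swap_patch_embed(model, new_embed):
    """Replace model.patch_embed; the token count must still match pos_embed."""
    if new_embed.num_patches != model.patch_embed.num_patches:
        raise ValueError("new patch embedding changes the number of tokens")
    model.patch_embed = new_embed
    return model


model = swap_patch_embed(TinyViT(), ConvStemPatchEmbed(64, 16, 3, 32))

# The new layer is randomly initialised, so it gets its own parameter group.
new_params = list(model.patch_embed.parameters())
new_ids = {id(p) for p in new_params}
old_params = [p for p in model.parameters() if id(p) not in new_ids]
optimizer = torch.optim.Adam([
    {"params": old_params, "lr": 1e-5},  # assumed value for pretrained weights
    {"params": new_params, "lr": 3e-4},  # value suggested in this thread
])

out = model(torch.randn(2, 3, 64, 64))
print(out.shape)  # torch.Size([2, 16, 32])
```

The main constraint is that the replacement must produce the same number of tokens and the same embedding dimension as the original layer; otherwise the positional embeddings and every later block no longer fit.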

Apr 17 '22 12:04 antocad

Thank you so much for the info. It helps a lot. I also wanted to ask about the metrics. Is there a way to calculate the metrics as in the paper, RMSE and the others? I need the metrics to draw some insights.

Apr 21 '22 16:04 lidiatekeste2312
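
On the metrics question, here is a rough sketch of the error measures commonly reported for monocular depth estimation (absolute relative error, squared relative error, RMSE, log RMSE and the threshold accuracies). Whether these match the paper's exact evaluation protocol is an assumption. The function name `depth_metrics` and the `min_depth` / `max_depth` cut-offs are hypothetical choices, and the sketch assumes the prediction and the ground truth are already in the same units.

```python
import numpy as np


def depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Common depth-estimation metrics over valid ground-truth pixels.

    min_depth / max_depth are assumed cut-offs; adjust them to the dataset.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Keep only pixels that have a usable ground-truth depth.
    valid = (gt > min_depth) & (gt < max_depth)
    pred = np.clip(pred[valid], min_depth, max_depth)
    gt = gt[valid]

    # max(pred/gt, gt/pred) is the ratio used by the threshold accuracies.
    ratio = np.maximum(pred / gt, gt / pred)
    diff = pred - gt

    return {
        "abs_rel": float(np.mean(np.abs(diff) / gt)),
        "sq_rel": float(np.mean(diff ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "rmse_log": float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))),
        "delta1": float(np.mean(ratio < 1.25)),
        "delta2": float(np.mean(ratio < 1.25 ** 2)),
        "delta3": float(np.mean(ratio < 1.25 ** 3)),
    }


# Toy example: the pixel with ground truth 0 is treated as missing.
gt = np.array([[1.0, 2.0], [4.0, 0.0]])
pred = np.array([[1.0, 2.0], [2.0, 5.0]])
metrics = depth_metrics(pred, gt)
print(metrics)
```

If the model outputs relative or normalised depth rather than metric depth, the predictions would need to be rescaled to the ground-truth range before these numbers mean anything.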