
Regarding training a new model from scratch without using a pre-trained model

Open wanshishuns opened this issue 9 months ago • 2 comments

Hello, first of all, it is a great honor to read the article published by your team, and I benefited a lot from it. While learning from your code, I would like to train without applying the pre-trained model. Besides modifying the parameters in the config, for the frozen ViT model you propose, do I need to unfreeze the model before training? I ask because I ran into a problem during training: a hidden_size error appears in the training terminal, but there is no hidden_size parameter in your config. Could you please help me? I would be very grateful.

wanshishuns avatar Mar 21 '25 09:03 wanshishuns

Hi @wanshishuns. We're delighted that you found our work helpful!

The VIT_MODEL we define in EcoDepth/model.py, google/vit-base-patch16-224, is built for the ImageNet dataset. ImageNet has 1000 classes, which means the ViTForImageClassification model produces a 1000-dimensional logits vector, one logit for each class.
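As a quick sanity check of that 1000-dimensional output, a minimal standalone sketch like the following (not taken from the repo, just the same Hugging Face checkpoint loaded directly) should print a `[1, 1000]` logits shape:

```python
# Sketch: confirm the pretrained ViT produces one logit per ImageNet class.
import torch
from transformers import ViTForImageClassification

vit = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
vit.eval()

# A dummy batch of one 224x224 RGB image.
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = vit(pixel_values=dummy).logits

print(logits.shape)  # torch.Size([1, 1000])
```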

Since we set the number of labels to 1000, as can be seen on line 128 of the model.py file, we do not expect any issue of mismatched sizes (we have verified this at our end by successfully running inference and training). Could you please give some more details on how you're trying to run the code (i.e., whether you have made any changes)?

Regarding the freezing/unfreezing issue, we don't unfreeze the ViT model for training, since we would like to preserve the semantic knowledge already captured by the model. Rather, we use the rest of the CIDE module (proposed by us) to produce scene embeddings for conditioning the SD pipeline.
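For reference, the freezing pattern looks roughly like the sketch below; the class and attribute names here are illustrative, not the exact ones used in EcoDepth/model.py:

```python
# Sketch of "frozen ViT, trainable rest of the module" (illustrative names).
import torch.nn as nn
from transformers import ViTForImageClassification

class SceneEmbedder(nn.Module):  # hypothetical wrapper, not the repo's class
    def __init__(self):
        super().__init__()
        self.vit = ViTForImageClassification.from_pretrained(
            "google/vit-base-patch16-224"
        )
        # Preserve the ViT's semantic knowledge: freeze all of its weights.
        for p in self.vit.parameters():
            p.requires_grad = False
        # The remaining layers stay trainable; here a single projection maps
        # the 1000-class logits to an embedding used for conditioning.
        self.proj = nn.Linear(1000, 768)

    def forward(self, pixel_values):
        logits = self.vit(pixel_values=pixel_values).logits
        return self.proj(logits)
```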

We would be happy to help in case there are any other doubts/issues.

Aradhye2002 avatar Mar 21 '25 12:03 Aradhye2002

I am very honored to receive your reply so soon.

I now understand the freezing, so there is no need to unfreeze during training. I had unfrozen the ViT during training; I will try training again without unfreezing it. At the same time, do I need to comment out the `assert not args.train_from_scratch` on line 190 of model.py? I am not sure whether the above steps are correct. (In addition, I noticed that the number of training epochs in the config is 25; is 25 epochs really enough?)

If possible, I hope you can give me some advice. I apologize for taking up your time, and thank you very much for your help.

wanshishuns avatar Mar 22 '25 02:03 wanshishuns