InstantMesh Number of epochs before seeing a decrease in the loss

Hi, I am currently training the instant-mesh model, but after 3 epochs the loss is not decreasing and I see no sign of convergence. I am using the same config you provided (DDP strategy), with gradient accumulation added. I am not sure whether I am doing something wrong or whether I should just be more patient and wait for more epochs. I am also not sure whether I should train instant-nerf first and then instant-mesh.

Here are some plots, in case they help:

Thanks for your help!

Jun 21 '24 12:06 Kev1MSL

You need to train instant-nerf first. Mesh-based rendering only provides gradients in the near-surface region, which makes it hard for the network to converge. Our mesh model is finetuned from the nerf model.
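As a rough illustration of what "finetuned from the nerf model" can look like in practice, here is a minimal sketch that decides which parameters of a NeRF-stage checkpoint can be copied into a mesh-stage model by matching names and shapes. The function `plan_warm_start` and every parameter name and shape below are hypothetical, chosen only for illustration; they are not taken from the InstantMesh code.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def plan_warm_start(
    nerf_shapes: Dict[str, Shape], mesh_shapes: Dict[str, Shape]
) -> Tuple[List[str], List[str]]:
    """Split mesh-stage parameter names into two groups.

    copied: names that also exist in the NeRF-stage checkpoint with the same shape.
    fresh:  names that have no compatible counterpart and keep their fresh init.
    """
    copied = sorted(
        name for name, shape in mesh_shapes.items() if nerf_shapes.get(name) == shape
    )
    fresh = sorted(set(mesh_shapes) - set(copied))
    return copied, fresh


# Hypothetical parameter names and shapes, for illustration only.
nerf_shapes = {
    "encoder.weight": (768, 768),
    "transformer.weight": (1024, 1024),
    "decoder.density.weight": (1, 64),
}
mesh_shapes = {
    "encoder.weight": (768, 768),
    "transformer.weight": (1024, 1024),
    "decoder.sdf.weight": (1, 64),
    "decoder.deformation.weight": (3, 64),
}

copied, fresh = plan_warm_start(nerf_shapes, mesh_shapes)
print("copied from nerf checkpoint:", copied)
print("left at fresh initialization:", fresh)
```

With PyTorch, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`, and the selected entries loaded with `load_state_dict(..., strict=False)` so that the remaining parameters are left untouched.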

Jun 21 '24 14:06 bluestyle97

Alright, that makes sense, thank you! I have tried training instant-nerf. However, the first objects rendered are cubes, so I was wondering whether the default code starts from pretrained weights for the LRM?

Jun 21 '24 18:06 Kev1MSL

@Kev1MSL According to the paper, training is initialized from OpenLRM weights. @bluestyle97 How many steps did you train instant-nerf for on your training data?

Jul 15 '24 19:07 Sri-vatsa

@Kev1MSL Hello, could you share your dataset structure and training configuration file? That would help a lot.

Jul 16 '24 01:07 Mrguanglei

@Sri-vatsa Hello, could you share the structure of your dataset and the configuration file you used to train the nerf model? I have run into a problem here that I cannot solve, and I would appreciate your help.

Jul 16 '24 01:07 Mrguanglei

> Alright, that makes sense, thank you! I have tried training instant-nerf. However, the first objects rendered are cubes, so I was wondering whether the default code starts from pretrained weights for the LRM?

Hi, based on your graph, it looks like your training set contains only 3.5k instances. Is that right? The paper reports 270k instances, so do you know how to curate the dataset correctly for training instant-nerf? Thanks a lot!
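To put the dataset sizes in perspective, here is a small back-of-the-envelope sketch of how many optimizer updates one epoch yields under DDP with gradient accumulation. The GPU count, per-GPU batch size, and accumulation factor are made-up values, not the ones used in this thread or in the paper. The sketch also assumes that the sampler pads so every rank gets the same number of samples, that the last partial batch is kept, and that a final partial accumulation window still triggers an update.

```python
import math


def optimizer_steps_per_epoch(
    num_samples: int, num_gpus: int, per_gpu_batch: int, accumulate: int
) -> int:
    """Estimate optimizer updates per epoch under DDP with gradient accumulation."""
    # Each rank sees an equal share of the dataset (rounded up, assumed padding).
    samples_per_rank = math.ceil(num_samples / num_gpus)
    # Batches each rank iterates over in one epoch (last partial batch kept).
    batches_per_rank = math.ceil(samples_per_rank / per_gpu_batch)
    # One optimizer update per `accumulate` batches (last partial window counted).
    return math.ceil(batches_per_rank / accumulate)


# Hypothetical hardware and batch settings, for illustration only.
settings = dict(num_gpus=8, per_gpu_batch=2, accumulate=4)

small = optimizer_steps_per_epoch(3_500, **settings)
large = optimizer_steps_per_epoch(270_000, **settings)

print("updates per epoch with 3.5k instances:", small)
print("updates after 3 epochs with 3.5k instances:", 3 * small)
print("updates per epoch with 270k instances:", large)
```

Under these assumed settings, three epochs on 3.5k instances amount to far fewer updates than a single epoch on 270k instances, so counting optimizer steps is usually more informative than counting epochs when comparing runs.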

Aug 09 '24 06:08 HaFred

> You need to train instant-nerf first. Mesh-based rendering only provides gradients in the near-surface region, which makes it hard for the network to converge. Our mesh model is finetuned from the nerf model.

Does fine-tuning InstantMesh require first fine-tuning the NeRF model, even if my goal is solely to improve the performance of the mesh model?

Mar 27 '25 03:03 Jinyiyi3