Number of epochs before seeing a decrease in the loss

Open Kev1MSL opened this issue 1 year ago • 7 comments

Hi, I am currently training the instant-mesh model, but after 3 epochs the loss has not decreased and I see no sign of convergence. I am using the same config as you set (DDP strategy), and I also added grad_accumulation. I am not sure whether I am doing something wrong or whether I should be more patient and wait for more epochs. I am also not sure whether I should train instant-nerf first and then instant-mesh.

Here are some plots, in case they help: [image] [image]

Thanks for your help!

Kev1MSL, Jun 21 '24 12:06

You need to train instant-nerf first. Mesh-based rendering only provides gradients in the near-surface region, which makes it hard for the network to converge. Our mesh model is fine-tuned from the NeRF model.

bluestyle97, Jun 21 '24 14:06
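
To make the two-stage recipe in the reply above concrete, here is a minimal PyTorch sketch of initializing a mesh-stage model from NeRF-stage weights. It is not code from the InstantMesh repository: `TinyNerfModel`, `TinyMeshModel`, `init_from_checkpoint`, and the layer names are hypothetical stand-ins, and the assumption is only that the two stages share some layers while each has its own output head.

```python
import torch
import torch.nn as nn


class TinyNerfModel(nn.Module):
    """Hypothetical stand-in for the NeRF-stage model."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 16)       # layers shared by both stages
        self.density_head = nn.Linear(16, 1)  # NeRF-only head


class TinyMeshModel(nn.Module):
    """Hypothetical stand-in for the mesh-stage model."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 16)   # same shape as the NeRF encoder
        self.sdf_head = nn.Linear(16, 1)  # mesh-only head, trained from scratch


def init_from_checkpoint(model, state_dict):
    """Copy every tensor whose name and shape match; report the rest."""
    own = model.state_dict()
    compatible = {
        name: tensor
        for name, tensor in state_dict.items()
        if name in own and tensor.shape == own[name].shape
    }
    # strict=False lets the model keep its own values for keys not supplied.
    result = model.load_state_dict(compatible, strict=False)
    skipped = sorted(set(state_dict) - set(compatible))
    return sorted(result.missing_keys), skipped


nerf = TinyNerfModel()  # in practice: weights loaded from a trained NeRF checkpoint
mesh = TinyMeshModel()
missing, skipped = init_from_checkpoint(mesh, nerf.state_dict())
print("kept at random init:", missing)
print("not used from the NeRF checkpoint:", skipped)
```

Printing the two lists is a cheap check that the shared layers really were loaded before mesh training starts; if every key shows up as missing, the fine-tuning run is effectively starting from scratch.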

Alright, makes sense, thank you! I have tried training instant-nerf. However, the first objects rendered are cubes, so I was wondering whether the default code starts from pretrained weights for the LRM?

Kev1MSL, Jun 21 '24 18:06

@Kev1MSL According to the paper, training is initialized from OpenLRM weights. @bluestyle97 For how many training steps did you train instant-nerf on your training data?

Sri-vatsa, Jul 15 '24 19:07

@Kev1MSL Hello, could you please take a look at your dataset structure and training configuration file? That should help you.

Mrguanglei, Jul 16 '24 01:07

@Sri-vatsa Hello, could you take a look at the structure of your dataset and the configuration file used to train nerf? I have run into a problem with this that I cannot solve, and I need your help.

Mrguanglei, Jul 16 '24 01:07

> Alright, makes sense, thank you! I have tried training instant-nerf. However, the first objects rendered are cubes, so I was wondering whether the default code starts from pretrained weights for the LRM?

Hi, based on your graph, it seems that your training set contains only 3.5k instances. Is that true? The paper says 270k instances, so do you know how to curate the dataset correctly for training instant-nerf? Thanks a lot!

HaFred, Aug 09 '24 06:08
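
The dataset-size question above matters because an epoch over a small dataset contains very few optimizer updates, especially with DDP and gradient accumulation. The following is a rough sketch of that arithmetic. Only the 3.5k and 270k figures come from the thread; the per-GPU batch size, GPU count, and accumulation factor are assumed values for illustration, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_samples, per_gpu_batch, num_gpus, accumulate_steps, epochs):
    """Approximate number of optimizer updates under DDP with gradient accumulation."""
    # Each update consumes one micro-batch per GPU, repeated accumulate_steps times.
    samples_per_step = per_gpu_batch * num_gpus * accumulate_steps
    steps_per_epoch = math.ceil(num_samples / samples_per_step)
    return steps_per_epoch * epochs


# Assumed setup: 8 GPUs, 4 samples per GPU, accumulation over 2 micro-batches, 3 epochs.
small = optimizer_steps(3_500, per_gpu_batch=4, num_gpus=8, accumulate_steps=2, epochs=3)
large = optimizer_steps(270_000, per_gpu_batch=4, num_gpus=8, accumulate_steps=2, epochs=3)
print(small)  # 165
print(large)  # 12657
```

Under these assumed settings, 3 epochs on 3.5k instances amount to well under 200 updates, so comparing runs by optimizer steps rather than by epochs gives a fairer picture of how far training has progressed.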

> You need to train instant-nerf first. Mesh-based rendering only provides gradients in the near-surface region, which makes it hard for the network to converge. Our mesh model is fine-tuned from the NeRF model.

Does fine-tuning InstantMesh require first fine-tuning the NeRF model, even if my goal is solely to improve the performance of the mesh model?

Jinyiyi3, Mar 27 '25 03:03