Model Degradation During Fine-Tuning on Mesh Representation with Depth and Normal Maps
I'm experiencing a degradation in model performance when fine-tuning with the mesh representation (depth and normal maps). The model initially works well when fine-tuned on the NeRF representation, but it begins to degrade after a few steps of mesh-based training. This affects the depth maps, normal maps, and the overall 3D reconstruction quality, as seen in the attached training logs.
It seems like the losses for the geometry-related outputs (depth, normals) might not be contributing adequately, or the SDF regularization is too aggressive, causing the geometry to collapse. I tested removing loss_reg and the degradation disappeared, so the problem appears to be in the calculation of the FlexiCubes regularization loss.
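To clarify what I mean by removing loss_reg, here is a rough sketch of how the mesh-stage loss could be assembled and where the regularization term enters. The function and key names (`compute_mesh_losses`, `flexicubes_reg`, the `lambda_*` weights) are placeholders rather than the actual InstantMesh code; the point is only that the FlexiCubes regularizer is added through a weight that can be reduced or zeroed out:

```python
import torch
import torch.nn.functional as F

def compute_mesh_losses(render_out, target,
                        lambda_depth=0.5, lambda_normal=0.2,
                        lambda_reg=0.01):  # set lambda_reg=0.0 to drop the reg term
    """Combine rendering losses with the (optional) FlexiCubes regularizer."""
    loss_rgb = F.mse_loss(render_out["rgb"], target["rgb"])
    loss_depth = F.l1_loss(render_out["depth"], target["depth"])
    loss_normal = F.l1_loss(render_out["normal"], target["normal"])

    # FlexiCubes regularizers (SDF sign-change entropy, weight/deformation
    # penalties) are assumed to be returned by the mesh renderer; the key
    # name here is a placeholder.
    loss_reg = render_out.get("flexicubes_reg", torch.tensor(0.0))

    total = (loss_rgb
             + lambda_depth * loss_depth
             + lambda_normal * loss_normal
             + lambda_reg * loss_reg)
    return total, {"rgb": loss_rgb, "depth": loss_depth,
                   "normal": loss_normal, "reg": loss_reg}
```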
Any insights or suggestions would be greatly appreciated!
Hello, I ran into the same problem. Have you solved it?
Hi, how did you generate the normal maps?
Hello, I also encountered the same issue when fine-tuning InstantMesh: the results at step 0 are decent, but the quality degrades as training progresses, eventually leading to reconstruction collapse. I tried freezing a subset of the parameters and adjusting learning rates, but without success. I think your suggestion about dropping the reg loss is a promising direction.
Could you share more details about your approach? For instance:
- Did you perform full fine-tuning, or did you freeze specific components (e.g., encoder, decoder, latent space)?
- How did you modify the regularization loss: by completely removing certain terms, or by adjusting the weight coefficients?
- Were there any other key modifications to the training pipeline (optimizer settings, gradient clipping, etc.)?

Any implementation details or empirical observations would be greatly appreciated!
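To make these questions concrete, below is the kind of modification I have in mind, written as a minimal PyTorch sketch. The module and attribute names (`model.encoder`, `model.reg_weight`) are hypothetical and not the real InstantMesh structure:

```python
import torch

def setup_finetune(model, freeze_encoder=True, lr=4e-5, reg_weight=0.0):
    """Hypothetical fine-tuning setup; attribute names are placeholders."""
    if freeze_encoder:
        # Freeze the image/triplane encoder; train only the remaining modules.
        for p in model.encoder.parameters():
            p.requires_grad = False

    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=lr, weight_decay=0.05)

    # reg_weight=0.0 would correspond to removing the FlexiCubes reg term
    # entirely; a small positive value corresponds to re-weighting it.
    model.reg_weight = reg_weight
    return optimizer

# Per-step gradient clipping (or via gradient_clip_val in a Lightning Trainer):
# torch.nn.utils.clip_grad_norm_(trainable, max_norm=1.0)
```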
