
how to resume training

Open pknmax opened this issue 1 year ago • 1 comments

I have output checkpoints 9000.pt, 12000.pt, and 15000.pt in the directory SuGaR/output/coarse/exp_name/sugarcoarse_3Dgs7000_densityestim02_sdfnorm02/

I am getting errors such as IndexError: tensors used as indices must be long, byte or bool tensors. After modifying the code, how can I resume training from the last checkpoint?

pknmax avatar Feb 16 '24 14:02 pknmax

Hey @pknmax,

Sure, that's possible: we provide several train_*.py and extract_*.py scripts for this. The train.py script runs the full SuGaR pipeline, which is equivalent to running the following scripts one after another:

  1. train_coarse_density.py / train_coarse_sdf.py
  2. extract_mesh.py
  3. train_refined.py (builds the hybrid representation mesh + Gaussians)
  4. extract_refined_mesh_with_texture.py

For example, to resume from the checkpoint 15000.pt of the coarse optimization, skip step (1), the train_coarse_*.py script, and restart from step (2), extract_mesh.py. Please refer to the scripts themselves (and to train.py, which calls the same functions) for details about each script's arguments.
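If it helps, here is a minimal sketch for picking the most recent coarse checkpoint in a directory before passing it to the next script. It is not part of SuGaR: the helper name latest_checkpoint is hypothetical, and it only assumes the checkpoints are named <iteration>.pt as in the question.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str) -> Optional[Path]:
    """Return the '<iteration>.pt' file with the highest iteration, or None.

    Hypothetical helper, not part of the SuGaR code base. It assumes
    checkpoints are named after their iteration count (e.g. 15000.pt).
    """
    pattern = re.compile(r"^(\d+)\.pt$")
    candidates = []
    for path in Path(ckpt_dir).glob("*.pt"):
        match = pattern.match(path.name)
        if match:
            # Store the iteration as an int so that 15000 ranks above 9000.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Pick the entry with the largest iteration number.
    return max(candidates, key=lambda item: item[0])[1]
```

The iteration is compared as an integer because a plain string sort would rank 9000.pt above 15000.pt. The returned path can then be given to extract_mesh.py; check that script for the exact argument name.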

Also, if you need to run some tests, you can use the short refinement time with train_refined.py; the optimization will be much faster and the quality will still be OK.

Anttwo avatar Feb 17 '24 15:02 Anttwo