Training crashes after 7000 iterations
Hi, training gets to 7000 steps, then just outputs `^C` and exits. It doesn't save a point cloud either. I guess it could be OOM.
I also get this, and I'm not sure if it's OOM. It seems to only happen around the time of the iteration-7000 save.
I was even running it in a Docker container and it crashed the host machine.
@nivibilla did you ever figure out what was causing this?
Similar issue: the process showed "Killed" after 7000 iterations for db/drjohnson, but tandt/train works fine.
I had the same error and it was due to OOM. When saving Gaussians there is a spike in CPU RAM usage. You're training with 423 images in Colab so I'm guessing the RAM consumption was already high. When it tried to save Gaussians, consumption must have spiked and caused an OOM.
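If you want to confirm the spike on your own run, here is a quick way to watch resident memory around the save (a sketch only; `psutil` is an extra dependency, not something `train.py` already imports):

```python
import os
import psutil  # third-party: pip install psutil

def log_rss(tag: str) -> None:
    # Print this process's resident set size so a save-time spike is visible.
    rss_gib = psutil.Process(os.getpid()).memory_info().rss / 2**30
    print(f"[{tag}] RSS: {rss_gib:.2f} GiB")

# e.g. call log_rss("before save") / log_rss("after save") around the
# Gaussian-saving call in train.py's training loop.
```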
The quick fix is to not save Gaussians at iteration 7000, which avoids that spike: only save at 30,000 iterations (or whatever your last iteration is) by passing the `--save_iterations 30000` argument to `train.py`. However, the spike when saving at 30,000 may still make it fail.
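For example, a sketch of the invocation (the dataset path is a placeholder, and `-s` is the source-path flag from the repo's README):

```shell
python train.py -s /path/to/your/dataset --save_iterations 30000
```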
The better fix is given in #667. It decreases CPU RAM consumption and prevents this.
I also ran into the same issue. In my case it was not OOM-related, though. I was able to solve the problem by changing the line

```python
elements[:] = list(map(tuple, attributes))
```

in `save_ply` in `scene/gaussian_model.py` to an explicit loop:

```python
for i in range(len(elements)):
    elements[i] = tuple(attributes[i])
```
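For context, here is a self-contained sketch of the behavioral difference between the two variants. The array shapes and field names are illustrative stand-ins, not the repo's exact code; the key point is that the one-liner materializes every row tuple at once, while the loop holds only one at a time:

```python
import numpy as np

# Stand-ins for save_ply's concatenated per-Gaussian attribute matrix
# (`attributes`) and the structured array written to the PLY file (`elements`).
n_points, n_fields = 100_000, 62  # smaller than a real scene
attributes = np.random.rand(n_points, n_fields).astype(np.float32)
dtype_full = [(f"attr_{i}", "f4") for i in range(n_fields)]
elements = np.empty(n_points, dtype=dtype_full)

# Original one-liner: builds a full Python list of row tuples before the
# assignment, briefly holding the whole point cloud twice in CPU RAM.
# elements[:] = list(map(tuple, attributes))

# Explicit loop: converts and assigns one row at a time, so only a single
# tuple is alive at any moment and peak memory stays flat.
for i in range(len(elements)):
    elements[i] = tuple(attributes[i])
```

The loop is slower, but it keeps peak memory flat, which may be why it sidesteps the crash even in setups where plain OOM isn't the obvious cause.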