SpatioTemporalSegmentation
Training time
Hi,
Thanks for sharing the training code. Does the code work with multiple GPUs? If not, how long does it take to train a model that reaches the reported SOTA performance?
Thanks very much
Sorry for the late reply.
I haven't measured the entire training time because the server kicks me out once a job hits the SLURM max wall time. However, each iteration takes about 7.5 seconds on a Titan RTX with batch size 9 and 2cm voxel size, using Mink16UNet34C (a 42-layer network).
We are currently working on making this even faster in the next version of the MinkowskiEngine, which will be released soon.
I posted an entire training log at https://github.com/chrischoy/SpatioTemporalSegmentation/issues/8
In sum, training started 09/11 14:59:47 and ended 09/16 15:34:33 for 60k iterations, with ScanNet v2 validation every 1k iterations. Each validation takes about 7 min, for a total of about 420 min.
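For anyone who wants to turn those numbers into a per-iteration figure, here is a rough sketch of the arithmetic. It only uses the timestamps and counts quoted above; the year is not stated in the thread, so 2019 is an assumption (it does not change the difference), and the variable names are purely illustrative.

```python
from datetime import datetime

# Figures quoted in the reply above.
ITERATIONS = 60_000
VAL_EVERY = 1_000            # validation interval, in iterations
VAL_MINUTES = 7              # approximate duration of one validation pass
SEC_PER_ITER_REPORTED = 7.5  # reported per-iteration time on a Titan RTX

# ASSUMPTION: the year is not given in the thread; any non-leap-affected
# year yields the same difference between these two September timestamps.
start = datetime(2019, 9, 11, 14, 59, 47)
end = datetime(2019, 9, 16, 15, 34, 33)

wall_s = (end - start).total_seconds()                 # total wall-clock time
val_s = (ITERATIONS // VAL_EVERY) * VAL_MINUTES * 60   # time spent validating
train_s = wall_s - val_s                               # time spent training

print(f"wall clock      : {wall_s / 3600:.1f} h")
print(f"validation      : {val_s / 3600:.1f} h")
print(f"training only   : {train_s / 3600:.1f} h")
print(f"avg per iter    : {train_s / ITERATIONS:.2f} s")
print(f"7.5 s/iter x 60k: {SEC_PER_ITER_REPORTED * ITERATIONS / 3600:.1f} h")
```

From the log timestamps this works out to roughly 6.8 s per iteration once validation time is subtracted, which is in the same range as the 7.5 s figure reported above.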
Thanks very much for the kind reply! BTW, is the log for the 5cm voxel size available? Training a model with 2cm voxels seems to take too long.
@chrischoy
Can this code use multiple GPUs?
I added this to main.py:
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    model = torch.nn.DataParallel(model)
model = model.to(torch.device("cuda:0"))
but an error occurs: RuntimeError: Caught RuntimeError in replica 0 on device 0.
Thank you!