spconv
Questions about performance
Amazing! I was using spconv 1.x before, but I have now switched to spconv 2.1, and it is amazing. Training one epoch used to take 3 hours; now it takes only 1.5 hours, and GPU memory usage has dropped by about 40%. But I still have some questions. I wonder if you could help me answer them or give me some suggestions?
This is the data and its data type that are passed into the model. How can I modify it to make training more efficient?
Hi, can you share which model you trained and which profiling method you used to measure the training time? Thanks.
The model I use belongs to a senior student in my lab, and I can't share it with you yet. You can get the training time from the training logs. Isn't it enough to record the time of each epoch?
Actually, I tried to measure the training and inference time of a single sparse convolution layer in several ways, e.g. time.time(), torch.cuda.Event, and the PyTorch profiler, but didn't see any improvement in actual runtime. So may I ask what type of neural network you are using, or its name? I don't need you to share a copy with me.
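For example, the profiler-based measurement looked roughly like this (a minimal sketch, assuming spconv 2.x; the input shape and channel count here are placeholders, not my real config):

```python
import torch
import spconv.pytorch as spconv
from torch.profiler import profile, ProfilerActivity

device = "cuda:0"

# Toy mostly-zero input: from_dense expects channel-last (NHWC) layout.
x_d = torch.zeros((2, 64, 64, 16), device=device)
x_d[0, 0:8, 0:8, :] = 1.0
x = spconv.SparseConvTensor.from_dense(x_d)

layer = spconv.SparseConv2d(16, 16, kernel_size=3, padding=1).to(device)

# Warm up so one-off kernel selection doesn't show up in the profile.
for _ in range(10):
    layer(x)
torch.cuda.synchronize()

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    layer(x)
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```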
Hi, have you solved it? I have the same problem.
I didn't solve my issue; the model I tested was a 2D sparse convolution (spconv.SparseConv2d).
But I have some recommendations for your code:
- First, warm up your GPU before measuring time. For example, run 50 epochs of the dense convolution net first, then run 100 epochs of both the dense and the sparse convolution nets and take the average training time of each (see the sketch after this list).
- Try another timing method, for example torch.cuda.Event, which you can look up on Google or ask ChatGPT about.
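Something like this, as a minimal sketch (assuming spconv 2.x; the layer sizes, input shape, and iteration counts are placeholders I picked for illustration):

```python
import time
import torch
import torch.nn as nn
import spconv.pytorch as spconv

device = "cuda:0"

# Dense vs. sparse layers of matching shape (placeholder sizes).
dense_net = nn.Conv2d(16, 16, kernel_size=3, padding=1).to(device)
sparse_net = spconv.SparseConv2d(16, 16, kernel_size=3, padding=1).to(device)

# Mostly-zero input; from_dense expects channel-last (NHWC) layout.
x_d = torch.zeros((2, 16, 256, 256), device=device)
x_d[0, :, 0:16, 0:16] = 1.0
x_s = spconv.SparseConvTensor.from_dense(x_d.permute(0, 2, 3, 1))

# Warm-up: let cuDNN / spconv select and cache their kernels first.
for _ in range(50):
    dense_net(x_d)
    sparse_net(x_s)
torch.cuda.synchronize()

def avg_ms(fn, iters=100):
    # Synchronize around the loop so asynchronous CUDA work is actually
    # finished before reading the clock, then average over many iterations.
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.time() - t0) / iters * 1000.0

print(f"dense:  {avg_ms(lambda: dense_net(x_d)):.3f} ms")
print(f"sparse: {avg_ms(lambda: sparse_net(x_s)):.3f} ms")
```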
Thanks for your quick reply! Regarding the first point, have you tried this yourself, and does it work?
It doesn't work for me. I have tried every method and technique I know for SparseConv2d, but I didn't test the 3D case. If you make any progress, please share it with me, thanks.
In fact, I printed the running time of some modules during my model's inference and found that they were not much faster than normal convolution. I still don't understand what the problem is.
Hi, I used torch.cuda.Event to test the time and found no problem with the method. Do you think this is the right approach? Did you do it this way before? And why is it not feasible to use the time library?
```python
import torch
import torch.nn as nn
import spconv.pytorch as spconv
from spconv.pytorch import SparseConvTensor

device = 'cuda:0'

# Dense input with a small non-zero patch; from_dense expects NHWC layout.
x_d = torch.zeros((2, 4, 1024, 1024))
x_d[0, 0, 0:16, 0:16] += 1.
x_d = x_d.to(device)
x = SparseConvTensor.from_dense(x_d.permute(0, 2, 3, 1))

# Sparse conv -> BN -> ReLU (BatchNorm1d runs on the sparse tensor's features).
conv_sparse = spconv.SparseConv2d(4, 4, kernel_size=3, stride=2, padding=1, bias=False, dilation=1).to(device)
bn_sparse = nn.BatchNorm1d(4, momentum=0.1).to(device)
conv_bn_relu_sparse = spconv.SparseSequential(conv_sparse, bn_sparse, nn.ReLU(inplace=True)).to(device)

# Equivalent dense conv -> BN -> ReLU for comparison.
conv_norm = nn.Conv2d(4, 4, kernel_size=3, stride=2, padding=1, bias=False, dilation=1).to(device)
bn_norm = nn.BatchNorm2d(4, momentum=0.1).to(device)
conv_bn_relu_norm = nn.Sequential(conv_norm, bn_norm, nn.ReLU(inplace=True)).to(device)

for i in range(10):
    print("round:", i)

    # Time the dense pipeline with CUDA events.
    start_event = torch.cuda.Event(enable_timing=True)
    end_event = torch.cuda.Event(enable_timing=True)
    start_event.record()
    encoder_output1 = conv_bn_relu_norm(x_d)
    end_event.record()
    end_event.synchronize()
    elapsed_time_ms = start_event.elapsed_time(end_event)
    print(f"conv_bn_relu_norm time: {elapsed_time_ms} milliseconds")

    # Time the sparse pipeline the same way.
    start_event = torch.cuda.Event(enable_timing=True)
    end_event = torch.cuda.Event(enable_timing=True)
    start_event.record()
    encoder_output = conv_bn_relu_sparse(x)
    end_event.record()
    end_event.synchronize()
    elapsed_time_ms = start_event.elapsed_time(end_event)
    print(f"conv_bn_relu_sparse time: {elapsed_time_ms} milliseconds")
```