spconv
spconv could not take advantage of GPU acceleration
My test code is as follows:
import torch
import torch.nn as nn
import time
import spconv.pytorch as spconv
x = torch.zeros(64, 16, 124, 124, dtype=torch.float16).cuda()
for i in range(10):
    x[0, 0, i, 0] = 1
x1 = x.to(dtype=torch.float32)
cv1 = nn.Conv2d(16, 16, 3, 1, 1).half().cuda()  # dense FP16 baseline
cv2 = nn.Conv2d(16, 16, 3, 1, 1).cuda()         # dense FP32 baseline
cv3 = spconv.SubMConv2d(16, 16, 3, 1, padding=1, indice_key="asd", algo=spconv.ConvAlgo.Native).half().cuda()  # sparse FP16
cv4 = spconv.SubMConv2d(16, 16, 3, 1, padding=1, indice_key="asd", algo=spconv.ConvAlgo.Native).cuda()         # sparse FP32
s = x.permute(0, 2, 3, 1)   # from_dense expects channel-last (NHWC) layout
s = spconv.SparseConvTensor.from_dense(s)
s1 = x1.permute(0, 2, 3, 1)
s1 = spconv.SparseConvTensor.from_dense(s1)
for i in range(10):
    a = time.time()
    y1 = cv3(s)   # sparse FP16
    b = time.time()
    print(b - a)
for i in range(10):
    a = time.time()
    y1 = cv4(s1)  # sparse FP32
    b = time.time()
    print(b - a)
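(As a side note: a quick sanity check, not part of the timed run, confirms the input really is sparse — only 10 of the 64·16·124·124 entries are non-zero, so the sparse tensor should carry one feature row per active site:)

# Sanity check (my addition, not in the original timed run).
print(s.features.shape)  # should be torch.Size([10, 16])
print(s.indices.shape)   # should be torch.Size([10, 3]) -> (batch, y, x)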
# On a relatively weak GPU
# dense convolution (nn.Conv2d)
# FP16:
# iter 1:    0.022843360900878906
# last iter: 7.104873657226562e-05
# FP32:
# iter 1:    0.0012712478637695312
# last iter: 8.654594421386719e-05
# spconv
# FP16:
# iter 1:    0.07234716415405273
# last iter: 0.0004432201385498047
# FP32:
# iter 1:    0.0012712478637695312
# last iter: 0.00042891502380371094
However, when I ran the same test on the CPU, the results show that the GPU provides no acceleration:
# Running on the CPU
# spconv
# FP16:
# iter 1:    0.08111023902893066
# last iter: 0.00044608116149902344
# FP32:
# iter 1:    0.0016925334930419922
# last iter: 0.00042366981506347656
Did I make a mistake in how I ran the test? I sincerely hope to get an answer. Thank you for your help.
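One thing I am not sure about: CUDA kernels launch asynchronously, so wrapping a GPU call in time.time() may measure only the launch overhead rather than the kernel itself. A variant of the timing loop with explicit synchronization would look roughly like this (a sketch continuing the script above; the synchronize calls are my addition and I have not re-measured with them):

torch.cuda.synchronize()          # make sure prior GPU work has finished
for i in range(10):
    a = time.time()
    y1 = cv3(s)
    torch.cuda.synchronize()      # wait for the sparse conv kernel to complete
    b = time.time()
    print(b - a)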
Supplementary notes:
- The above results were obtained on a laptop, but the same behavior also occurs with an A6000 GPU.
- I tried changing algo (see the sketch below), but the problem still occurs with the default algo.
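For reference, a sketch of sweeping the available algorithms (enum names assume spconv 2.x; the distinct indice_key values are my addition so cached index pairs are not reused across algorithms):

# Sweep the convolution algorithms exposed by spconv 2.x (assumed enum names).
for algo in (spconv.ConvAlgo.Native,
             spconv.ConvAlgo.MaskImplicitGemm,
             spconv.ConvAlgo.MaskSplitImplicitGemm):
    conv = spconv.SubMConv2d(16, 16, 3, 1, padding=1,
                             indice_key=str(algo), algo=algo).cuda()
    torch.cuda.synchronize()
    a = time.time()
    y = conv(s1)                  # FP32 sparse tensor from the script above
    torch.cuda.synchronize()
    print(algo, time.time() - a)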