pytorch-cpp
test speed
Have you tested the speed? I get a lower speed (30 ms/img) with ResNet-18 at 224x224, batch size 1.
auto output_tensor = CPU(kByte).tensorFromBlob(data, {output_height, output_width, 3});
takes an abnormally long time.
Sorry for the late reply
@jjn037 This piece of code is slow because it transfers the data from the GPU to the CPU. That is usually an expensive operation and should be slow in the original PyTorch too.
It would be helpful if you could compare the timing of the C++ line with the PyTorch equivalent:
output.cpu()
and see whether there is a significant difference in runtime.
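For the C++ side of that comparison, something like the sketch below could be used. It relies only on the standard library; `average_ms` is a hypothetical helper name, not part of pytorch-cpp, and the warm-up and iteration counts are whatever the caller picks. One caveat: CUDA kernels are normally launched asynchronously, so the first call that needs the result on the CPU may also wait for the pending forward pass, and part of the time measured for the copy may really belong to earlier GPU work.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (an assumption for illustration, not a pytorch-cpp API):
// returns the average wall-clock time of one call to fn, in milliseconds.
double average_ms(const std::function<void()>& fn, int warmup, int iters) {
    assert(iters > 0);

    // Warm-up calls are executed but not timed, so one-off setup costs
    // do not distort the average.
    for (int i = 0; i < warmup; ++i) {
        fn();
    }

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}
```

Wrapping the forward pass and the transfer line in separate lambdas and timing each with the same iteration count should show which of the two accounts for the 30 ms.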
FYI, I have just added a file with a speed benchmark: https://github.com/warmspringwinds/pytorch-cpp/blob/master/examples/resnet_18_8s_benchmark.cpp