libdeepvac
The examples include CUDA headers, so the build fails in a CPU-only environment
As the title says: on a machine without CUDA installed, I set USE_CUDA=OFF and BUILD_ALL_EXAMPLES=OFF, but the examples are still built anyway, and the build fails with a missing-header error. Specifically:
In file included from /root/installs/libdeepvac/examples/src/test_resnet_benchmark.cpp:11:
/usr/libtorch/include/c10/cuda/CUDAStream.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
#include <cuda_runtime_api.h>
Shouldn't the decision of which examples to build be based on the state of BUILD_ALL_EXAMPLES and USE_CUDA?
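A minimal sketch of what I mean, for the examples' CMakeLists.txt. The option names USE_CUDA and BUILD_ALL_EXAMPLES and the variable DEEPVAC_LIBTORCH_CPU_LIBRARIES come from above; the CPU example target and the DEEPVAC_LIBTORCH_CUDA_LIBRARIES variable are illustrative assumptions, not necessarily the repo's actual names:

```cmake
# Sketch: gate the CUDA-dependent examples on USE_CUDA so a CPU-only
# build never compiles sources that include c10/cuda/CUDAStream.h.
if(BUILD_ALL_EXAMPLES)
    # CPU-only example (hypothetical name), always safe to build.
    add_executable(test_resnet_cpu src/test_resnet_cpu.cpp)
    target_link_libraries(test_resnet_cpu ${DEEPVAC_LIBTORCH_CPU_LIBRARIES})

    if(USE_CUDA)
        # This example pulls in cuda_runtime_api.h via CUDAStream.h,
        # so it needs the CUDA toolkit and the CUDA libtorch libraries
        # (variable name assumed here).
        add_executable(test_resnet_benchmark src/test_resnet_benchmark.cpp)
        target_link_libraries(test_resnet_benchmark ${DEEPVAC_LIBTORCH_CUDA_LIBRARIES})
    endif()
endif()
```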
Also: after deleting the examples the build succeeds, but later use still produces some undefined reference errors. (I first compile the model-inference code into a .so and then link that .so to run inference, which is where the undefined references show up.) For example:
../../lib/libbit.so: undefined reference to `vtable for torch::autograd::AutogradMeta'
../../lib/libbit.so: undefined reference to `c10::GradMode::is_enabled()'
../../lib/libbit.so: undefined reference to `vtable for torch::jit::Method'
../../lib/libbit.so: undefined reference to `c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, char const*)'
When building the .so, I use my own OpenCV installation; everything else uses the versions that come with deepvac. Since this is CPU-only, I only use:
target_include_directories(bit PUBLIC ${DEEPVAC_LIBTORCH_INCLUDE_DIRS})
target_link_libraries(bit ${DEEPVAC_LIBTORCH_CPU_LIBRARIES} ${OpenCV_LIBS})
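For context, one way I could imagine resolving this: symbols left undefined in a shared library are only resolved when the final executable is linked, so the executable that consumes libbit.so also needs the libtorch CPU libraries on its link line (or `bit` must propagate them via the PUBLIC keyword). A sketch of the consumer side, assuming the same variables as above; the target name bit_demo and its source file are made up for illustration:

```cmake
# Hypothetical consumer CMakeLists.txt: link the final executable against
# both the inference .so and the libtorch CPU libraries, so symbols such as
# c10::GradMode::is_enabled() and the torch::autograd vtables get resolved.
add_executable(bit_demo src/main.cpp)
target_include_directories(bit_demo PUBLIC ${DEEPVAC_LIBTORCH_INCLUDE_DIRS})
target_link_libraries(bit_demo
    bit                                # the inference .so built above
    ${DEEPVAC_LIBTORCH_CPU_LIBRARIES}  # provides libtorch_cpu / c10 symbols
    ${OpenCV_LIBS})
```

Alternatively, declaring the dependencies as PUBLIC on `bit` itself (`target_link_libraries(bit PUBLIC ...)`) would let CMake propagate them automatically to anything that links `bit`.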
Could you take a look when you get a chance? Thanks!
The reason I need deepvac is that I found inference with the pre-built libtorch from https://pytorch.org/cppdocs/installing.html is roughly twice as slow as PyTorch. I suspect it is the backend that libtorch.so was built with. Timing confirmed that the main cost is the model-inference step itself, which is about twice as slow, so I want to build a libtorch against OpenBLAS or MKL by hand to verify this hypothesis.
Sorry, this library is no longer maintained.