TensorRT
How to build from sources on Windows
❓ Question
How shall I edit the WORKSPACE file in order to build tag 0.1.0 from sources on Windows?
What you have already tried
- I successfully completed the build-from-sources process for the Jetson Xavier AGX; see https://github.com/NVIDIA/TRTorch/issues/222
- Based on the material I already had from the Jetson process, I tried to do the same for my Windows machine by editing the WORKSPACE file to match my Windows setup. I changed all the required new_local_repository arguments for cuda, torch, cudnn, and tensorrt to point at my Windows installations.
- Ran the following command: bazel build //:libtrtorch
The following error report was generated:

```
INFO: Repository rules_python instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule git_repository defined at:
  C:/users/General/_bazel_General/zs4npqzu/external/bazel_tools/tools/build_defs/repo/git.bzl:195:18: in <toplevel>
```
Environment
Build information about the TRTorch compiler can be found by turning on debug messages
- PyTorch Version (e.g., 1.0): 1.6
- CPU Architecture: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, 2592 Mhz, 4 Core(s), 8 Logical Processor(s)
- OS (e.g., Linux): Windows
- How you installed PyTorch (conda, pip, libtorch, source): pip3
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version: 3.6.8
- CUDA version: 11.0
- GPU models and configuration: Quadro M2000M
- Any other relevant information: TensorRT 7.2.1, CuDNN 8.0.1
Additional context
I have good experience with TensorRT development on my Windows setup, so I know that everything should be OK from the NVIDIA libraries' point of view.
Windows support is unofficial and pretty flaky right now. I think this particular error occurs because git repositories in Bazel don't work well on Windows. The quick workaround would be to remove rules_python and the repositories that use it (the Python tests) from the WORKSPACE.
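A minimal sketch of that workaround, assuming the stock TRTorch 0.1.0 WORKSPACE layout (the exact block names and load statements are illustrative — verify them against your copy of the file):

```starlark
# WORKSPACE (sketch) -- comment out the Python rules and anything that uses them.

# git_repository(
#     name = "rules_python",
#     remote = "https://github.com/bazelbuild/rules_python.git",
#     ...
# )
#
# load("@rules_python//python:repositories.bzl", "py_repositories")
# py_repositories()
#
# ...likewise for any pip-import blocks that are used only by the Python tests.
```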
Thank you very much @narendasan ,
I edited the WORKSPACE file as you recommended and the Python problems described above disappeared.
I also edited the following:
- `\third_party\cudnn\local\BUILD`:

  Was:
  ```starlark
  ":windows": "bin/cudnn64_7.dll", #Need to configure specific version for windows
  ```
  Is:
  ```starlark
  ":windows": "bin/cudnn64_8.dll", #Need to configure specific version for windows
  ```
- `\third_party\tensorrt\local\BUILD`: as described in PR https://github.com/NVIDIA/TRTorch/pull/190
- `TRTorch-0.1.0.bazelrc`:

  Was:
  ```
  build --cxxopt="-fdiagnostics-color=always"
  build --cxxopt='-std=c++14'
  ```
  Is:
  ```
  build --cxxopt='/diagnostics:caret'
  build --cxxopt='/std:c++14'
  ```
- `TRTorch-0.1.0\WORKSPACE`:
  - Masked all Python rules
  - Masked all http_archive blocks
  - Unmasked all new_local_repository blocks
  - Mapped cudnn and tensorrt to my local installation directories:

```starlark
new_local_repository(
    name = "cudnn",
    path = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0",
    build_file = "@//third_party/cudnn/local:BUILD",
)

new_local_repository(
    name = "tensorrt",
    path = "c:/TensorRT-7.2.1.6",
    build_file = "@//third_party/tensorrt/local:BUILD",
)
```
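For completeness, the cuda repository can be remapped the same way. This is only an editor's sketch: the path shown is the default Windows CUDA installer location, and the build_file label assumes the TRTorch 0.1.0 repo layout — verify both against your machine and checkout.

```starlark
# Illustrative sketch only -- adjust the CUDA version directory to your install.
new_local_repository(
    name = "cuda",
    path = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.0",
    build_file = "@//third_party/cuda:BUILD",
)
```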
Now compilation works, but a new linker error is raised (see below):
INFO: Analyzed target //:libtrtorch (36 packages loaded, 2125 targets configured).
INFO: Found 1 target...
INFO: From Compiling core/util/trt_util.cpp:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.24.28314\include\numeric(35,26): warning C4244: '=': conversion from '_Ty' to '_Ty', possible loss of data
with
[
_Ty=int64_t
]
and
[
_Ty=int
]
_Val = _Reduce_op(_Val, _UFirst);
^
core/util/trt_util.cpp(63): note: see reference to function template instantiation '_Ty std::accumulate<const int32_t,int,std::multiplies<int64_t>>(const _InIt,const _InIt,_Ty,_Fn)' being compiled
with
[
_Ty=int,
_InIt=const int32_t *,
_Fn=std::multiplies<int64_t>
]
return std::accumulate(d.d, d.d + d.nbDims, 1, std::multiplies<int64_t>());
INFO: From Compiling core/conversion/evaluators/aten.cpp:
core/conversion/evaluators/aten.cpp(38,1): warning C4805: '==': unsafe mix of type 'int64_t' and type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(38,1): warning C4805: '==': unsafe mix of type 'double' and type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(38,1): warning C4805: '==': unsafe mix of type 'bool' and type 'int64_t' in operation
);
^
core/conversion/evaluators/aten.cpp(38,1): warning C4805: '==': unsafe mix of type 'bool' and type 'double' in operation
);
^
core/conversion/evaluators/aten.cpp(51,1): warning C4805: '!=': unsafe mix of type 'int64_t' and type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(51,1): warning C4805: '!=': unsafe mix of type 'double' and type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(51,1): warning C4805: '!=': unsafe mix of type 'bool' and type 'int64_t' in operation
);
^
core/conversion/evaluators/aten.cpp(51,1): warning C4805: '!=': unsafe mix of type 'bool' and type 'double' in operation
);
^
core/conversion/evaluators/aten.cpp(64,1): warning C4804: '<': unsafe use of type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(77,1): warning C4804: '>': unsafe use of type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(90,1): warning C4804: '<=': unsafe use of type 'bool' in operation
);
^
core/conversion/evaluators/aten.cpp(103,1): warning C4804: '>=': unsafe use of type 'bool' in operation
);
^
INFO: From Compiling core/conversion/converters/impl/shuffle.cpp:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.24.28314\include\xutility(3330,22): warning C4244: '=': conversion from '__int64' to '_Ty', possible loss of data
with
[
_Ty=int32_t
]
_Dest = _First;
^
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.24.28314\include\xutility(3367): note: see reference to function template instantiation '_OutIt std::_Copy_unchecked<__int64,_Ty>(_InIt,_InIt,_OutIt)' being compiled
with
[
_OutIt=int32_t ,
_Ty=int32_t,
_InIt=__int64 *
]
_Seek_wrapped(_Dest, _Copy_unchecked(_UFirst, _ULast, _UDest));
core/conversion/converters/impl/shuffle.cpp(86): note: see reference to function template instantiation '_OutIt std::copy<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<_Ty>>>,int32_t>(_InIt,_InIt,_OutIt)' being compiled
with
[
_OutIt=int32_t *,
_Ty=int64_t,
_InIt=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int64_t>>>
]
std::copy(new_order.begin(), new_order.end(), permute.order);
C:\users\User_bazel_User\zs4npqzu\execroot\TRTorch\bazel-out\x64_windows-fastbuild\bin\external\libtorch_virtual_includes\c10_cuda\c10/core/MemoryFormat.h(56): note: see reference to class template instantiation 'c10::ArrayRef<int64_t>' being compiled
inline std::vector<int64_t> get_channels_last_strides_2d(IntArrayRef sizes) {
INFO: From Compiling core/conversion/converters/impl/plugins/interpolate_plugin.cpp:
cl : Command line warning D9002 : ignoring unknown option '-pthread'
ERROR: C:/users/User/downloads/trtorch-0.1.0/cpp/api/lib/BUILD:13:1: Linking of rule '//cpp/api/lib:trtorch.dll' failed (Exit 1120)
LINK : warning LNK4044: unrecognized option '/lpthread'; ignored
LINK : warning LNK4044: unrecognized option '/Wl,-rpath,lib/'; ignored
Creating library bazel-out/x64_windows-fastbuild/bin/cpp/api/lib/trtorch.dll.if.lib and object bazel-out/x64_windows-fastbuild/bin/cpp/api/lib/trtorch.dll.if.exp
plugins.lo.lib(interpolate_plugin.obj) : error LNK2019: unresolved external symbol getPluginRegistry referenced in function "public: __cdecl nvinfer1::PluginRegistrar<class trtorch::core::conversion::converters::impl::plugins::InterpolatePluginCreator>::PluginRegistrar<class trtorch::core::conversion::converters::impl::plugins::InterpolatePluginCreator>(void)" (??0?$PluginRegistrar@VInterpolatePluginCreator@plugins@impl@converters@conversion@core@trtorch@@@nvinfer1@@QEAA@XZ)
runtime.lo.lib(TRTEngine.obj) : error LNK2019: unresolved external symbol createInferRuntime_INTERNAL referenced in function "class nvinfer1::IRuntime * __cdecl nvinfer1::anonymous namespace'::createInferRuntime(class nvinfer1::ILogger &)" (?createInferRuntime@?A0x4cf47092@nvinfer1@@YAPEAVIRuntime@2@AEAVILogger@2@@Z) conversionctx.lib(ConversionCtx.obj) : error LNK2019: unresolved external symbol createInferBuilder_INTERNAL referenced in function "class nvinfer1::IBuilder * __cdecl nvinfer1::
anonymous namespace'::createInferBuilder(class nvinfer1::ILogger &)" (?createInferBuilder@?A0xa232b0d4@nvinfer1@@YAPEAVIBuilder@2@AEAVILogger@2@@Z)
bazel-out\x64_windows-fastbuild\bin\cpp\api\lib\trtorch.dll : fatal error LNK1120: 3 unresolved externals
Target //:libtrtorch failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 77.200s, Critical Path: 21.14s
INFO: 86 processes: 86 local.
FAILED: Build did NOT complete successfully
Please advise.
It seems like TensorRT symbols are missing at link time. Do you know if you are using the TRT DLLs or trying to statically link?
Hello @narendasan, how can I know whether the build process is looking for the TensorRT DLLs or the static libs?
What I did was change the file `\third_party\tensorrt\local\BUILD` according to PR https://github.com/NVIDIA/TRTorch/pull/190, as follows.

Was:

```starlark
cc_import(
    name = "nvinfer_lib",
    shared_library = select({
        ":aarch64_linux": "lib/aarch64-linux-gnu/libnvinfer.so",
        ":windows": "lib/nvinfer.dll",
        "//conditions:default": "lib/x86_64-linux-gnu/libnvinfer.so",
    }),
    static_library = select({
        ":aarch64_linux": "lib/aarch64-linux-gnu/libnvinfer_static.a",
        ":windows": "lib/nvinfer.lib",
        "//conditions:default": "lib/x86_64-linux-gnu/libnvinfer_static.a",
    }),
    visibility = ["//visibility:private"],
)
```

Is:

```starlark
cc_import(
    name = "nvinfer_static_lib",
    static_library = select({
        ":aarch64_linux": "lib/aarch64-linux-gnu/libnvinfer_static.a",
        ":windows": "lib/nvinfer.lib",
        "//conditions:default": "lib/x86_64-linux-gnu/libnvinfer_static.a",
    }),
    visibility = ["//visibility:private"],
)

cc_import(
    name = "nvinfer_lib",
    shared_library = select({
        ":aarch64_linux": "lib/aarch64-linux-gnu/libnvinfer.so",
        ":windows": "lib/nvinfer.dll",
        "//conditions:default": "lib/x86_64-linux-gnu/libnvinfer.so",
    }),
    visibility = ["//visibility:private"],
)
```

I also changed the WORKSPACE file as follows.

Was:

```starlark
new_local_repository(
    name = "tensorrt",
    path = "/usr/",
    build_file = "@//third_party/tensorrt/local:BUILD",
)
```

Is:

```starlark
new_local_repository(
    name = "tensorrt",
    path = "c:/TensorRT-7.2.1.6",
    build_file = "@//third_party/tensorrt/local:BUILD",
)
```
Here c:/TensorRT-7.2.1.6 is the path that contains all the TensorRT libraries exactly as downloaded from the NVIDIA site, without any changes by me; it includes the lib and include directories. The TensorRT bin directory contains the DLL and lib files.
Did I do something wrong? Should I edit these files differently? Should I also edit any other BUILD file?
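One thing worth checking (an editor's assumption, not something confirmed in this thread): on Windows, Bazel's cc_import links a DLL through its MSVC import library via the `interface_library` attribute; listing nvinfer.dll as `shared_library` alone does not hand the linker the .lib that resolves symbols such as createInferRuntime_INTERNAL. A hedged sketch of the Windows branch only:

```starlark
# Sketch only: pair the DLL with its import library so the MSVC linker
# can resolve the nvinfer symbols. Paths follow the TensorRT zip layout;
# verify where nvinfer.dll and nvinfer.lib live in your install.
cc_import(
    name = "nvinfer_lib",
    interface_library = "lib/nvinfer.lib",  # consumed at link time
    shared_library = "lib/nvinfer.dll",     # loaded at run time
    visibility = ["//visibility:private"],
)
```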
Thanks,
This should be fine to make Bazel use the DLLs. I'm not sure what is going on; it's complaining about missing symbols in TensorRT's libraries.
Hi, could you please check and confirm that my changes are correct? Maybe I edited something wrongly, or missed something? Maybe it was a wrong decision to take the latest version, 7.2.1.6?
I think I'm very close to successfully creating a TRTorch build for Windows; it doesn't make sense that something which works well for Linux x64 cannot also work for Windows x64.
Please advise,
Thanks,
Hello, can I provide you with the entire package that I prepared? Maybe you will be able to use it to complete the build process on your machine, and then understand what is missing on mine?
Thanks,
Something to try is to see if using master helps; we recently separated out the static and dynamic deps. It should ensure that you are using DLLs on Windows, and might make it clearer what is happening.
Hello @narendasan,
Thanks for your response.
I downloaded the master version yesterday. Based on the knowledge I gained while working with my Jetson Xavier AGX and on the getting-started guide https://nvidia.github.io/TRTorch/tutorials/installation.html, I edited all the required files, such as WORKSPACE (based on my understanding).
My setup now is: Windows x64, TRTorch master, CUDA 11.0, cuDNN 8.0, TensorRT 7.2.2.3, PyTorch 1.7.1, Torchvision 0.8.2 (both successfully installed via: pip3 install torch===1.7.1+cu110 torchvision===0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html)
Ran this command: bazel build //:libtrtorch -c opt
Now I'm getting a compilation error (it didn't reach the linker phase described in the previous comments above):
```
INFO: Analyzed target //:libtrtorch (36 packages loaded, 2133 targets configured).
INFO: Found 1 target...
INFO: Deleting stale sandbox base C:/users/XXX/bazel_XXX/vctgn25s/sandbox
ERROR: C:/aag/hpc/services/dleware/source/applications/trtorchtester/data/build/windows/trtorch-master/core/lowering/BUILD:10:1: C++ compilation of rule '//core/lowering:lowering' failed (Exit 2)
core/lowering/lowering.cpp(11,10): fatal error C1083: Cannot open include file: 'torch/csrc/jit/passes/remove_mutation.h': No such file or directory
#include "torch/csrc/jit/passes/remove_mutation.h"
         ^
Target //:libtrtorch failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 21.234s, Critical Path: 14.59s
INFO: 8 processes: 8 local.
FAILED: Build did NOT complete successfully
```
I checked and found the reported missing file in my torch installation directory, so I suspect the root cause is a wrong configuration of where my torch is installed.
To tell TRTorch where my torch is installed, I made the following changes inside the WORKSPACE file.

Was:

```starlark
new_local_repository(
    name = "libtorch",
    path = "/usr/local/lib/python3.6/dist-packages/torch",
    build_file = "third_party/libtorch/BUILD",
)

new_local_repository(
    name = "libtorch_pre_cxx11_abi",
    path = "/usr/local/lib/python3.6/dist-packages/torch",
    build_file = "third_party/libtorch/BUILD",
)
```

Is:

```starlark
new_local_repository(
    name = "libtorch",
    path = "C:/Users/XXX/Tensorflow23/Lib/site-packages/torch",
    build_file = "third_party/libtorch/BUILD",
)

new_local_repository(
    name = "libtorch_pre_cxx11_abi",
    path = "C:/Users/XXX/Tensorflow23/Lib/site-packages/torch",
    build_file = "third_party/libtorch/BUILD",
)
```
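An aside from the editor, not confirmed in the thread: the pre-cxx11 ABI split is a Linux/GCC libstdc++ concern, so on Windows/MSVC pointing both repository names at the same pip wheel is reasonable. A sketch with the rationale as comments (the path shown is the author's; check that the wheel actually ships the header the compiler reported missing):

```starlark
# Editor's sketch: both libtorch names may point at the same wheel on
# Windows, since the pre-cxx11 ABI variant only exists for GCC on Linux.
# The reported header should then resolve under this path as
#   include/torch/csrc/jit/passes/remove_mutation.h
# -- if it is absent, the installed wheel likely predates that header
# or the path is wrong.
new_local_repository(
    name = "libtorch",
    path = "C:/Users/XXX/Tensorflow23/Lib/site-packages/torch",  # author's path
    build_file = "third_party/libtorch/BUILD",
)
```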
Is this OK? Should I change something else? Should I change another file as well?
Thanks,
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
Did you find a solution to this problem?
Unfortunately, no. For now I have left this issue in the hope that by the time I need it again it will already be solved. I would prefer to find pre-built binaries for the Windows platform, as are provided for x64 Linux, or to find that someone has solved the build-from-sources path for Windows and learn from their process. Good luck.
This issue could help: https://github.com/NVIDIA/Torch-TensorRT/issues/690
Wondering if you've resolved this yet? I am having the same issue, trying to use Torch-TensorRT on Windows and having trouble building it myself.
I am going to close this issue because we now have a community-maintained CMake build system for building the libraries. Open a new issue if it is still a problem.