onnxruntime
Random failures of a TensorRT test in Windows GPU TensorRT CI
Describe the issue
A test, TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference, frequently fails in Windows GPU TensorRT CI. Judging from its name, it is a multi-threading test, so the failures may indicate a threading bug in the TensorRT execution provider. Could someone familiar with TensorRT take a look? Below are two different errors produced by this test.
2022-09-20T05:14:17.1291863Z 1: [ RUN ] TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference
2022-09-20T05:14:21.9092384Z 1: 2022-09-20 05:14:21.9100710 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 onnxruntime::TensorrtLogger::log] [2022-09-20 05:14:21 WARNING] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
2022-09-20T05:14:21.9102071Z 1: 2022-09-20 05:14:21.9112506 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 onnxruntime::TensorrtLogger::log] [2022-09-20 05:14:21 WARNING] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
2022-09-20T05:14:21.9391947Z 1: 2022-09-20 05:14:21.9401565 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::log] [2022-09-20 05:14:21 ERROR] 1: [stdArchiveReader.cpp::nvinfer1::rt::StdArchiveReader::StdArchiveReader::54] Error Code 1: Serialization (Serialization assertion sizeRead == static_cast<uint64_t>(mEnd - mCurrent) failed.Size specified in header does not match archive size)
2022-09-20T05:14:21.9396196Z 1: 2022-09-20 05:14:21.9407796 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::log] [2022-09-20 05:14:21 ERROR] 4: [runtime.cpp::nvinfer1::Runtime::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
2022-09-20T05:14:21.9398995Z 1: D:\a\_work\1\s\onnxruntime\test\providers\tensorrt\tensorrt_basic_test.cc(161): error: Value of: status.IsOK()
2022-09-20T05:14:21.9399681Z 1: Actual: false
2022-09-20T05:14:21.9400073Z 1: Expected: true
2022-09-20T05:14:21.9412174Z 1: [ FAILED ] TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference (4812 ms)
2022-09-19T19:08:39.2238214Z 1: [ RUN ] TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference
2022-09-19T19:08:44.6950897Z 1: 2022-09-19 19:08:44.6947546 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 onnxruntime::TensorrtLogger::log] [2022-09-19 19:08:44 WARNING] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
2022-09-19T19:08:44.6962348Z 1: 2022-09-19 19:08:44.6959960 [W:onnxruntime:Default, tensorrt_execution_provider.h:60 onnxruntime::TensorrtLogger::log] [2022-09-19 19:08:44 WARNING] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
2022-09-19T19:08:44.6970013Z 1: 2022-09-19 19:08:44.6968018 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::log] [2022-09-19 19:08:44 ERROR] 3: Cannot deserialize with an empty memory buffer.
2022-09-19T19:08:44.6972776Z 1: 2022-09-19 19:08:44.6968040 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::log] [2022-09-19 19:08:44 ERROR] 3: Cannot deserialize with an empty memory buffer.
2022-09-19T19:08:44.7227686Z 1: 2022-09-19 19:08:44.7225590 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::log] [2022-09-19 19:08:44 ERROR] 4: [runtime.cpp::nvinfer1::Runtime::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
2022-09-19T19:08:44.7229367Z 1: 2022-09-19 19:08:44.7225599 [E:onnxruntime:Default, tensorrt_execution_provider.h:58 onnxruntime::TensorrtLogger::lD:\a\_work\1\s\onnxruntime\test\providers\tensorrt\tensorrt_basic_test.cc(161): error: Value of: status.IsOK()
2022-09-19T19:08:44.7230326Z 1: Actual: false
2022-09-19T19:08:44.7230731Z 1: Expected: true
2022-09-19T19:08:44.7231412Z 1: og] [2022-09-19 19:08:44 ERROR] 4: [runtime.cpp::nvinfer1::Runtime::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
2022-09-19T19:08:44.7236549Z 1: D:\a\_work\1\s\onnxruntime\test\providers\tensorrt\tensorrt_basic_test.cc(161): error: Value of: status.IsOK()
2022-09-19T19:08:44.7237280Z 1: Actual: false
2022-09-19T19:08:44.7237699Z 1: Expected: true
2022-09-19T19:08:44.7249044Z 1: [ FAILED ] TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference (5501 ms)
To reproduce
The failure is not reliably reproducible, but you can find various error messages from this single test in the Windows GPU TensorRT CI logs.
Urgency
Not urgent. It's just a CI test.
Platform
Windows
OS Version
Microsoft Windows Server 2019
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
The master branch should be enough to reproduce.
ONNX Runtime API
C++
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
Unknown; it is whatever is installed on our CI machine.