Leo Dong

22 comments by Leo Dong

Just to clarify: the latent codes are exactly the same in both cases.

Closed as duplicate of: https://github.com/NVIDIA/TensorRT-Model-Optimizer/issues/37

@AmazDeng I don't think you showed how you hook the `TensorRTModel` up to your Python multiprocessing pool. The issue is that a single TensorRT engine is not supposed to be held...

> @LeoZDong
> Does TensorRT support multithreaded inference? Note that it's multithreading, not multiprocessing.

Yes. Look at the `execute_async_v3` method (https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/ExecutionContext.html#tensorrt.IExecutionContext.execute_async_v3). You will need to pass in a CUDA...
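A minimal sketch of that threading pattern, with stand-in classes in place of the real `tensorrt` and CUDA objects (so it runs without a GPU): the engine is shared read-only, while each thread creates its own execution context and would use its own CUDA stream. The `FakeEngine`/`FakeContext` names and the integer stream handle are placeholders for illustration, not TensorRT API.

```python
import threading

# Stand-ins for the real TensorRT / CUDA objects so the pattern is
# visible without a GPU. In real code these would be a deserialized
# trt.ICudaEngine, per-thread trt.IExecutionContext objects created via
# engine.create_execution_context(), and one CUDA stream per thread.
class FakeEngine:
    def create_execution_context(self):
        return FakeContext()

class FakeContext:
    def execute_async_v3(self, stream_handle):
        # Real call: context.execute_async_v3(stream_handle=stream.handle)
        return True

results = []
results_lock = threading.Lock()

def worker(engine, n_iters):
    # One execution context (and, in real code, one CUDA stream) per
    # thread; only the engine itself is shared across threads.
    context = engine.create_execution_context()
    stream_handle = 0  # placeholder for a real CUDA stream handle
    for _ in range(n_iters):
        ok = context.execute_async_v3(stream_handle)
        with results_lock:
            results.append(ok)

engine = FakeEngine()
threads = [threading.Thread(target=worker, args=(engine, 4)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # 3 threads x 4 iterations = 12 completed inferences
```

The key point is that the contexts and streams are per-thread, never shared; enqueueing work on the same context from two threads concurrently is what breaks.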

It seems your converted ONNX graph has an issue in a reshape node. Could you attach the ONNX model file?

@xiaowuhu Put up a fix here (#310). Mind assigning some reviewers when you get a chance? Thanks!