tensorrt_inference
Using cudaMemcpyAsync directly rather than context->enqueueV2
Many thanks for this great repo! It's amazing and useful work that I'm learning from quite a bit.
I wanted to check whether there's an optimization reason why you chose to call cudaMemcpyAsync directly in mode.cpp rather than context->enqueueV2 as written in the documentation.
I'm still relatively new to deploying models in C++. Was this an optimization choice, or just personal coding style?
If I understand correctly, enqueueV2 is just a wrapper around the CUDA memcpy calls, so wouldn't using enqueueV2 or executeV2 be more maintainable in the long term? As TensorRT changes, the implementation behind those methods could change while the signatures stay the same.
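For context, this is roughly the pattern I had in mind from the documentation: a minimal sketch, assuming the TensorRT 7/8 bindings-style API, where `deviceInput`, `deviceOutput`, and the buffer sizes are placeholders of my own, not names from this repo.

```cpp
// Minimal sketch (not this repo's code) of how cudaMemcpyAsync and
// enqueueV2 are typically combined. Buffers are assumed to be
// pre-allocated with cudaMalloc; names here are placeholders.
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

bool infer(nvinfer1::IExecutionContext* context,
           const std::vector<float>& hostInput,
           std::vector<float>& hostOutput,
           void* deviceInput, void* deviceOutput,
           cudaStream_t stream)
{
    // Copy the input tensor host -> device, asynchronously on `stream`.
    cudaMemcpyAsync(deviceInput, hostInput.data(),
                    hostInput.size() * sizeof(float),
                    cudaMemcpyHostToDevice, stream);

    // Enqueue inference on the same stream, passing the device pointers
    // in binding-index order.
    void* bindings[] = { deviceInput, deviceOutput };
    if (!context->enqueueV2(bindings, stream, nullptr))
        return false;

    // Copy the output tensor device -> host on the same stream.
    cudaMemcpyAsync(hostOutput.data(), deviceOutput,
                    hostOutput.size() * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);

    // Wait for the copies and the inference to finish.
    cudaStreamSynchronize(stream);
    return true;
}
```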