CUDA reports errors during multithreaded inference
We currently have scenarios where multiple threads run inference at the same time. Is the CUDA inference logic not thread-safe? Is there any solution for this?
The error message is below:
File "/tt/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tt/lib/python3.11/site-packages/torch/nn/modules/activation.py", line 396, in forward
return F.silu(input, inplace=self.inplace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tt/lib/python3.11/site-packages/torch/nn/functional.py", line 2058, in silu
return torch._C.nn.silu(input)
^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: misaligned address
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
While testing several times, the following error also appeared earlier:
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions
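One common workaround, since sharing a single model (and its CUDA context) across threads without synchronization can produce exactly these misaligned-address / illegal-memory-access errors, is to serialize all calls into the model with a lock. Below is a minimal sketch of that pattern; `model_forward` is a hypothetical stand-in for your model's actual forward pass, and `safe_infer` is an assumed wrapper name, not part of any library:

```python
import threading

# Global lock serializing access to the (not thread-safe) model / CUDA context.
_infer_lock = threading.Lock()

def model_forward(x):
    # Hypothetical stand-in for the real model's forward pass,
    # e.g. `with torch.no_grad(): model(x)` in the actual code.
    return x * 2

def safe_infer(x):
    # Only one thread at a time executes inference.
    with _infer_lock:
        return model_forward(x)

results = []
results_lock = threading.Lock()

def worker(value):
    out = safe_infer(value)
    with results_lock:
        results.append(out)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Serializing with a lock sacrifices GPU parallelism, so alternatives worth considering are giving each thread its own model replica, or funneling all requests through one dedicated inference worker thread fed by a queue.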