RuntimeError: CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
System Info
Driver Version: 550.127.05 CUDA Version: 12.4
Traceback (most recent call last):
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/applications.py", line 112, in __call__
await self.middleware_stack(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
raise exc
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
await self.app(scope, receive, _send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
await self.app(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
await self.middleware_stack(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
await route.handle(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
await self.app(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
response = await f(request)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
File "/home/work/data/CogAgent/app/openai_demo.py", line 188, in create_chat_completion
response = generate_cogagent(model, tokenizer, gen_params)
File "/home/work/data/CogAgent/app/openai_demo.py", line 249, in generate_cogagent
for response in generate_stream_cogagent(model, tokenizer, params):
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
response = gen.send(None)
File "/home/work/data/CogAgent/app/openai_demo.py", line 322, in generate_stream_cogagent
).to(model.device)
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 820, in to
self.data = {
File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 821, in TORCH_USE_CUDA_DSA to enable device-side assertions.
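CUDA kernels run asynchronously, so the Python frame where a device-side assert surfaces (here `.to(model.device)`) is often not the frame that triggered it. A minimal sketch for getting a more accurate traceback, assuming you can edit the top of `openai_demo.py`: set `CUDA_LAUNCH_BLOCKING=1` before any CUDA work starts. The helper name `cuda_debug_enabled` is hypothetical and only there to confirm the setting.

```python
import os

# Force synchronous kernel launches so the error is reported at the call
# that actually failed. This must run before the first CUDA call; placing
# it before `import torch` is the safest option.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def cuda_debug_enabled() -> bool:
    """Hypothetical helper: report whether blocking launches are requested."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"
```

Blocking launches slow inference down noticeably, so this is meant for reproducing the failure once, not for normal serving.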
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Reproduction
Run the app, then send the client task "open browser and visit Baidu homepage". The server returns the error above.
Expected behavior
The server API call should succeed.
Do not use FP16 inference, please use BF16 inference.
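Whether BF16 is an option depends on the GPU. Native BF16 support starts with compute capability 8.0 (Ampere); a V100 reports 7.0. The sketch below shows one way to check this before loading the model. The function names are hypothetical, and falling back to `float16` is only an illustration of what older cards are left with, not a recommendation from this thread.

```python
def supports_bf16(compute_capability):
    """True if the (major, minor) capability has native BF16 support."""
    major, _minor = compute_capability
    return major >= 8  # Ampere (8.0) and newer


def pick_dtype_name(compute_capability):
    """Hypothetical helper: prefer bfloat16, else fall back to float16."""
    return "bfloat16" if supports_bf16(compute_capability) else "float16"


def detect_compute_capability():
    """Return (major, minor) for GPU 0, or None if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


print(pick_dtype_name((7, 0)))  # V100
print(pick_dtype_name((8, 0)))  # A100
```

On a machine with PyTorch and a GPU, `pick_dtype_name(detect_compute_capability())` gives the name for the installed card.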
I have used float16.
Is it not possible to run on a V100?
If you run in FP16, you may hit the error above. It does not happen often, but there is currently no way to avoid it.
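The thread does not say why FP16 fails, so the following is an assumption: a common cause of this kind of assert is a value overflowing FP16's range (largest finite value 65504), which produces `inf`/`nan` downstream. BF16 keeps the 8-bit exponent of FP32, so its range is far wider. The sketch below illustrates only the range difference, using the standard library; `bf16_truncate` is a hypothetical helper that approximates BF16 by truncation, whereas real hardware rounds.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_fp16(x: float) -> bool:
    """True if x can be stored as a finite FP16 value."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


def bf16_truncate(x: float) -> float:
    """Approximate BF16 by keeping the top 16 bits of the float32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


print(fits_fp16(FP16_MAX))     # True
print(fits_fp16(70000.0))      # False: out of FP16 range
print(bf16_truncate(70000.0))  # 69632.0: coarser, but still finite
```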
@zRzRzRzRzRzRzR So can INT8 run on a V100?