RuntimeError: CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Open coding-alt opened this issue 10 months ago • 5 comments

System Info

Driver Version: 550.127.05 CUDA Version: 12.4

Traceback (most recent call last):
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/work/data/CogAgent/app/openai_demo.py", line 188, in create_chat_completion
    response = generate_cogagent(model, tokenizer, gen_params)
  File "/home/work/data/CogAgent/app/openai_demo.py", line 249, in generate_cogagent
    for response in generate_stream_cogagent(model, tokenizer, params):
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
  File "/home/work/data/CogAgent/app/openai_demo.py", line 322, in generate_stream_cogagent
    ).to(model.device)
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 820, in to
    self.data = {
  File "/data/miniconda3/envs/CogAgent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 821, in <dictcomp>
    k: v.to(device=device, non_blocking=non_blocking) if isinstance(v, torch.Tensor) else v
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
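
A note on reading this traceback: CUDA kernels run asynchronously, so a device-side assert is often reported at a later, unrelated CUDA call. The `.to(model.device)` frame may therefore not be where the fault actually occurred. A common way to get a more accurate stack is to relaunch the server with `CUDA_LAUNCH_BLOCKING=1`. The sketch below only builds such an environment; the helper name `debug_env` is hypothetical and not part of CogAgent.

```python
import os


def debug_env(base=None):
    """Return a copy of an environment mapping with synchronous CUDA launches.

    `base` defaults to the current process environment. The input mapping
    is copied, never modified in place.
    """
    env = dict(os.environ if base is None else base)
    # Makes kernel launches synchronous, so an assert surfaces at the call
    # that triggered it. Must be set before CUDA is initialized.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    print(debug_env({"PATH": "/usr/bin"}))
```

The result could then be passed to something like `subprocess.run([sys.executable, "app/openai_demo.py"], env=debug_env())`, assuming the server is started from the repository root. Synchronous launches are slow, so this is for debugging only.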

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts and tasks

Reproduction

Run the app, then submit a client task: "open the browser and visit the Baidu homepage". The server returns the error above.

Expected behavior

The server API call should succeed.

coding-alt, Jan 26 '25 06:01

Do not use FP16 inference; please use BF16 inference instead.

zRzRzRzRzRzRzR, Jan 26 '25 07:01

> Do not use FP16 inference, please use BF16 inference.

I have used float16.

coding-alt, Jan 27 '25 09:01

So is it not possible to run this on a V100?

coding-alt, Feb 07 '25 03:02
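
For context on the V100 question: native BF16 arithmetic arrived with NVIDIA's Ampere generation (compute capability 8.0), while the V100 is a Volta card reporting compute capability 7.0. A minimal sketch of picking a dtype from the reported capability follows. The helper name `pick_dtype_name` is hypothetical, and the 8.0 threshold is an assumption about native support, not something confirmed in this thread.

```python
def pick_dtype_name(capability):
    """Map a CUDA compute capability (major, minor) to a dtype name.

    Assumption: native BF16 needs compute capability 8.0 or newer.
    A V100 reports (7, 0) and therefore falls back to FP16.
    """
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability()
        # getattr(torch, "bfloat16") is torch.bfloat16, and likewise float16.
        dtype = getattr(torch, pick_dtype_name(cap))
        print(cap, dtype)
    else:
        print("no CUDA device visible; V100 would give:", pick_dtype_name((7, 0)))
```

Checking the compute capability directly avoids depending on how a particular PyTorch version defines BF16 "support", which may include emulated paths.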

When running in FP16, you may hit the error mentioned above. It does not happen often, but it seems unavoidable at the moment.

zRzRzRzRzRzRzR, Feb 08 '25 08:02
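
One plausible reason FP16 fails only occasionally, offered as background rather than a diagnosis confirmed in this thread, is its narrow range. The largest finite FP16 value is 65504, whereas BF16 keeps the 8-bit exponent of float32. An activation that overflows to infinity can turn into NaN later, and NaN probabilities are a typical trigger for device-side asserts during sampling. The sketch below uses only the standard library to show the FP16 limit; `fits_fp16` is a hypothetical helper.

```python
import struct


def fits_fp16(x):
    """Return True if x can be stored as a finite IEEE 754 half-precision float."""
    try:
        # Format "e" is binary16; struct raises OverflowError when the
        # value is too large to be represented as a finite half.
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    for value in (1.0, 65504.0, 100000.0):
        print(value, fits_fp16(value))
```

Whether a given request hits this limit depends on the input, which would be consistent with the error appearing only sometimes.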

@zRzRzRzRzRzRzR So can INT8 run on a V100?

HongyuJiang, Jun 17 '25 13:06