PaddleSeg
ExternalError: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED.
Search before asking
- [x] I have searched the open and closed issues and found no similar bug report.
Describe the Bug
Running the inference example with infer.py fails with the error shown below. Command:
python deploy/python/infer.py --config ./pp_liteseg_infer_model/deploy.yaml --image_path ./1.jpg
Error output:
[2025/06/04 15:52:41] INFO: Use GPU
--- Running analysis [ir_graph_build_pass]
I0604 15:52:41.503283 1032891 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
I0604 15:52:41.533732 1032891 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running IR pass [identity_scale_op_clean_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0604 15:52:41.575934 1032891 fuse_pass_base.cc:59] --- detected 42 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
I0604 15:52:41.585345 1032891 fuse_pass_base.cc:59] --- detected 4 subgraphs
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0604 15:52:41.814822 1032891 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [conv2d_fusion_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [inplace_op_var_pass]
--- Running analysis [save_optimized_model_pass]
W0604 15:52:41.817167 1032891 save_optimized_model_pass.cc:28] save_optim_cache_model is turned off, skip save_optimized_model_pass
--- Running analysis [ir_params_sync_among_devices_pass]
I0604 15:52:41.817181 1032891 ir_params_sync_among_devices_pass.cc:51] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0604 15:52:41.836988 1032891 memory_optimize_pass.cc:222] Cluster name : shape_1.tmp_0_slice_0 size: 8
I0604 15:52:41.837002 1032891 memory_optimize_pass.cc:222] Cluster name : shape_0.tmp_0_slice_0 size: 8
I0604 15:52:41.837007 1032891 memory_optimize_pass.cc:222] Cluster name : mean_0.tmp_0 size: 4
I0604 15:52:41.837011 1032891 memory_optimize_pass.cc:222] Cluster name : x size: 12
I0604 15:52:41.837015 1032891 memory_optimize_pass.cc:222] Cluster name : relu_25.tmp_0 size: 512
I0604 15:52:41.837020 1032891 memory_optimize_pass.cc:222] Cluster name : concat_1.tmp_0 size: 1024
I0604 15:52:41.837028 1032891 memory_optimize_pass.cc:222] Cluster name : concat_3.tmp_0 size: 2048
I0604 15:52:41.837031 1032891 memory_optimize_pass.cc:222] Cluster name : concat_5.tmp_0 size: 4096
I0604 15:52:41.837037 1032891 memory_optimize_pass.cc:222] Cluster name : relu_20.tmp_0 size: 512
I0604 15:52:41.837039 1032891 memory_optimize_pass.cc:222] Cluster name : relu_28.tmp_0 size: 8192
I0604 15:52:41.837042 1032891 memory_optimize_pass.cc:222] Cluster name : pool2d_5.tmp_0 size: 65536
--- Running analysis [ir_graph_to_program_pass]
I0604 15:52:41.880380 1032891 analysis_predictor.cc:1660] ======= optimize end =======
I0604 15:52:41.880764 1032891 naive_executor.cc:164] --- skip [feed], feed -> x
I0604 15:52:41.881990 1032891 naive_executor.cc:164] --- skip [argmax_0.tmp_0], fetch -> fetch
W0604 15:52:42.055303 1032891 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.4, Runtime API Version: 11.8
W0604 15:52:42.056638 1032891 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
Traceback (most recent call last):
File "/root/ld/jieyang/PaddleSeg/deploy/python/infer.py", line 390, in
File "/ssd2/pengjuncai/PaddleSeg/export.py", line 143, in <module>
main(args)
File "/ssd2/pengjuncai/PaddleSeg/export.py", line 115, in main
paddle.jit.save(new_net, save_path)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 631, in wrapper
func(layer, path, input_spec, **configs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
return func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 860, in save
concrete_program = static_func.concrete_program_specify_input_spec(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 527, in concrete_program_specify_input_spec
concrete_program, _ = self.get_concrete_program(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 436, in get_concrete_program
concrete_program, partial_program_layer = self._program_cache[cache_key]
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 801, in __getitem__
self._caches[item_id] = self._build_once(item)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 785, in _build_once
concrete_program = ConcreteProgram.from_func_spec(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
return func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 733, in from_func_spec
outputs = static_func(*inputs)
File "/ssd2/pengjuncai/PaddleSeg/export.py", line 68, in forward
outs = self.net(x)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/pp_liteseg.py", line 114, in forward
feats_head = self.ppseg_head(feats_selected) # [..., x8, x16, x32]
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/pp_liteseg.py", line 191, in forward
high_feat = arm(low_feat, high_feat)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/tensor_fusion.py", line 76, in forward
out = self.fuse(x, y)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/tensor_fusion.py", line 188, in fuse
atten = F.sigmoid(self.conv_xy_atten(atten))
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/container.py", line 98, in forward
input = layer(input)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/layer_libs.py", line 109, in forward
x = self._conv(x)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/nn/layer/conv.py", line 666, in forward
out = F.conv._conv_nd(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/nn/functional/conv.py", line 168, in _conv_nd
helper.append_op(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/layer_helper.py", line 44, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/framework.py", line 3615, in append_op
op = Operator(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/framework.py", line 2635, in __init__
for frame in traceback.extract_stack():
ExternalError: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED.
[Hint: 'CUDNN_STATUS_NOT_SUPPORTED'. The functionality requested is not presently supported by cuDNN. ] (at ../paddle/phi/kernels/fusion/gpu/conv_fusion_kernel.cu:616)
[operator < conv2d_fusion > error]
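One way to narrow this down is to check which IR fusion passes actually rewrote the graph, since the failing operator is `conv2d_fusion`. The sketch below parses the "Running IR pass" / "detected N subgraphs" pairs from the log above and then builds a predictor config with the convolution fusion passes removed. It is a diagnostic sketch, not a confirmed fix. Assumptions: `passes_that_fired` and `build_config` are hypothetical helper names, the model and parameter file paths are whatever deploy.yaml points to, and treating `conv_elementwise_add_fuse_pass` as the pass that emits `conv2d_fusion` is my reading of the log rather than something the report states.

```python
import re

# Matches log lines such as:
#   --- Running IR pass [conv_bn_fuse_pass]
#   --- Running analysis [memory_optimize_pass]
RUNNING_RE = re.compile(r"--- Running (IR pass|analysis) \[(\w+)\]")
# Matches: "... fuse_pass_base.cc:59] --- detected 42 subgraphs"
DETECTED_RE = re.compile(r"--- +detected (\d+) subgraphs")


def passes_that_fired(log_text):
    """Return {pass_name: subgraph_count} for IR passes that rewrote the graph."""
    fired, current = {}, None
    for line in log_text.splitlines():
        m = RUNNING_RE.search(line)
        if m:
            # Only IR passes report fused subgraphs; analysis stages reset the state.
            current = m.group(2) if m.group(1) == "IR pass" else None
            continue
        m = DETECTED_RE.search(line)
        if m and current is not None:
            fired[current] = int(m.group(1))
    return fired


def build_config(model_file, params_file, skip_passes):
    """Sketch: GPU inference config with the given IR passes deleted."""
    from paddle import inference  # lazy import; needs a GPU build of paddle

    config = inference.Config(model_file, params_file)
    config.enable_use_gpu(100, 0)  # 100 MB initial memory pool, GPU 0
    for name in skip_passes:
        config.delete_pass(name)
    return config


# Excerpt copied from the log in this report.
sample = """\
--- Running IR pass [conv_bn_fuse_pass]
I0604 15:52:41.575934 1032891 fuse_pass_base.cc:59] --- detected 42 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
I0604 15:52:41.585345 1032891 fuse_pass_base.cc:59] --- detected 4 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0604 15:52:41.814822 1032891 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
"""

fired = passes_that_fired(sample)
conv_fusion = [name for name in fired if name.startswith("conv_elementwise_add")]
print(fired)
print(conv_fusion)
```

If the error disappears when the predictor is created from `build_config(..., conv_fusion)`, that would suggest the fused convolution kernel in conv_fusion_kernel.cu is the part cuDNN rejects on this setup. Calling `config.switch_ir_optim(False)` instead is a coarser variant of the same experiment.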
Environment
Ubuntu 20.04, paddle 2.5.0-post11.8, CUDA 11.8, cuDNN 8.9.7. The CUDA and cuDNN versions match each other.
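The versions stated here can be cross-checked against what Paddle itself printed in the two `gpu_resources.cc` warnings of the log. The following is a small sketch of that comparison; `parse_versions` is a hypothetical helper, and the `declared` values are the Environment line reduced to major.minor by hand.

```python
import re

# The two warning lines from the log in this report.
GPU_LINE = (
    "W0604 15:52:42.055303 1032891 gpu_resources.cc:119] Please NOTE: device: 0, "
    "GPU Compute Capability: 8.9, Driver API Version: 12.4, Runtime API Version: 11.8"
)
CUDNN_LINE = (
    "W0604 15:52:42.056638 1032891 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9."
)

FIELD_RE = re.compile(
    r"(GPU Compute Capability|Driver API Version|Runtime API Version|cuDNN Version)"
    r": (\d+\.\d+)"
)


def parse_versions(lines):
    """Collect the major.minor versions Paddle reports at GPU start-up."""
    fields = {}
    for line in lines:
        for key, value in FIELD_RE.findall(line):
            fields[key] = value
    return fields


reported = parse_versions([GPU_LINE, CUDNN_LINE])

# Values stated in the Environment line, truncated to major.minor.
declared = {"Runtime API Version": "11.8", "cuDNN Version": "8.9"}

mismatches = {
    key: (want, reported.get(key))
    for key, want in declared.items()
    if reported.get(key) != want
}
print(reported)
print(mismatches)
```

For this report the comparison finds no mismatch, so the declared CUDA runtime and cuDNN versions agree with what was loaded at run time. The log additionally shows a 12.4 driver API and a device of compute capability 8.9, which the Environment line does not mention.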
Bug description confirmation
- [x] I confirm that the bug reproduction steps, a description of code changes, and environment information have been provided, and that the problem can be reproduced.
Are you willing to submit a PR?
- [ ] I'd like to help by submitting a PR!