
ExternalError: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED.

Open: dextroushands opened this issue 4 months ago • 1 comment

Search before asking

Describe the Bug

Running the inference example with infer.py fails with an error:

python deploy/python/infer.py --config ./pp_liteseg_infer_model/deploy.yaml --image_path ./1.jpg

Error message:

[2025/06/04 15:52:41] INFO: Use GPU
--- Running analysis [ir_graph_build_pass]
I0604 15:52:41.503283 1032891 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
I0604 15:52:41.533732 1032891 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running IR pass [identity_scale_op_clean_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0604 15:52:41.575934 1032891 fuse_pass_base.cc:59] --- detected 42 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
I0604 15:52:41.585345 1032891 fuse_pass_base.cc:59] --- detected 4 subgraphs
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0604 15:52:41.814822 1032891 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [conv2d_fusion_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [inplace_op_var_pass]
--- Running analysis [save_optimized_model_pass]
W0604 15:52:41.817167 1032891 save_optimized_model_pass.cc:28] save_optim_cache_model is turned off, skip save_optimized_model_pass
--- Running analysis [ir_params_sync_among_devices_pass]
I0604 15:52:41.817181 1032891 ir_params_sync_among_devices_pass.cc:51] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0604 15:52:41.836988 1032891 memory_optimize_pass.cc:222] Cluster name : shape_1.tmp_0_slice_0 size: 8
I0604 15:52:41.837002 1032891 memory_optimize_pass.cc:222] Cluster name : shape_0.tmp_0_slice_0 size: 8
I0604 15:52:41.837007 1032891 memory_optimize_pass.cc:222] Cluster name : mean_0.tmp_0 size: 4
I0604 15:52:41.837011 1032891 memory_optimize_pass.cc:222] Cluster name : x size: 12
I0604 15:52:41.837015 1032891 memory_optimize_pass.cc:222] Cluster name : relu_25.tmp_0 size: 512
I0604 15:52:41.837020 1032891 memory_optimize_pass.cc:222] Cluster name : concat_1.tmp_0 size: 1024
I0604 15:52:41.837028 1032891 memory_optimize_pass.cc:222] Cluster name : concat_3.tmp_0 size: 2048
I0604 15:52:41.837031 1032891 memory_optimize_pass.cc:222] Cluster name : concat_5.tmp_0 size: 4096
I0604 15:52:41.837037 1032891 memory_optimize_pass.cc:222] Cluster name : relu_20.tmp_0 size: 512
I0604 15:52:41.837039 1032891 memory_optimize_pass.cc:222] Cluster name : relu_28.tmp_0 size: 8192
I0604 15:52:41.837042 1032891 memory_optimize_pass.cc:222] Cluster name : pool2d_5.tmp_0 size: 65536
--- Running analysis [ir_graph_to_program_pass]
I0604 15:52:41.880380 1032891 analysis_predictor.cc:1660] ======= optimize end =======
I0604 15:52:41.880764 1032891 naive_executor.cc:164] --- skip [feed], feed -> x
I0604 15:52:41.881990 1032891 naive_executor.cc:164] --- skip [argmax_0.tmp_0], fetch -> fetch
W0604 15:52:42.055303 1032891 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.4, Runtime API Version: 11.8
W0604 15:52:42.056638 1032891 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
Traceback (most recent call last):
File "/root/ld/jieyang/PaddleSeg/deploy/python/infer.py", line 390, in <module>
  main(args)
File "/root/ld/jieyang/PaddleSeg/deploy/python/infer.py", line 378, in main
  predictor.run(imgs_list)
File "/root/ld/jieyang/PaddleSeg/deploy/python/infer.py", line 238, in run
  self.predictor.run()
OSError: In user code:

File "/ssd2/pengjuncai/PaddleSeg/export.py", line 143, in <module>
  main(args)
File "/ssd2/pengjuncai/PaddleSeg/export.py", line 115, in main
  paddle.jit.save(new_net, save_path)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 631, in wrapper
  func(layer, path, input_spec, **configs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/decorator.py", line 232, in fun
  return caller(func, *(extras + args), **kw)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/jit.py", line 860, in save
  concrete_program = static_func.concrete_program_specify_input_spec(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 527, in concrete_program_specify_input_spec
  concrete_program, _ = self.get_concrete_program(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 436, in get_concrete_program
  concrete_program, partial_program_layer = self._program_cache[cache_key]
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 801, in __getitem__
  self._caches[item_id] = self._build_once(item)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 785, in _build_once
  concrete_program = ConcreteProgram.from_func_spec(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/decorator.py", line 232, in fun
  return caller(func, *(extras + args), **kw)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
  return wrapped_func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
  return func(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 733, in from_func_spec
  outputs = static_func(*inputs)
File "/ssd2/pengjuncai/PaddleSeg/export.py", line 68, in forward
  outs = self.net(x)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/pp_liteseg.py", line 114, in forward
  feats_head = self.ppseg_head(feats_selected)  # [..., x8, x16, x32]
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/pp_liteseg.py", line 191, in forward
  high_feat = arm(low_feat, high_feat)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/tensor_fusion.py", line 76, in forward
  out = self.fuse(x, y)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/tensor_fusion.py", line 188, in fuse
  atten = F.sigmoid(self.conv_xy_atten(atten))
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/container.py", line 98, in forward
  input = layer(input)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/PaddleSeg/paddleseg/models/layers/layer_libs.py", line 109, in forward
  x = self._conv(x)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
  return self._dygraph_call_func(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
  outputs = self.forward(*inputs, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/nn/layer/conv.py", line 666, in forward
  out = F.conv._conv_nd(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/nn/functional/conv.py", line 168, in _conv_nd
  helper.append_op(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/layer_helper.py", line 44, in append_op
  return self.main_program.current_block().append_op(*args, **kwargs)
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/framework.py", line 3615, in append_op
  op = Operator(
File "/ssd2/pengjuncai/anaconda3/lib/python3.9/site-packages/paddle/fluid/framework.py", line 2635, in __init__
  for frame in traceback.extract_stack():

ExternalError: CUDNN error(9), CUDNN_STATUS_NOT_SUPPORTED.
  [Hint: 'CUDNN_STATUS_NOT_SUPPORTED'.  The functionality requested is not presently supported by cuDNN.  ] (at ../paddle/phi/kernels/fusion/gpu/conv_fusion_kernel.cu:616)
  [operator < conv2d_fusion > error]
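A possible way to narrow this down, sketched under assumptions rather than taken from the issue: the failing operator is conv2d_fusion, which suggests it was produced by one of the conv_elementwise_add* IR passes listed in the log. The helper below pulls those pass names out of the log text, and build_predictor (a hypothetical name, as are its parameters) removes them with Config.delete_pass before creating the predictor. Whether this avoids the error on this setup is untested.

```python
import re


def conv_fusion_passes(log_text):
    """Return the conv_elementwise_add* pass names that appear in the log.

    Assumption: these passes are the ones that emit `conv2d_fusion`;
    the issue itself does not confirm this.
    """
    names = re.findall(r"Running IR pass \[(\w+)\]", log_text)
    return [name for name in names if name.startswith("conv_elementwise_add")]


def build_predictor(model_file, params_file, passes_to_skip):
    """Create a GPU predictor with the given IR passes removed (diagnostic sketch)."""
    # Imported lazily so conv_fusion_passes() works without Paddle installed.
    from paddle import inference

    config = inference.Config(model_file, params_file)
    config.enable_use_gpu(100, 0)  # 100 MB initial GPU memory pool, device 0
    for name in passes_to_skip:
        config.delete_pass(name)  # skip this fusion during graph optimization
    return inference.create_predictor(config)
```

If the error disappears with these passes removed, that would point at the fused convolution kernel rather than the model weights or the input image. Calling config.switch_ir_optim(False) instead is a coarser variant of the same check, since it disables all IR passes.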

Environment

Ubuntu 20.04, paddle: 2.5.0-post11.8, cuda: 11.8, cudnn: 8.9.7. The CUDA and cuDNN versions are consistent with each other.
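The versions Paddle actually detected at runtime are printed in the gpu_resources.cc warnings of the log, so they can be compared against the environment stated here. A minimal sketch, assuming the log format shown in this issue (the function name gpu_environment is hypothetical):

```python
import re


def gpu_environment(log_text):
    """Extract the GPU and library versions that Paddle reports in its log."""
    patterns = {
        "compute_capability": r"GPU Compute Capability: (\d+\.\d+)",
        "driver_api": r"Driver API Version: (\d+\.\d+)",
        "runtime_api": r"Runtime API Version: (\d+\.\d+)",
        "cudnn": r"cuDNN Version: (\d+\.\d+)",
    }
    found = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        # None marks a value that the log did not report.
        found[key] = match.group(1) if match else None
    return found
```

Applied to the log in this issue, it yields compute capability 8.9, driver API 12.4, runtime API 11.8, and cuDNN 8.9.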

Bug description confirmation

  • [x] I confirm that the bug reproduction steps, code change instructions, and environment information have been provided, and the problem can be reproduced.

Are you willing to submit a PR?

  • [ ] I'd like to help by submitting a PR!

dextroushands, Jun 04 '25 07:06