
[Debug][PaddleV3] Testing the hang when loading an inference model

Open megemini opened this issue 1 year ago • 6 comments

[Debug][PaddleV3] Testing the hang when loading an inference model

Related: https://github.com/PaddlePaddle/X2Paddle/pull/1064

megemini avatar Oct 18 '24 12:10 megemini

@vivienfanghuagood

Following the approach in https://github.com/PaddlePaddle/X2Paddle/pull/1064, logging is now turned on ~ The XLY CI platform (效率云) seems to have a problem; you can take a look here:

https://xly.bce.baidu.com/paddlepaddle/x2paddle-ci/newipipe/detail/11736915/job/27809560


2024-10-21 12:13:00 [2024-10-21 04:13:00,355] [    INFO] convert.py:303 - Now translating model from onnx to paddle.
2024-10-21 12:13:00 Converting node 1 ...     
2024-10-21 12:13:00 Converting node 2 ...     [2024-10-21 04:13:00,361] [    INFO] convert.py:325 - Model optimizing ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,370] [    INFO] convert.py:329 - Model optimized.
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/framework/io.py:939: UserWarning: The input state dict is empty, no need to save.
2024-10-21 12:13:00   warnings.warn("The input state dict is empty, no need to save.")
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/jit/dy2static/program_translator.py:699: UserWarning: full_graph=False don't support input_spec arguments. It will not produce any effect.
2024-10-21 12:13:00 You can set full_graph=True, then you can assign input spec.
2024-10-21 12:13:00   warnings.warn(
2024-10-21 12:13:00 I1021 04:13:00.460772    76 op_desc.cc:1112] CompileTime infer shape on abs
2024-10-21 12:13:00 I1021 04:13:00.460808    76 infershape_utils.cc:546] *******: op kernel signature - Kernel Signature - name: abs; inputs: X; attributes: ; outputs: Out
2024-10-21 12:13:00 I1021 04:13:00.463881    76 eager.cc:118] Tensor(cuda_graph) have not GradNode, add ******* for it.
2024-10-21 12:13:00 I1021 04:13:00.479213    76 op_desc.cc:1112] CompileTime infer shape on scale
2024-10-21 12:13:00 I1021 04:13:00.479271    76 infershape_utils.cc:546] *******: op kernel signature - Kernel Signature - name: scale; inputs: X; attributes: scale, bias, bias_after_scale; outputs: Out
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/static/io.py:581: UserWarning: no variable in your model, please ensure there are any variables in your model to save
2024-10-21 12:13:00   warnings.warn(
2024-10-21 12:13:00 I1021 04:13:00.514019    76 onednn_context.cc:104] Clearing DNNL cache.
2024-10-21 12:13:00 I1021 04:13:00.514046    76 onednn_context.cc:122] Resetting Paddle data layout to NCHW.
2024-10-21 12:13:00 [2024-10-21 04:13:00,515] [    INFO] convert.py:331 - Successfully exported Paddle static graph model!
2024-10-21 12:13:00 [2024-10-21 04:13:00,515] [    INFO] convert.py:348 - ================================================
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:349 - 
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:350 - Model Converted! Fill this survey to help X2Paddle better, https://iwenjuan.baidu.com/?code=npyd51 
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:353 - 
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] convert.py:354 - ================================================
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] onnxbase.py:207 - >>> onnx2paddle finished ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] onnxbase.py:333 - >>> _mk_onnx_res *******...
2024-10-21 12:13:00 [2024-10-21 04:13:00,520] [    INFO] onnxbase.py:339 - >>> sess.run ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,525] [    INFO] onnxbase.py:213 - >>> _mk_paddle_res ...
2024-10-21 12:13:00 I1021 04:13:00.526403    76 eager.cc:118] Tensor(generated_tensor_1) have not GradNode, add ******* for it.
2024-10-21 12:13:00 [2024-10-21 04:13:00,527] [    INFO] onnxbase.py:247 - >>> NOT self.run_dynamic...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:260 - >>> config.enable_use_gpu...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:265 - >>> config.enable_use_gpu finished...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:271 - >>> enable_memory_optim finished...
2024-10-21 12:13:00 [2024-10-21 04:13:00,529] [    INFO] onnxbase.py:276 - >>> config.disable_glog_info...
2024-10-21 12:13:00 [2024-10-21 04:13:00,529] [    INFO] onnxbase.py:281 - >>> config.pass_builder...
2024-10-21 12:13:00 [2024-10-21 04:13:00,530] [    INFO] onnxbase.py:285 - >>> create_predictor(config)...

The log stops here and the job hangs, with no glog output afterwards ~ Could you please take a look ~ 🙏🙏🙏

megemini avatar Oct 21 '24 04:10 megemini

It looks like it is stuck in the OneDNN conversion. Try config.DisableMKLDNN(), then run it again and check the log. Also, @zhanglirong1999, do you have any suggestions?

vivienfanghuagood avatar Oct 21 '24 09:10 vivienfanghuagood

It looks like it is stuck in the OneDNN conversion. Try config.DisableMKLDNN(), then run it again and check the log. Also, @zhanglirong1999, do you have any suggestions?

That doesn't seem to work ~ it reports that there is no such attribute:


2024-10-21 17:36:17 E
2024-10-21 17:36:17 ======================================================================
2024-10-21 17:36:17 ERROR: test (__main__.TestAbsConvert)
2024-10-21 17:36:17 ----------------------------------------------------------------------
2024-10-21 17:36:17 Traceback (most recent call last):
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/test_auto_scan_abs.py", line 62, in test
2024-10-21 17:36:17     self.run_and_statis(max_examples=30)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 105, in run_and_statis
2024-10-21 17:36:17     loop_func()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/usr/local/lib/python3.9/dist-packages/hypothesis/core.py", line 1469, in wrapped_test
2024-10-21 17:36:17     raise the_error_hypothesis_found
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 227, in run_test
2024-10-21 17:36:17     obj.run()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 419, in run
2024-10-21 17:36:17     paddle_res[str(v)] = self._mk_paddle_res(ver=v)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 281, in _mk_paddle_res
2024-10-21 17:36:17     config.DisableMKLDNN()
2024-10-21 17:36:17 AttributeError: 'paddle.base.libpaddle.AnalysisConfig' object has no attribute 'DisableMKLDNN'

megemini avatar Oct 21 '24 10:10 megemini

It looks like it is stuck in the OneDNN conversion. Try config.DisableMKLDNN(), then run it again and check the log. Also, @zhanglirong1999, do you have any suggestions?

That doesn't seem to work ~ it reports that there is no such attribute:

2024-10-21 17:36:17 E
2024-10-21 17:36:17 ======================================================================
2024-10-21 17:36:17 ERROR: test (__main__.TestAbsConvert)
2024-10-21 17:36:17 ----------------------------------------------------------------------
2024-10-21 17:36:17 Traceback (most recent call last):
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/test_auto_scan_abs.py", line 62, in test
2024-10-21 17:36:17     self.run_and_statis(max_examples=30)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 105, in run_and_statis
2024-10-21 17:36:17     loop_func()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/usr/local/lib/python3.9/dist-packages/hypothesis/core.py", line 1469, in wrapped_test
2024-10-21 17:36:17     raise the_error_hypothesis_found
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 227, in run_test
2024-10-21 17:36:17     obj.run()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 419, in run
2024-10-21 17:36:17     paddle_res[str(v)] = self._mk_paddle_res(ver=v)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 281, in _mk_paddle_res
2024-10-21 17:36:17     config.DisableMKLDNN()
2024-10-21 17:36:17 AttributeError: 'paddle.base.libpaddle.AnalysisConfig' object has no attribute 'DisableMKLDNN'

If you are calling it from Python, use config.disable_mkldnn() instead.
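For reference, a minimal sketch of guarding the call so that a wrong method name does not raise `AttributeError` as in the traceback above. The helper name `call_first_available` is hypothetical and not part of Paddle; the only method name taken from this thread is `disable_mkldnn`, and whether it exists depends on the installed Paddle version.

```python
def call_first_available(obj, method_names):
    """Call the first zero-argument method in method_names that obj has.

    Returns the name that was called, or None if obj has none of them,
    so the caller can log the outcome instead of crashing.
    """
    for name in method_names:
        method = getattr(obj, name, None)
        if callable(method):
            method()
            return name
    return None


# Intended use in the test harness (assumes `config` is the inference
# config object already built in _mk_paddle_res):
#
#     used = call_first_available(config, ["disable_mkldnn"])
#     if used is None:
#         print("no method found to disable oneDNN on this Paddle build")
```

Logging the returned name makes it visible in the CI output whether oneDNN was actually switched off before the predictor is created.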

vivienfanghuagood avatar Nov 12 '24 09:11 vivienfanghuagood

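It seems execution reached onednn_context here, but there is no more specific information after that, so I have no further suggestions for now. If the test passes with oneDNN disabled, that confirms it is a oneDNN problem, and the oneDNN side will follow up later if needed.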

zhanglirong1999 avatar Nov 12 '24 12:11 zhanglirong1999

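It seems execution reached onednn_context here, but there is no more specific information after that, so I have no further suggestions for now. If the test passes with oneDNN disabled, that confirms it is a oneDNN problem, and the oneDNN side will follow up later if needed.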

CI is still stuck at create_predictor
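One possible way to get more information out of a hang like this, sketched under assumptions: Python's standard `faulthandler` module can dump the stacks of all Python threads if a call has not returned after a timeout. The wrapper name `call_with_hang_dump` and the timeout values are made up for illustration. It only shows Python frames, not the C++ stack inside Paddle, so it would narrow down where the process is blocked rather than explain why.

```python
import faulthandler
import sys


def call_with_hang_dump(func, *args, timeout=60.0, file=None, **kwargs):
    """Run func(*args, **kwargs) with a watchdog armed.

    If the call is still running after `timeout` seconds, faulthandler
    writes the traceback of every Python thread to `file` (stderr by
    default). The call itself is not interrupted (exit=False).
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout, exit=False, file=target)
    try:
        return func(*args, **kwargs)
    finally:
        # Disarm the watchdog whether the call returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

In `_mk_paddle_res` this could wrap the predictor creation, for example `call_with_hang_dump(create_predictor, config, timeout=120)`, so that the CI log would contain a stack dump instead of simply going silent.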

megemini avatar Nov 13 '24 09:11 megemini