Failed to transform graph using tf tool: Transform 'remove_control_dependencies' not recognized

Open sheirving opened this issue 5 years ago • 7 comments

Before you open an issue, please make sure you have tried the following steps:

  1. Make sure your environment is the same with (https://mace.readthedocs.io/en/latest/installation/env_requirement.html).
  2. Have you ever read the document for your usage?
  3. Check if your issue appears in HOW-TO-DEBUG or FAQ.
  4. The form below must be filled.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): linux ubuntu 16.04
  • NDK version(e.g., 15c):
  • GCC version(if compiling for host, e.g., 5.4.0):
  • MACE version (Use the command: git describe --long --tags):
  • Python version(3.6):
  • Bazel version (e.g., 0.13.0): 0.17.2

Model deploy file (*.yml)

--- 
library_name: OHW_Eng_Host
target_abis: [host]
model_data_format: file
model_graph_format: file
models: 
  ohw_eng_host: 
    dsp_mode: 0
    limit_opencl_kernel_time: 0
    model_file_path: /home/irving/PycharmProjects/mace/OHW_Eng_Mace/Tf_PB/OHW_Eng_Model_H64_Wnone_Traj_NHWC_8_1_24.2M_v5_top3/model_graph.pb
    model_sha256_checksum: 49df2ba2a1101a7e04d785349a925c9acf634274de9f52a0c5704baa35a3c589
    obfuscate: 1
    platform: tensorflow
    runtime: cpu
    subgraphs: 
      - input_tensors: 
          - input:0
          - img_width:0
        input_shapes: 
          - 64,64,1280,1
          - 64,1
        input_data_formats:
          - NHWC
          - NONE
        output_tensors: 
          - decode:0
          - decode_1:0
          - decode_2:0
          - log_logits:0
        output_shapes: 
          - 64,160
          - 64,160
          - 64,160
          - 97,160
        output_data_formats:
          - NONE
          - NONE
          - NONE
          - NONE
    winograd: 0
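
The `model_sha256_checksum` field above has to match the model file on disk, and the checksum shown in the second converter command later in this thread differs from the one in this file, so it can be worth recomputing it whenever the PB is regenerated. A minimal sketch using only the Python standard library; the helper name `sha256_of_file` is hypothetical and is not part of MACE:

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("/path/to/model_graph.pb")` should give the value to paste into `model_sha256_checksum`.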

Describe the problem

A model trained with TensorFlow 1.12 fails when converted to MACE format, with the error: "Failed to transform graph using tf tool: Transform 'remove_control_dependencies' not recognized." How can I resolve this? Thanks!

To Reproduce

Steps to reproduce the problem:

1. cd /path/to/mace
2. python tools/converter.py convert --config_file=/path/to/your/model_deployment_file

Error information / logs

Please include the full log and/or traceback here.

 bazel-bin/mace/python/tools/converter
INFO: Elapsed time: 0.124s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action

Transform model to one that can better run on device
Run transform_graph: ['strip_unused_nodes', 'remove_nodes(op=Identity, op=CheckNumerics)', 'fold_constants(ignore_errors=true)', 'fold_batch_norms', 'fold_old_batch_norms', 'remove_control_dependencies', 'strip_unused_nodes', 'sort_by_execution_order']
output keys:  dict_keys(['decode:0', 'decode_1:0', 'decode_2:0', 'log_logits:0'])
Failed to transform graph using tf tool: Transform 'remove_control_dependencies' not recognized.
Traceback (most recent call last):
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Retval[149] has already been set.
	 [[Node: _retval_Shape_1134_0_149 = _Retval[T=DT_INT32, index=149, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Shape_1134)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/irving/PycharmProjects/mace/bazel-bin/mace/python/tools/converter.runfiles/mace/mace/python/tools/converter.py", line 422, in <module>
    main(unused_args=[sys.argv[0]] + unparsed)
  File "/home/irving/PycharmProjects/mace/bazel-bin/mace/python/tools/converter.runfiles/mace/mace/python/tools/converter.py", line 214, in main
    option, FLAGS.model_file)
  File "/home/irving/PycharmProjects/mace/bazel-bin/mace/python/tools/converter.runfiles/mace/mace/python/tools/converter_tool/tensorflow_converter.py", line 314, in __init__
    self.update_output_shapes(session)
  File "/home/irving/PycharmProjects/mace/bazel-bin/mace/python/tools/converter.runfiles/mace/mace/python/tools/converter_tool/tensorflow_converter.py", line 382, in update_output_shapes
    feed_dict=self._placeholders)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Retval[149] has already been set.
	 [[Node: _retval_Shape_1134_0_149 = _Retval[T=DT_INT32, index=149, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Shape_1134)]]
Traceback (most recent call last):
  File "tools/converter.py", line 1363, in <module>
    flags.func(flags)
  File "tools/converter.py", line 866, in convert_func
    convert_model(configs, flags.cl_mem_type)
  File "tools/converter.py", line 794, in convert_model
    ",".join(model_config.get(YAMLKeyword.graph_optimize_options, [])))
  File "/home/irving/PycharmProjects/mace/tools/sh_commands.py", line 553, in gen_model_code
    _fg=True)
  File "/home/irving/anaconda3/lib/python3.6/site-packages/sh.py", line 1413, in __call__
    raise exc
sh.ErrorReturnCode_1: 

  RAN: /home/irving/anaconda3/bin/python bazel-bin/mace/python/tools/converter -u --platform=tensorflow --model_file=/home/irving/PycharmProjects/mace/OHW_Eng_Mace/Tf_PB/OHW_Eng_Model_H64_Wnone_Traj_NHWC_8_1_24.2M_v5_top3/model_graph.pb --weight_file= --model_checksum=49df2ba2a1101a7e04d785349a925c9acf634274de9f52a0c5704baa35a3c589 --weight_checksum= --input_node=input:0,img_width:0 --input_data_types=float32,float32 --input_data_formats=NHWC,NONE --output_node=decode:0,decode_1:0,decode_2:0,log_logits:0 --output_data_types=float32,float32,float32,float32 --output_data_formats=NONE,NONE,NONE,NONE --check_node= --runtime=cpu --template=mace/python/tools --model_tag=ohw_eng_host --input_shape=64,64,1280,1:64,1 --input_range= --output_shape=64,160:64,160:64,160:97,160 --check_shape= --dsp_mode=0 --embed_model_data=False --winograd=0 --quantize=0 --quantize_large_weights=0 --quantize_range_file= --change_concat_ranges=0 --obfuscate=1 --output_dir=mace/codegen/models/ohw_eng_host --model_graph_format=file --data_type=fp32_fp32 --graph_optimize_options= --cl_mem_type=image
[[model_graph.pb.tar.gz]](https://github.com/XiaoMi/mace/files/3310525/model_graph.pb.tar.gz)

Additional context

Add any other context about the problem here, e.g., what you have modified about the code.

sheirving avatar Jun 20 '19 13:06 sheirving

Duplicate of https://github.com/XiaoMi/mace/issues/286#issuecomment-449937586. Which version of TensorFlow is installed when you run python tools/converter.py convert --config_file=/path/to/your/model_deployment_file?

lee-bin avatar Jun 21 '19 06:06 lee-bin

Thank you very much for your answer! Following your hint, I changed the TensorFlow version used to run the "python tools/converter.py ..." script to 1.8.0; the model was trained and the PB was generated with 1.12. Running "python tools/converter.py ..." (tf r1.8.0) now fails with a different error: tensorflow.python.framework.errors_impl.InvalidArgumentError: The node 'MobileNet_CRelu_V2_Text/RNN/BiLstm_1/bw/bw/while/lstm_cell/MatMul' has inputs from different frames. The input 'ConstantFolding/MobileNet_CRelu_V2_Text/RNN/BiLstm_1/bw/lstm_cell/kernel_enter' is in frame 'MobileNet_CRelu_V2_Text/RNN/BiLstm_1/bw/bw/while/while_context'. The input 'MobileNet_CRelu_V2_Text/RNN/BiLstm_1/bw/bw/while/lstm_cell/concat' is in frame ''. When I run TensorFlow's own tools/graph_transforms instead (with the same TransformGraph options that MACE uses), the problem does not occur. I look forward to your reply.
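
For anyone who wants to repeat the comparison described in this comment, the sketch below runs the same transform list outside MACE through TensorFlow's Python wrapper for tools/graph_transforms. It assumes TensorFlow 1.x, where `tensorflow.tools.graph_transforms.TransformGraph` and `tf.GraphDef` are available. The transform list is copied from the "Run transform_graph" log line in this issue; the helper names `transforms_flag` and `transform_frozen_graph` are hypothetical, and passing tensor names exactly as written in the deployment file (with the `:0` suffix) is an assumption to verify against your TensorFlow version.

```python
# Transform list copied from the "Run transform_graph" log line in this issue.
MACE_TRANSFORMS = (
    "strip_unused_nodes",
    "remove_nodes(op=Identity, op=CheckNumerics)",
    "fold_constants(ignore_errors=true)",
    "fold_batch_norms",
    "fold_old_batch_norms",
    "remove_control_dependencies",
    "strip_unused_nodes",
    "sort_by_execution_order",
)


def transforms_flag(transforms=MACE_TRANSFORMS):
    """Join transforms into the space-separated string the tool expects."""
    return " ".join(transforms)


def transform_frozen_graph(pb_path, inputs, outputs, transforms=MACE_TRANSFORMS):
    """Load a frozen GraphDef and apply the transforms (requires TF 1.x)."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    import tensorflow as tf
    from tensorflow.tools.graph_transforms import TransformGraph

    graph_def = tf.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    # Assumption: names are passed through unchanged, e.g. "input:0".
    return TransformGraph(graph_def, list(inputs), list(outputs), list(transforms))
```

If `transform_frozen_graph` succeeds on the same PB under the same TensorFlow version, the failure is more likely to come from a later converter step (the traceback above points at `update_output_shapes`) than from the transform pass itself.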

sheirving avatar Jun 21 '19 09:06 sheirving

What that answer means is that remove_control_dependencies is supported from 1.8.0 onward. I suggest you use the same version, 1.12, for everything.
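
The advice in this reply can be turned into a quick pre-flight check before running the converter. This is only a sketch: the 1.8.0 threshold is taken from the reply itself and has not been verified independently, and the helper names `parse_version` and `converter_tf_is_compatible` are hypothetical.

```python
# Minimum version stated in the reply above (assumption, not verified here).
MIN_TF_VERSION = (1, 8, 0)


def parse_version(version):
    """Parse '1.12.0' or '1.8.0-rc1' into a (major, minor, patch) tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as '-rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def converter_tf_is_compatible(converter_version, training_version):
    """True if the converter's TF is new enough and matches the training TF."""
    conv = parse_version(converter_version)
    return conv >= MIN_TF_VERSION and conv == parse_version(training_version)
```

In practice the first argument would be `tf.__version__` from the environment that runs tools/converter.py, and the second the version used to train and export the PB.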

lee-bin avatar Jun 21 '19 10:06 lee-bin

After unifying everything on version 1.12.0 I hit a new problem: F tensorflow/compiler/jit/deadness_analysis.cc:639] Check failed: it != predicate_map_.end() _SINK. Google does not seem to have a fix for it. Any advice? Thanks.

sheirving avatar Jun 21 '19 12:06 sheirving

Under what circumstances did you hit this problem? Please list the detailed steps and logs.

lee-bin avatar Jun 24 '19 01:06 lee-bin

  1. TensorFlow version everywhere: r1.12.0; Bazel: 0.17.2; other dependency versions: essentially the same.
  2. Steps: python tools/converter.py convert --config=./my_model.yml
  3. Log:

Starting local Bazel server and connecting to it...
INFO: Analysed target //mace/python/tools:converter (21 packages loaded).
INFO: Found 1 target...
Target //mace/python/tools:converter up-to-date:
  bazel-bin/mace/python/tools/converter
INFO: Elapsed time: 2.448s, Critical Path: 0.06s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action

Transform model to one that can better run on device
Run transform_graph: ['strip_unused_nodes', 'remove_nodes(op=Identity, op=CheckNumerics)', 'fold_constants(ignore_errors=true)', 'fold_batch_norms', 'fold_old_batch_norms', 'remove_control_dependencies', 'strip_unused_nodes', 'sort_by_execution_order']
output keys:  dict_keys(['decode:0', 'decode_1:0', 'decode_2:0', 'log_logits:0'])
2019-06-24 09:51:10.632237: F tensorflow/compiler/jit/deadness_analysis.cc:639] Check failed: it != predicate_map_.end() _SINK
Traceback (most recent call last):
  File "tools/converter.py", line 1363, in <module>
    flags.func(flags)
  File "tools/converter.py", line 866, in convert_func
    convert_model(configs, flags.cl_mem_type)
  File "tools/converter.py", line 794, in convert_model
    ",".join(model_config.get(YAMLKeyword.graph_optimize_options, [])))
  File "/home/cvte-irving/PycharmProjects/mace/tools/sh_commands.py", line 553, in gen_model_code
    _fg=True)
  File "/home/cvte-irving/anaconda3/lib/python3.6/site-packages/sh.py", line 1413, in __call__
    raise exc
sh.SignalException_SIGABRT: 

  RAN: /home/cvte-irving/anaconda3/bin/python bazel-bin/mace/python/tools/converter -u --platform=tensorflow --model_file=/home/cvte-irving/PycharmProjects/mace/OHW_Eng_Mace/Tf_PB/OHW_Eng_Model_H64_Wnone_Traj_NHWC_8_1_24.2M_v5_top3/model_graph.pb --weight_file= --model_checksum=d55015db85e8475174c024bad78ff34c9f08786761b73c13100e66926833aeb9 --weight_checksum= --input_node=input:0,img_width:0 --input_data_types=float32,float32 --input_data_formats=NHWC,NONE --output_node=decode:0,decode_1:0,decode_2:0,log_logits:0 --output_data_types=float32,float32,float32,float32 --output_data_formats=NONE,NONE,NONE,NONE --check_node= --runtime=cpu --template=mace/python/tools --model_tag=ohw_eng_host --input_shape=64,64,1280,1:64,1 --input_range= --output_shape=64,160:64,160:64,160:97,160 --check_shape= --dsp_mode=0 --embed_model_data=False --winograd=0 --quantize=0 --quantize_large_weights=0 --quantize_range_file= --change_concat_ranges=0 --obfuscate=1 --output_dir=mace/codegen/models/ohw_eng_host --model_graph_format=file --data_type=fp32_fp32 --graph_optimize_options= --cl_mem_type=image

sheirving avatar Jun 24 '19 01:06 sheirving

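Is there an operator named _SINK in the graph? Judging from the TF source code, the problem appears to be in the control-flow-related code.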

yejw5 avatar Jun 24 '19 02:06 yejw5