
Could not infer the attribute type from the elements of the passed Iterable value

Open guarin opened this issue 2 years ago • 11 comments

Describe the bug

Hi! I am trying to convert a model to ONNX but I am running into a Could not infer the attribute type from the elements of the passed Iterable value error (stacktrace below). The error happens because a TensorShapeProto object is passed as the value to make_attribute, which then raises the error here.

I am unsure whether this is a bug in the model code, in tf2onnx, or onnx itself. Any help would be appreciated :)
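For reference, a minimal sketch that I believe hits the same error path directly in onnx, without going through tf2onnx (the attribute name is made up; the point is only that make_attribute cannot infer an attribute type for TensorShapeProto elements):

from onnx import TensorShapeProto, helper

# An iterable whose elements are TensorShapeProto objects, similar to the
# attribute value tf2onnx forwards for the failing TF node.
shapes = [TensorShapeProto()]

# Raises: ValueError: Could not infer the attribute type from the
# elements of the passed Iterable value.
helper.make_attribute("output_shapes", shapes)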

Command:

python -m tf2onnx.convert --saved-model model --output model.onnx --verbose

Stacktrace:

, ex=Could not infer the attribute type from the elements of the passed Iterable value.
Traceback (most recent call last):
  File "/opt/conda/envs/onnx/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/onnx/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/convert.py", line 714, in <module>
    main()
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/convert.py", line 273, in main
    model_proto, _ = _convert_common(
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/convert.py", line 168, in _convert_common
    g = process_tf_graph(tf_graph, const_node_values=const_node_values,
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/tfonnx.py", line 459, in process_tf_graph
    main_g, subgraphs = graphs_from_tf(tf_graph, input_names, output_names, shape_override, const_node_values,
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/tfonnx.py", line 474, in graphs_from_tf
    ordered_func = resolve_functions(tf_graph)
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/tf_loader.py", line 778, in resolve_functions
    _, _, _, _, _, tfunctions = tflist_to_onnx(func, {})
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/tf2onnx/tf_utils.py", line 462, in tflist_to_onnx
    onnx_node = helper.make_node(node_type, input_names, output_names, name=node.name, **attr)
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/onnx/helper.py", line 163, in make_node
    node.attribute.extend(
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/onnx/helper.py", line 164, in <genexpr>
    make_attribute(key, value)
  File "/opt/conda/envs/onnx/lib/python3.10/site-packages/onnx/helper.py", line 881, in make_attribute
    raise ValueError(
ValueError: Could not infer the attribute type from the elements of the passed Iterable value.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 18.04*): Ubuntu 20.04
  • TensorFlow Version: 2.14.0 (the model was also generated with tensorflow 2.14)
  • Python version: 3.10
  • ONNX version (if applicable, e.g. 1.11*): 1.14.1
  • ONNXRuntime version (if applicable, e.g. 1.11*): None
  • tf2onnx version: 1.15.1

Full output logs: logs.txt

guarin avatar Oct 25 '23 15:10 guarin

Is it possible to share your code so we can debug it locally?

fatcat-z avatar Oct 27 '23 10:10 fatcat-z

The model is from this repository: https://github.com/google-research/scenic/tree/main/scenic/projects/owl_vit
The model code is here: https://github.com/google-research/scenic/blob/30374f97f8fb74a25bdd6220dee2e7f29fb509da/scenic/projects/owl_vit/models.py#L74
And this is a colab to export the model as a TensorFlow SavedModel: https://colab.research.google.com/drive/1NI5-5AjxtdbeV9b5uKZPQYwnGxYc5Uak?usp=sharing

Hope this helps, thanks a lot for looking into this!

Let me know if you need anything else :)

guarin avatar Oct 27 '23 11:10 guarin

Same issue, any updates on this?

kjabon avatar Nov 30 '23 21:11 kjabon

The model is from this repository: https://github.com/google-research/scenic/tree/main/scenic/projects/owl_vit
The model code is here: https://github.com/google-research/scenic/blob/30374f97f8fb74a25bdd6220dee2e7f29fb509da/scenic/projects/owl_vit/models.py#L74
And this is a colab to export the model as a TensorFlow SavedModel: https://colab.research.google.com/drive/1NI5-5AjxtdbeV9b5uKZPQYwnGxYc5Uak?usp=sharing

Hope this helps, thanks a lot for looking into this!

Let me know if you need anything else :)

Could you please attach a model file generated after calling the convert_and_save_model() method so I can debug it locally?

fatcat-z avatar Dec 28 '23 11:12 fatcat-z

Same issue, any updates on this?

Could you please attach a model file where you met this issue and share your errors with me?

fatcat-z avatar Dec 28 '23 11:12 fatcat-z

Hi @fatcat-z, you can find a zip of the model here: https://drive.google.com/file/d/1KOfAgq_D7b-HScZU_H6B18nQtWnHAmlL/view?usp=drive_link

guarin avatar Jan 02 '24 13:01 guarin

Hi @fatcat-z, you can find a zip of the model here: https://drive.google.com/file/d/1KOfAgq_D7b-HScZU_H6B18nQtWnHAmlL/view?usp=drive_link

Thanks for sharing, it helps with the debugging.

The model contains some ops like XlaCallModule which are not supported by tf2onnx; this error is raised when ONNX tries to retrieve their attributes, which is expected. We don't have a good way to handle ops like XlaCallModule that are generated by JAX.
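You can confirm this by loading the SavedModel and listing the op types in its graph. A quick sketch (the "serving_default" signature name is an assumption, and XLA ops may also sit in nested functions):

import tensorflow as tf

loaded = tf.saved_model.load("model")
# "serving_default" is the usual signature name; adjust it if your export differs.
fn = loaded.signatures["serving_default"]
op_types = {op.type for op in fn.graph.get_operations()}
print("XlaCallModule" in op_types)  # True means tf2onnx cannot convert this graph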

Did you generate this model by calling the jax2tf.convert() method on an existing JAX model? If so, could you please try setting enable_xla=False when calling jax2tf.convert(), and then try converting the new model to ONNX with tf2onnx to see if it works?
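For example, the export would look roughly like this (just a sketch, not the actual colab code; predict_fn stands in for the real JAX inference function and the input shape is illustrative):

import jax.numpy as jnp
import tensorflow as tf
from jax.experimental import jax2tf

def predict_fn(x):
    # Stand-in for the real JAX inference function.
    return jnp.tanh(x)

# The key change: enable_xla=False avoids emitting XLA ops such as XlaCallModule.
tf_fn = tf.function(
    jax2tf.convert(predict_fn, enable_xla=False),
    input_signature=[tf.TensorSpec([1, 224, 224, 3], tf.float32)],
)

module = tf.Module()
module.f = tf_fn
tf.saved_model.save(module, "model_no_xla", signatures=module.f.get_concrete_function())

Then convert the new SavedModel as before:

python -m tf2onnx.convert --saved-model model_no_xla --output model.onnx --verbose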

fatcat-z avatar Jan 04 '24 12:01 fatcat-z

Thanks for looking into it!

Did you generate this model by calling the jax2tf.convert() method on an existing JAX model? If so, could you please try setting enable_xla=False when calling jax2tf.convert(), and then try converting the new model to ONNX with tf2onnx to see if it works?

I just tried running the colab with enable_xla=False, but the conversion to TensorFlow doesn't seem to work; I will have to look into it in detail:


NotImplementedError: Call to gather cannot be converted with enable_xla=False. Unsupported arguments for gather: operand shape=(100*batch, 16, 512), start_indices=Tensor("jax2tf_predict_fn_/TextZeroShotDetectionModule/TextZeroShotDetectionModule.text_embedder/backbone/clip/clip.encode_text/text/concat:0", shape=(None, 2), dtype=int32), dimension_numbes=GatherDimensionNumbers(offset_dims=(1,), collapsed_slice_dims=(0, 1), start_index_map=(0, 1)), slice_sizes=(1, 1, 512), errors:
<function _gather_for_scalar_indexing at 0x785343c0fd00>: ValueError('start_indices shape should be 1')
<function _gather_for_multidim_indexing at 0x785343c0feb0>: ValueError('unsupported dimension numbers')
<function _gather_with_batch_dim at 0x7853439e8160>: ValueError("Dimensions must be equal, but are 2 and 3 for '{{node jax2tf_predict_fn_/TextZeroShotDetectionModule/TextZeroShotDetectionModule.text_embedder/backbone/clip/clip.encode_text/text/clip_by_value/Minimum}} = Minimum[T=DT_INT32](jax2tf_predict_fn_/TextZeroShotDetectionModule/TextZeroShotDetectionModule.text_embedder/backbone/clip/clip.encode_text/text/concat, jax2tf_predict_fn_/TextZeroShotDetectionModule/TextZeroShotDetectionModule.text_embedder/backbone/clip/clip.encode_text/text/Sub)' with input shapes: [?,2], [3].")
<function _gather_with_batch_dims at 0x7853439e8310>: ValueError('only len(collapsed_slice_dims) == 0 is supported') - See source code for the precise conditions under which it can be converted without XLA.

guarin avatar Jan 09 '24 10:01 guarin

Did you generate this model by calling the jax2tf.convert() method on an existing JAX model? If so, could you please try setting enable_xla=False when calling jax2tf.convert(), and then try converting the new model to ONNX with tf2onnx to see if it works?

Unfortunately this isn't feasible for me, as some XLA functions are required (without major refactoring). However, to confirm: yes, this is what I was doing that caused my issue. Too bad ONNX can't handle XLA seamlessly, it would open a lot of doors :(

kjabon avatar Jan 17 '24 14:01 kjabon

Did you generate this model by calling the jax2tf.convert() method on an existing JAX model? If so, could you please try setting enable_xla=False when calling jax2tf.convert(), and then try converting the new model to ONNX with tf2onnx to see if it works?

Unfortunately this isn't feasible for me, as some XLA functions are required (without major refactoring). However, to confirm: yes, this is what I was doing that caused my issue. Too bad ONNX can't handle XLA seamlessly, it would open a lot of doors :(

IIUC, XLA is something that helps improve the performance of a TensorFlow model, while converting a TF model to ONNX serves the same purpose. What XLA does is special; it is not a common op that is generic across different DL frameworks like PyTorch and ONNX, so we can't provide a corresponding ONNX op for its functionality.

fatcat-z avatar Jan 18 '24 02:01 fatcat-z

^All good points. Yeah, both JAX and TF use the XLA compiler for that. Part of the issue here is that jax2tf doesn't fully support converting all StableHLO ops (what JAX natively outputs and XLA can consume) to TF ops, so it has to export them as XLA.

Though I've since worked around jax->tf->onnx by instead doing jax->tf->shared library (a lot of work), to your point, it seems like what I and others actually need in order to stay in high-level frameworks is an "onnx/jax-onnx" tool, or a more feature-complete jax2tf. Though I'm not holding my breath for either.

kjabon avatar Jan 18 '24 14:01 kjabon