TF2 exporter doesn't support dynamic batch size
1. The entire URL of the file you are using
https://github.com/tensorflow/models/blob/master/research/object_detection/exporter_main_v2.py
2. Describe the bug
exporter_main_v2.py does not support changing the input shape. This feature was added to the TF1 exporter (export_inference_graph.py) in #2053, but it does not appear to be included in the current TF2 version. As a result, all exported SavedModels have a fixed batch size of 1. In order to enable dynamic batching in Triton Inference Server, we need a dynamic batch size. Please see the related issue and recommended fix on the Triton GitHub: https://github.com/triton-inference-server/server/issues/2097
3. Steps to reproduce
Follow the tutorial on exporting a trained model: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html#exporting-a-trained-model
4. Expected behavior
exporter_main_v2.py should accept an input_shape argument and default to [None, None, None, 3] so that exported models support dynamic batching, similar to the TF1 export_inference_graph.py. See the sketch below.
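To illustrate, here is a minimal sketch of what that default would allow. The export path is hypothetical, and this assumes the exported module is directly callable on a uint8 image batch, as in the OD API tutorials:

import tensorflow as tf

# Hypothetical path; point this at your own exported model.
MODEL_DIR = "exported-model/saved_model"

detect_fn = tf.saved_model.load(MODEL_DIR)

# With a [None, None, None, 3] signature, a single export could serve any
# batch size. Today this call fails because the exporter pins the batch
# dimension to 1.
batch = tf.zeros([4, 640, 640, 3], dtype=tf.uint8)
detections = detect_fn(batch)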
Hello, any updates on this issue/feature request? If I'm doing something wrong, please let me know. I'm just trying to get dynamic batching to work on Triton Inference Server. @tombstone
https://github.com/tensorflow/models/blob/7beddae1ff7207e7738693cdcdec389d16be83d3/research/object_detection/exporter_lib_v2.py#L133
How about just changing this line to shape=[None, None, None, 3]? It worked for me.
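Roughly, assuming the TensorSpec on that line pins the batch dimension to 1 (I'm not reproducing the surrounding code verbatim):

# Before: batch dimension fixed to 1
input_signature = [tf.TensorSpec(shape=[1, None, None, 3], dtype=tf.uint8)]

# After: dynamic batch dimension, as suggested
input_signature = [tf.TensorSpec(shape=[None, None, None, 3], dtype=tf.uint8)]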
Thanks for the reply and suggestion, @lsrock1. I checked out 7beddae and made the change. Here's the stack trace:
Traceback (most recent call last):
  File "exporter_main_v2.py", line 159, in <module>
    app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "exporter_main_v2.py", line 155, in main
    FLAGS.side_input_types, FLAGS.side_input_names)
  File "/home/tensorflow/models/research/object_detection/exporter_lib_v2.py", line 259, in export_inference_graph
    concrete_function = detection_module.__call__.get_concrete_function()
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 1167, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 1073, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 697, in _initialize
    *args, **kwds))
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2855, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3075, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 600, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/tensorflow/models/research/object_detection/exporter_lib_v2.py:142 call_func *
        return self._run_inference_on_images(input_tensor, **kwargs)
    /home/tensorflow/models/research/object_detection/exporter_lib_v2.py:107 _run_inference_on_images *
        detections = self._model.postprocess(prediction_dict, shapes)
    /home/tensorflow/models/research/object_detection/meta_architectures/center_net_meta_arch.py:2890 postprocess *
        boxes_strided, classes, scores, num_detections = (
    /home/tensorflow/models/research/object_detection/meta_architectures/center_net_meta_arch.py:357 prediction_tensors_to_boxes *
        heights, widths = tf.unstack(height_width, axis=2)
    /home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper **
        return target(*args, **kwargs)
    /home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:1558 unstack
        raise ValueError("Cannot infer num from shape %s" % value_shape)

    ValueError: Cannot infer num from shape (None, 100, None)
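If I'm reading the trace right, the failure is in tf.unstack, which needs the static length of the axis it splits; once the batch dimension is dynamic, that shape information gets lost. A minimal reproduction, with a possible workaround (passing num explicitly; untested against CenterNet itself):

import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 100, None])])
def split_hw(height_width):
    # Fails the same way, since the static length of axis 2 is unknown:
    # heights, widths = tf.unstack(height_width, axis=2)

    # Possible workaround: tell unstack the axis length explicitly.
    heights, widths = tf.unstack(height_width, num=2, axis=2)
    return heights, widths

# Traces without the "Cannot infer num" error.
split_hw.get_concrete_function()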
Were there any other changes? I didn't give you my environment before: I'm running tf2.3.0-gpu in a Docker container. Dockerfile is attached.
It looks like this issue is specific to the CenterNet architecture. I confirmed that your change works for faster_rcnn_resnet101_v1_640x640_coco17_tpu-8 (at least I didn't get any errors; I haven't tested dynamic batching in Triton yet), but not for centernet_resnet101.
I did double-check that export works fine on CenterNet without the change to line 133 of exporter_lib_v2.py, to make sure I wasn't sending the wrong command.
I confirmed that your change enabled dynamic batching for the Faster R-CNN architecture on Triton Inference Server; I'll link your solution there as well. Still need a fix for CenterNet, though. Thanks!
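For anyone else setting this up: once the SavedModel has a dynamic batch dimension, dynamic batching on Triton is turned on in the model's config.pbtxt. A sketch, with an illustrative model name and batch sizes:

name: "faster_rcnn_resnet101"
platform: "tensorflow_savedmodel"
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}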
I am glad that it helped!
Can we leave this open until we get a solution to the error related to centernet, though? That's the one I actually need.
+1
Is there a fix for this yet?
Any updates on batch inference support in CenterNet models?
Still looking for a fix on this one.
I am also looking for a fix for this.
I've met the same problem; any plan to fix this? Thanks.
@chad-green, any update on the CenterNet model? Do we have a fix yet?