TensorFlow Serving fails to serve a TFLite model with multiple signatures
Bug Report
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04 / Kubernetes on AWS
- TensorFlow Serving installed from (source or binary): binary
- TensorFlow Serving version: 2.8.2
Describe the problem
I am experimenting with quantization on a model with multiple signatures, but when I try to serve it with TensorFlow Serving, the multiple signatures are not recognized and a single, invalid default one is generated instead.
Exact Steps to Reproduce
Here is minimal code to reproduce the problem:
- Generate a model:
import tensorflow as tf
import os

SAVED_MODEL_PATH = os.path.join(os.path.dirname(__file__), 'model', '1')
TFLITE_MODEL_PATH = os.path.join(SAVED_MODEL_PATH, 'model.tflite')


class Model(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def encode(self, x):
        result = tf.strings.as_string(x)
        return {
            "encoded_result": result
        }

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
    def decode(self, x):
        result = tf.strings.to_number(x)
        return {
            "decoded_result": result
        }


# Build and save model with 2 signatures
model = Model()
tf.saved_model.save(model, SAVED_MODEL_PATH,
                    signatures={
                        'encode': model.encode.get_concrete_function(),
                        'decode': model.decode.get_concrete_function()
                    })

# Convert the saved model using TFLiteConverter and save
converter = tf.lite.TFLiteConverter.from_saved_model(
    SAVED_MODEL_PATH,
    signature_keys=['encode', 'decode']
)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]
tflite_model = converter.convert()
with open(TFLITE_MODEL_PATH, 'wb') as f:
    f.write(tflite_model)

# Check saved model has multiple signatures:
# $ saved_model_cli show --all --dir model/1

# Check saved tflite model has multiple signatures
interpreter = tf.lite.Interpreter(model_path=TFLITE_MODEL_PATH)
signatures = interpreter.get_signature_list()
print(signatures)
One can confirm that the model has 2 signatures that a tf.lite.Interpreter recognizes; the stdout is:
{'decode': {'inputs': ['x'], 'outputs': ['decoded_result']}, 'encode': {'inputs': ['x'], 'outputs': ['encoded_result']}}
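For completeness, here is a minimal sketch (my own addition, using the tf.lite.Interpreter signature-runner API and assuming the flex delegate that ships with the full TF pip package) showing that both signatures are individually callable outside of TF Serving:

# Minimal sketch (illustrative addition): call each signature through the
# tf.lite.Interpreter signature-runner API.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model/1/model.tflite')

encode = interpreter.get_signature_runner('encode')
print(encode(x=np.array([1.5, 2.5], dtype=np.float32))['encoded_result'])

decode = interpreter.get_signature_runner('decode')
print(decode(x=np.array([b'1.5', b'2.5']))['decoded_result'])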
- Launch TensorFlow Serving with that TFLite model:
docker run -t --rm -p 8501:8501 -v "$PWD/model:/models/model" -e MODEL_NAME=model tensorflow/serving:2.8.2 --prefer_tflite_model=true
One should see in the logs:
... W tensorflow_serving/servables/tensorflow/tflite_session.cc:485] No signature def found in TFLite model. Generating one.
- You may also check via the REST interface that only a serving_default entrypoint is available, which is a wrapped decode (i.e. its inputs are decode_x instead of x):
curl -X GET http://localhost:8501/v1/models/model/versions/1/metadata
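For illustration, here is the same call from Python (a sketch of my own, assuming the container above is running and that the requests package is installed); note the renamed decode_x input of the generated signature:

# Sketch (assumes the docker container above is running and that the
# 'requests' package is installed): call the generated serving_default
# signature, whose input was renamed from 'x' to 'decode_x'.
import requests

resp = requests.post(
    'http://localhost:8501/v1/models/model/versions/1:predict',
    json={'inputs': {'decode_x': ['1.5']}},
)
print(resp.json())  # should contain the decoded float(s)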
First investigations
I can also reproduce the problem with TF Serving 2.9.3 and 2.10.0.
If I use the model used for unit tests in the repo (https://github.com/tensorflow/serving/tree/master/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_tflite_with_sigdef/00000123), I do not have the problem. However, when I inspect the binaries, I see that the headers are different: the unit-test model has a signature_defs_metadata field that I do not find in the freshly created model (a crude byte-level check is sketched below).
The unit-test model (saved_model_half_plus_two_tflite_with_sigdef) seems to have been generated with an old version of TF (https://github.com/tensorflow/serving/blob/ef6c4d90ad98dff3507f5af5aa75eab809524a9e/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two.py).
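As a crude way to check for that field (my own sketch, a raw byte search rather than an official API):

# Crude sketch: check whether a .tflite file contains the legacy
# 'signature_defs_metadata' key by searching the raw FlatBuffer bytes.
def has_legacy_sigdef_metadata(path):
    with open(path, 'rb') as f:
        return b'signature_defs_metadata' in f.read()

print(has_legacy_sigdef_metadata('model/1/model.tflite'))  # False for the fresh model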
It seems to me that the following line does not do its job correctly: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/servables/tensorflow/tflite_session.cc#L452
Thank you.
@flesaint,
Unfortunately, adding logic to serve metadata other than SignatureDefs is not on the roadmap right now. There is a similar feature request for supporting custom metadata in Serving, https://github.com/tensorflow/serving/issues/1248, in the works. I would suggest you +1 that issue and follow it for updates.
Meanwhile, can you please provide the saved_model_cli output, try serving the TensorFlow model instead of the TFLite model, and share the findings with us so we can debug the issue. Thank you!
This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.
Sorry, I am away from a proper setup to provide the requested info, but will be back ASAP.
But I only want multiple signatures, not extra custom metadata.
@flesaint,
Please help us with the saved_model_cli output for your served model so we can debug the issue further. Thanks!
Hello,
Using tensorflow:2.8.2 with the model.tflite file in model/1, running

saved_model_cli show --all --dir model/1

gives:
2023-05-03 16:35:09.422809: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-05-03 16:35:09.422838: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is:

signature_def['decode']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: decode_x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['decoded_result'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: PartitionedCall:0
  Method name is: tensorflow/serving/predict

signature_def['encode']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: encode_x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['encoded_result'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: PartitionedCall_1:0
  Method name is: tensorflow/serving/predict

Concrete Functions:
  Function Name: 'decode'
    Option #1
      Callable with:
        Argument #1
          x: TensorSpec(shape=(None,), dtype=tf.string, name='x')

  Function Name: 'encode'
    Option #1
      Callable with:
        Argument #1
          x: TensorSpec(shape=(None,), dtype=tf.float32, name='x')
But a curl on the tensorflow/serving Docker container gives only a "serving_default" signature:

curl -X GET http://localhost:8501/v1/models/model/versions/1/metadata

gives:
{
  "model_spec": {
    "name": "model",
    "signature_name": "",
    "version": "1"
  },
  "metadata": {
    "signature_def": {
      "signature_def": {
        "serving_default": {
          "inputs": {
            "decode_x": {
              "dtype": "DT_STRING",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "decode_x:0"
            }
          },
          "outputs": {
            "PartitionedCall": {
              "dtype": "DT_FLOAT",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "PartitionedCall:0"
            }
          },
          "method_name": "tensorflow/serving/predict"
        }
      }
    }
  }
}
After removing the model/1/model.tflite file, running

saved_model_cli show --all --dir model/1

gives the same result:
2023-05-03 16:38:28.821507: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-05-03 16:38:28.821528: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is:

signature_def['decode']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: decode_x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['decoded_result'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: PartitionedCall:0
  Method name is: tensorflow/serving/predict

signature_def['encode']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: encode_x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['encoded_result'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: PartitionedCall_1:0
  Method name is: tensorflow/serving/predict

Concrete Functions:
  Function Name: 'decode'
    Option #1
      Callable with:
        Argument #1
          x: TensorSpec(shape=(None,), dtype=tf.string, name='x')

  Function Name: 'encode'
    Option #1
      Callable with:
        Argument #1
          x: TensorSpec(shape=(None,), dtype=tf.float32, name='x')
However, a curl on the tensorflow/serving Docker container gives a different result, with the 2 signatures "decode" and "encode":

curl -X GET http://localhost:8501/v1/models/model/versions/1/metadata

gives:
{
  "model_spec": {
    "name": "model",
    "signature_name": "",
    "version": "1"
  },
  "metadata": {
    "signature_def": {
      "signature_def": {
        "decode": {
          "inputs": {
            "x": {
              "dtype": "DT_STRING",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "-1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "decode_x:0"
            }
          },
          "outputs": {
            "decoded_result": {
              "dtype": "DT_FLOAT",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "-1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "PartitionedCall:0"
            }
          },
          "method_name": "tensorflow/serving/predict"
        },
        "__saved_model_init_op": {
          "inputs": {},
          "outputs": {
            "__saved_model_init_op": {
              "dtype": "DT_INVALID",
              "tensor_shape": {
                "dim": [],
                "unknown_rank": true
              },
              "name": "NoOp"
            }
          },
          "method_name": ""
        },
        "encode": {
          "inputs": {
            "x": {
              "dtype": "DT_FLOAT",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "-1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "encode_x:0"
            }
          },
          "outputs": {
            "encoded_result": {
              "dtype": "DT_STRING",
              "tensor_shape": {
                "dim": [
                  {
                    "size": "-1",
                    "name": ""
                  }
                ],
                "unknown_rank": false
              },
              "name": "PartitionedCall_1:0"
            }
          },
          "method_name": "tensorflow/serving/predict"
        }
      }
    }
  }
}
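For reference, with the TensorFlow model served, each signature can also be targeted explicitly through the signature_name field of the REST predict API (a sketch of my own, same container setup assumed):

# Sketch (same container setup assumed): target a specific signature
# via the 'signature_name' field of the REST predict request.
import requests

resp = requests.post(
    'http://localhost:8501/v1/models/model/versions/1:predict',
    json={'signature_name': 'encode', 'inputs': {'x': [1.5, 2.5]}},
)
print(resp.json())  # should contain the stringified floats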
Thanks for your help.
@flesaint,
Interesting to see that TF Serving identifies multiple signatures for the TensorFlow model but fails to do so for the TFLite model. Thank you for providing the saved_model_cli output and the model server metadata. Let me discuss this with the team and we will get back to you.