My Demo BERT Model Failed to Serve
I am trying to use TensorFlow Serving to serve a Keras BERT model, but predictions through the REST API fail. The details are below. Can you please help me resolve this problem?
predict output (ERROR)
{
"error": "Op type not registered 'TFText>RoundRobinTrim' in binary running on ljh-my-keras-bert-model. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib (e.g. tf.contrib.resampler), accessing should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed."
}
my local versions
Python 3.10.14
tensorflow 2.18.0
tensorflow-datasets 4.9.6
tensorflow-io-gcs-filesystem 0.37.1
tensorflow-metadata 1.16.1
tensorflow-text 2.18.0
keras 3.6.0
keras-hub-nightly 0.16.1.dev202410210343
keras-nlp 0.17.0
model definition
import os
os.environ["KERAS_BACKEND"] = "tensorflow" # "jax" or "tensorflow" or "torch"
import tensorflow_datasets as tfds
import keras_nlp
imdb_train, imdb_test = tfds.load(
    "imdb_reviews",
    split=["train", "test"],
    as_supervised=True,
    batch_size=16,
)
import keras
# Load a model.
classifier = keras_nlp.models.BertClassifier.from_preset(
    "bert_tiny_en_uncased",
    num_classes=2,
    activation="softmax",
)
# Compile the model.
classifier.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=keras.optimizers.Adam(5e-5),
    metrics=["sparse_categorical_accuracy"],
    jit_compile=True,
)
# Fine-tune.
classifier.fit(imdb_train.take(250), validation_data=imdb_test.take(250))
# Predict new examples.
classifier.predict(["What an amazing movie!", "A total waste of my time."])
# expected output: array([[0.34156954, 0.65843046], [0.52648497, 0.473515 ]], dtype=float32)
save the model to a local path
import tensorflow as tf
import keras_nlp
def preprocess(inputs):
    # Convert input strings to token IDs, padding mask, and segment IDs
    preprocessor = classifier.preprocessor
    encoded = preprocessor(inputs)
    return {
        'token_ids': encoded['token_ids'],
        'padding_mask': encoded['padding_mask'],
        'segment_ids': encoded['segment_ids']
    }
@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
def serving_fn(inputs):
    preprocessed = preprocess(inputs)
    outputs = classifier(preprocessed)
    return outputs
# Save the model
model_export_path = "/Users/xxx/tf_saved_models/my-keras-bert-model/1"
tf.saved_model.save(
    classifier,
    export_dir=model_export_path,
    signatures={"serving_default": serving_fn}
)
print(f"Model saved to: {model_export_path}")
build the tensorflow serving docker image
FROM tensorflow/serving:latest
COPY my-keras-bert-model /models/model
RUN ls /models/model
# Set the model environment variables
# ENV OMP_NUM_THREADS 4
# ENV TF_NUM_INTEROP_THREADS 4
# ENV TF_NUM_INTRAOP_THREADS 4
# Start TensorFlow Serving
ENTRYPOINT ["tensorflow_model_server"]
CMD ["--port=8500", "--rest_api_port=8080", "--model_name=model", "--model_base_path=/models/model"]
predict request
POST http://localhost:8080/v1/models/model/versions/1:predict
Content-Type: application/json
{"instances": ["What an amazing movie!", "A total waste of my time."]}
Hi @cceasy, Thank you for reporting. I was able to reproduce the issue. I will check on this internally and update here. Below is the error:
2024-11-12 09:34:40.365027: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
E0000 00:00:1731404080.457354 107 mlir_bridge_pass_util.cc:68] Failed to parse __inference_serving_fn_19270: Op type not registered 'TFText>RoundRobinTrim' in binary running on 58d2778e1319. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib (e.g. `tf.contrib.resampler`), accessing should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
I0000 00:00:1731404080.461934 107 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
.
.
2024-11-12 09:34:40.505939: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:135] tfg_optimizer{any(tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export)} failed: INVALID_ARGUMENT: Unable to find OpDef for TFText>RoundRobinTrim
While importing function: __inference_serving_fn_19270
when importing GraphDef to MLIR module in GrapplerHook
Thank you!
I have the same issue. Is there any update on this? Thanks!
I'm also experiencing this issue with serving a Gemma 3 model.
I have the same issue. Any updates on this?