Loosen model interface checks
Hello.
This patch loosens the checking of the model interface in the gRPC Predict API. If the keys of "request.inputs()" satisfy the required keys of "signature.inputs()", the request can be served; any extra inputs are simply ignored. This lets you add features to requests against a running gRPC server at any time, even though they are not used for prediction. The advantage of this patch is that the model's API stays backwards compatible when A/B testing a model with added features on a running service.
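To illustrate the intended semantics, here is a minimal Python sketch (the actual change is in the C++ gRPC Predict handler; the function and variable names here are hypothetical):

```python
def validate_inputs(request_inputs, signature_inputs):
    """Loosened check: every input the signature requires must be present
    in the request; extra request inputs are ignored instead of rejected."""
    missing = [key for key in signature_inputs if key not in request_inputs]
    if missing:
        raise ValueError(f"Request is missing required inputs: {missing}")
    # The strict (previous) behavior would additionally reject unknown keys:
    #   extra = [key for key in request_inputs if key not in signature_inputs]
    #   if extra: raise ValueError(f"Unknown inputs: {extra}")
```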
Let me know what you think.
@shkwsk I also ran into the same issue. Please let me know if the following is what you mean:
My TFS model uses 10 features, but my gRPC request sends 11 features (the 10 features the model needs + 1 additional feature, called dummy, which the model does not need) --> TF Serving fails with the following error:
{
"error": "Failed to process element: 0 key: dummy of 'instances' list. Error: Invalid argument: JSON object: does not have named input: dummy"
}
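For reference, a hedged sketch of the kind of REST call that produces this error (host, port, model name, and feature names are assumptions for illustration):

```python
import requests

# TF Serving REST endpoint (host, port, and model name are assumptions).
url = "http://localhost:8501/v1/models/my_model:predict"

# The 10 features the model expects, plus one extra "dummy" key.
instance = {f"feature_{i}": 0.0 for i in range(10)}
instance["dummy"] = 0.0  # the unexpected input that triggers the error

resp = requests.post(url, json={"instances": [instance]})
print(resp.status_code, resp.text)  # "does not have named input: dummy"
```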
When you want to do A/B testing as follows, let's say the gRPC client sends a request with 20 features:
- modelA (needs 10 features from that request)
- modelB (needs 5 features from that request)
Currently TensorFlow Serving would not allow that (see the sketch below).
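A rough Python sketch of that shared request (host, port, feature names, and signature name are assumptions; with the loosened check, each model would consume only the inputs its signature declares):

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

def predict(model_name, features):
    """Send one request carrying all 20 features to the named model."""
    request = predict_pb2.PredictRequest()
    request.model_spec.name = model_name
    request.model_spec.signature_name = "serving_default"
    for name, value in features.items():
        request.inputs[name].CopyFrom(tf.make_tensor_proto([value]))
    return stub.Predict(request, timeout=5.0)

features = {f"feature_{i}": 0.0 for i in range(20)}  # 20 features per request
resp_a = predict("modelA", features)  # modelA's signature uses 10 of them
resp_b = predict("modelB", features)  # modelB's signature uses 5 of them
```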
cc @rmothukuru
@geraldstanje Thanks for your comment.
It seems to be the same as your issue, but from the error message you show, it looks like you're using the RESTful API. This patch loosens the check for keys in the protocol buffer map in the gRPC API.
The RESTful API could probably get the same fix, but I don't have a good approach right now. In #1622, I modified the RESTful API to use protocol buffers, and you can use it in conjunction with this PR to get rid of your error messages :)
@shkwsk thanks for the PR - let's get it merged. I'm using gRPC but only tested it with the REST API in this case... that PR will make the testing of different models possible.
@geraldstanje I won't merge that PR into this one because the issues they solve are different. In essence, it would be better to loosen the checks in the RESTful API as well.
In any case, I'll be waiting for the official reviewers' comments.
@shkwsk I think both the gRPC and REST interfaces need to implement the same checks...
cc @misterpeddy @lilao
@geraldstanje I tried loosening the checks in the RESTful API as well. Unfortunately, I can't prepare a model to handle the instances format, so I haven't tested it. Could you please check whether there is still an error?
@shkwsk OK, let me test over the weekend.
@shkwsk do you have a script that builds the TFS Docker image from your branch?
@geraldstanje I think you can switch branches and build as documented: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/setup.md#build I also merged in the latest commits so it can be built.
@shkwsk I tried, but that failed since Dockerfile.devel clones https://github.com/tensorflow/serving... but your code is in https://github.com/shkwsk/serving/tree/shkwsk-patch1
#!/bin/bash
USER=geri
TF_SERVING_VERSION_GIT_BRANCH=shkwsk-patch1
TAG=shkwsk-patch1

# Clone the patched branch and build from inside the checkout.
git clone --branch="${TF_SERVING_VERSION_GIT_BRANCH}" https://github.com/shkwsk/serving
cd serving || exit 1

# Build the devel image (compiles TF Serving from source).
docker build --pull -t "${USER}/tensorflow-serving-devel:${TAG}" \
  --build-arg TF_SERVING_VERSION_GIT_BRANCH="${TF_SERVING_VERSION_GIT_BRANCH}" \
  -f tensorflow_serving/tools/docker/Dockerfile.devel .

# Build the runtime image from the devel image. Note: no second "cd serving"
# here -- we are already inside the checkout.
docker build -t "${USER}/tensorflow-serving:${TAG}" \
  --build-arg TF_SERVING_BUILD_IMAGE="${USER}/tensorflow-serving-devel:${TAG}" \
  -f tensorflow_serving/tools/docker/Dockerfile .
any idea?
@geraldstanje How about temporarily rewriting Dockerfile.devel? Could you please edit it like this after the git clone?
diff --git a/tensorflow_serving/tools/docker/Dockerfile.devel b/tensorflow_serving/tools/docker/Dockerfile.devel
index fb247c1..3ac4a35 100644
--- a/tensorflow_serving/tools/docker/Dockerfile.devel
+++ b/tensorflow_serving/tools/docker/Dockerfile.devel
@@ -88,8 +88,8 @@ RUN mkdir /bazel && \
# Download TF Serving sources (optionally at specific commit).
WORKDIR /tensorflow-serving
-RUN git clone --branch=${TF_SERVING_VERSION_GIT_BRANCH} https://github.com/tensorflow/serving . && \
- git remote add upstream https://github.com/tensorflow/serving.git && \
+RUN git clone --branch=${TF_SERVING_VERSION_GIT_BRANCH} https://github.com/shkwsk/serving . && \
+ git remote add upstream https://github.com/tensorflow/serving.git && \
if [ "${TF_SERVING_VERSION_GIT_COMMIT}" != "head" ]; then git checkout ${TF_SERVING_VERSION_GIT_COMMIT} ; fi
@shkwsk I tested both the gRPC and REST APIs; both worked fine.
please take a look: @misterpeddy @lilao @netfs
@geraldstanje Thanks for your testing!
@misterpeddy @lilao @netfs could you please review and merge this PR?
I think having a different number of features in the request compared to what the model expects is usually an indicator of training vs. serving "skew".
You are assuming here that it's kosher to send a request with more features than the model expects. This may be OK for your scenario, but in other cases it might be an indicator of a bad model pushed to serving (and you will want to fail the request in such cases, so that monitoring can alert appropriately).
IOW, I am not convinced that this change is safe for all situations and that it's OK to loosen this check.
@netfs I understand what you mean.
But I need this patch for model migration on a running service. Loosening the checks allows adding/removing features seamlessly.
If you are not convinced that the validation checks should be loosened by default, how about adding an option to opt into the loosened behavior?
> @netfs I understand what you mean.
> But I need this patch for model migration on a running service. Loosening the checks allows adding/removing features seamlessly.

Why can't versioning be used to do this safely? IOW send versioned requests. And maybe use labels to make the transition usable [0] (sketched below).

> If you are not convinced that the validation checks should be loosened by default, how about adding an option to opt into the loosened behavior?

[0] https://www.tensorflow.org/tfx/serving/serving_config#assigning_string_labels_to_model_versions_to_simplify_canary_and_rollback
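For context, the label-based flow from [0] would look roughly like this on the client side (a hedged sketch; the model name, label, host, port, and feature name are assumptions):

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"
# Pin the request to a server-side version label (e.g. "canary" vs "stable")
# instead of a hard-coded version number.
request.model_spec.version_label = "canary"
request.inputs["feature_0"].CopyFrom(tf.make_tensor_proto([0.0]))
response = stub.Predict(request, timeout=5.0)
```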
@netfs If you want to switch model versions, you need to distribute a configuration file to the servers.
It is difficult to manage the status of all servers while distributing the file to each one. Also, the client would need to know the status of each server and the version of the model on that stateful server.
I update multiple models daily in our S3 bucket and run A/B tests between the updated models. It is not practical to switch the model version on the client side against a stateful server every day. This is very costly, so I want to switch requests by model_name.
> I update multiple models daily in our S3 bucket and run A/B tests between the updated models. It is not practical to switch the model version on the client side against a stateful server every day.

You still need to notify the client to switch its inference request to contain a different set of features (corresponding to the new B version of the model) -- no? How is this switch different from asking the client to also use a different explicit version?

> This is very costly, so I want to switch requests by model_name.
@netfs @shkwsk @chrisolston any progress on that?