seldon-core
Metadata.yaml does not work with tensorflow prepackaged server / seldon protocol
Describe the bug
Placing a metadata.yaml file with metadata about the model in the model's S3 bucket does not work when using the prepackaged TensorFlow server with the Seldon protocol. When executing curl service:/api/v1.0/metadata | jq .
this metadata (see "To reproduce" below for the exact YAML) should be returned, but instead I get
{
  "name": "default",
  "models": {
    "mnist-model": {
      "name": "seldonio/tfserving-proxy",
      "versions": [
        "1.12.0-dev"
      ],
      "inputs": [],
      "outputs": []
    }
  },
  "graphinputs": [],
  "graphoutputs": []
}
The metadata is not imported from metadata.yaml but is seemingly taken from the image name of the model container (seldonio/tfserving-proxy:1.12.0-dev). According to https://docs.seldon.io/projects/seldon-core/en/latest/referenceapis/metadata.html#prepackaged-model-servers, the metadata presented should come from metadata.yaml.
To reproduce
I run the mnist example from https://docs.seldon.io/projects/seldon-core/en/latest/servers/tensorflow.html with an extra metadata.yaml file in the bucket.
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: tfserving
spec:
  name: mnist
  predictors:
  - graph:
      children: []
      implementation: TENSORFLOW_SERVER
      modelUri: s3://seldon-models/tfserving/mnist-model-with-metadata
      storageInitializerImage: r-clone-with-org-certs:latest
      name: mnist-model
      parameters:
      - name: signature_name
        type: STRING
        value: predict_images
      - name: model_name
        type: STRING
        value: mnist-model
    name: default
    replicas: 1
Metadata.yaml
name: mnist-model
versions: [1]
platform: tensorflow
Expected behaviour
curl mnist-model-default:8000/api/v1.0/metadata | jq . should yield
{
  "name": "default",
  "models": {
    "mnist-model": {
      "name": "mnist-model",
      "platform": "tensorflow",
      "versions": [
        "1"
      ],
      "inputs": [],
      "outputs": []
    }
  },
  "graphinputs": [],
  "graphoutputs": []
}
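To make the difference between the observed and expected responses easy to check from a script, here is a minimal sketch that compares one model's entry in the /api/v1.0/metadata response with the fields from metadata.yaml. The function name model_metadata_mismatches is made up for illustration and is not part of seldon-core; the observed payload is the response quoted in this issue, abridged.

```python
import json


def model_metadata_mismatches(response_text, model_name, expected):
    """Compare one model's entry in a metadata response with expected fields.

    Returns a dict mapping each differing field to an (actual, expected) pair.
    Hypothetical helper for illustration; not part of seldon-core.
    """
    body = json.loads(response_text)
    actual = body.get("models", {}).get(model_name, {})
    return {
        key: (actual.get(key), want)
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Response reported in this issue, abridged to the fields being compared.
observed = json.dumps({
    "name": "default",
    "models": {
        "mnist-model": {
            "name": "seldonio/tfserving-proxy",
            "versions": ["1.12.0-dev"],
            "inputs": [],
            "outputs": [],
        }
    },
})

# Fields the issue expects to see, taken from metadata.yaml.
expected = {"name": "mnist-model", "platform": "tensorflow", "versions": ["1"]}

mismatches = model_metadata_mismatches(observed, "mnist-model", expected)
for field, (got, want) in mismatches.items():
    print(f"{field}: got {got!r}, expected {want!r}")
```

In practice the response text would come from the curl command above rather than from a hard-coded string.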
Environment
Seldon 1.12.0-dev
- Cloud Provider: On prem
- Kubernetes Cluster Version v1.21.1
- Deployed Seldon System Images:
value: docker.io/seldonio/engine:1.12.0-dev
value: seldonio/seldon-core-executor:1.12.0-dev
image: seldonio/seldon-core-operator:1.12.0-dev
Model Details
kubectl logs tfserving-mnist-default-0-mnist-model-54cb7954f9-cst2r -c mnist-model
starting microservice
2021-12-09 08:27:06.634624: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-12-09 08:27:06.634664: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-12-09 08:27:09,288 - seldon_core.microservice:main:211 - INFO: Starting microservice.py:main
2021-12-09 08:27:09,288 - seldon_core.microservice:main:212 - INFO: Seldon Core version: 1.12.0-dev
2021-12-09 08:27:09,291 - seldon_core.microservice:main:367 - INFO: Parse JAEGER_EXTRA_TAGS []
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation cni.projectcalico.org/containerID:92ef0cf2ff16b665f0f2057a1f901396bdc27c6072898eb422e0067d0ae93d48
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation cni.projectcalico.org/podIP:10.42.8.71/32
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation cni.projectcalico.org/podIPs:10.42.8.71/32
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation kubernetes.io/config.seen:2021-12-09T08:27:02.472518348Z
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation kubernetes.io/config.source:api
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation prometheus.io/path:/prometheus
2021-12-09 08:27:09,291 - seldon_core.microservice:load_annotations:163 - INFO: Found annotation prometheus.io/scrape:true
2021-12-09 08:27:09,291 - seldon_core.microservice:main:370 - INFO: Annotations: {'cni.projectcalico.org/containerID': '92ef0cf2ff16b665f0f2057a1f901396bdc27c6072898eb422e0067d0ae93d48', 'cni.projectcalico.org/podIP': '10.42.8.71/32', 'cni.projectcalico.org/podIPs': '10.42.8.71/32', 'kubernetes.io/config.seen': '2021-12-09T08:27:02.472518348Z', 'kubernetes.io/config.source': 'api', 'prometheus.io/path': '/prometheus', 'prometheus.io/scrape': 'true'}
2021-12-09 08:27:09,291 - seldon_core.microservice:main:374 - INFO: Importing TfServingProxy
2021-12-09 08:27:09,322 - seldon_core.microservice:main:463 - INFO: REST gunicorn microservice running on port 9000
2021-12-09 08:27:09,323 - seldon_core.microservice:main:557 - INFO: REST metrics microservice running on port 6000
2021-12-09 08:27:09,323 - seldon_core.microservice:main:567 - INFO: Starting servers
2021-12-09 08:27:09,332 - seldon_core.microservice:grpc_prediction_server:520 - INFO: GRPC Server Binding to '%s' 0.0.0.0:9500 with 1 processes
2021-12-09 08:27:09,335 - seldon_core.microservice:rest_prediction_server:448 - INFO: Gunicorn Config: {'bind': '0.0.0.0:9000', 'accesslog': None, 'loglevel': 'info', 'timeout': 5000, 'threads': 1, 'workers': 1, 'max_requests': 0, 'max_requests_jitter': 0, 'post_worker_init': <function post_worker_init at 0x7f4f755a3320>, 'worker_exit': functools.partial(<function worker_exit at 0x7f4f75530f80>, seldon_metrics=<seldon_core.metrics.SeldonMetrics object at 0x7f4f752be590>), 'keepalive': 2}
2021-12-09 08:27:09,339 - seldon_core.wrapper:_set_flask_app_configs:224 - INFO: App Config: <Config {'ENV': 'production', 'DEBUG': False, 'TESTING': False, 'PROPAGATE_EXCEPTIONS': None, 'PRESERVE_CONTEXT_ON_EXCEPTION': None, 'SECRET_KEY': None, 'PERMANENT_SESSION_LIFETIME': datetime.timedelta(days=31), 'USE_X_SENDFILE': False, 'SERVER_NAME': None, 'APPLICATION_ROOT': '/', 'SESSION_COOKIE_NAME': 'session', 'SESSION_COOKIE_DOMAIN': None, 'SESSION_COOKIE_PATH': None, 'SESSION_COOKIE_HTTPONLY': True, 'SESSION_COOKIE_SECURE': False, 'SESSION_COOKIE_SAMESITE': None, 'SESSION_REFRESH_EACH_REQUEST': True, 'MAX_CONTENT_LENGTH': None, 'SEND_FILE_MAX_AGE_DEFAULT': datetime.timedelta(seconds=43200), 'TRAP_BAD_REQUEST_ERRORS': None, 'TRAP_HTTP_EXCEPTIONS': False, 'EXPLAIN_TEMPLATE_LOADING': False, 'PREFERRED_URL_SCHEME': 'http', 'JSON_AS_ASCII': True, 'JSON_SORT_KEYS': True, 'JSONIFY_PRETTYPRINT_REGULAR': False, 'JSONIFY_MIMETYPE': 'application/json', 'TEMPLATES_AUTO_RELOAD': None, 'MAX_COOKIE_SIZE': 4093}>
2021-12-09 08:27:09,339 - seldon_core.microservice:_run_grpc_server:475 - INFO: Starting new GRPC server with 1.
2021-12-09 08:27:09,342 - seldon_core.wrapper:_set_flask_app_configs:224 - INFO: App Config: <Config {'ENV': 'production', 'DEBUG': False, 'TESTING': False, 'PROPAGATE_EXCEPTIONS': None, 'PRESERVE_CONTEXT_ON_EXCEPTION': None, 'SECRET_KEY': None, 'PERMANENT_SESSION_LIFETIME': datetime.timedelta(days=31), 'USE_X_SENDFILE': False, 'SERVER_NAME': None, 'APPLICATION_ROOT': '/', 'SESSION_COOKIE_NAME': 'session', 'SESSION_COOKIE_DOMAIN': None, 'SESSION_COOKIE_PATH': None, 'SESSION_COOKIE_HTTPONLY': True, 'SESSION_COOKIE_SECURE': False, 'SESSION_COOKIE_SAMESITE': None, 'SESSION_REFRESH_EACH_REQUEST': True, 'MAX_CONTENT_LENGTH': None, 'SEND_FILE_MAX_AGE_DEFAULT': datetime.timedelta(seconds=43200), 'TRAP_BAD_REQUEST_ERRORS': None, 'TRAP_HTTP_EXCEPTIONS': False, 'EXPLAIN_TEMPLATE_LOADING': False, 'PREFERRED_URL_SCHEME': 'http', 'JSON_AS_ASCII': True, 'JSON_SORT_KEYS': True, 'JSONIFY_PRETTYPRINT_REGULAR': False, 'JSONIFY_MIMETYPE': 'application/json', 'TEMPLATES_AUTO_RELOAD': None, 'MAX_COOKIE_SIZE': 4093}>
[2021-12-09 08:27:09 +0000] [108] [INFO] Starting gunicorn 20.1.0
[2021-12-09 08:27:09 +0000] [108] [INFO] Listening at: http://0.0.0.0:6000 (108)
[2021-12-09 08:27:09 +0000] [108] [INFO] Using worker: sync
[2021-12-09 08:27:09 +0000] [127] [INFO] Booting worker with pid: 127
[2021-12-09 08:27:09 +0000] [7] [INFO] Starting gunicorn 20.1.0
[2021-12-09 08:27:09 +0000] [7] [INFO] Listening at: http://0.0.0.0:9000 (7)
[2021-12-09 08:27:09 +0000] [7] [INFO] Using worker: sync
[2021-12-09 08:27:09 +0000] [134] [INFO] Booting worker with pid: 134
2021-12-09 08:27:09,371 - seldon_core.gunicorn_utils:load:103 - INFO: Tracing not active
The prepackaged TensorFlow server with the Seldon protocol uses a proxy.
I think we would need to extend this to ensure the metadata is handled by the proxy. It may not presently have access to the downloaded artifacts.
Access to artifacts may be one thing, but we should also extend the TfServingProxy class to implement init_metadata, like we have here for example: https://github.com/SeldonIO/seldon-core/blob/master/servers/sklearnserver/sklearnserver/SKLearnServer.py#L53-L66
The SKLearnServer server does get model_uri as one of its parameters; the TfServingProxy does not.
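A rough sketch of what an init_metadata on the proxy might look like, modelled loosely on the SKLearnServer approach linked above. This is not the actual TfServingProxy code: the class name TfServingProxySketch is hypothetical, and it assumes the proxy would be given a model_uri pointing at the local directory with the downloaded artifacts, which it currently is not. It needs PyYAML.

```python
import logging
import os

import yaml  # PyYAML

logger = logging.getLogger(__name__)


class TfServingProxySketch:
    """Hypothetical stand-in for TfServingProxy; only the metadata part is shown."""

    def __init__(self, model_uri=None):
        # Assumption: model_uri is the local directory holding the downloaded
        # model artifacts. The real TfServingProxy does not receive this today.
        self.model_uri = model_uri

    def init_metadata(self):
        """Return the contents of metadata.yaml as a dict, or {} if unavailable."""
        if not self.model_uri:
            return {}
        file_path = os.path.join(self.model_uri, "metadata.yaml")
        try:
            with open(file_path, "r") as f:
                meta = yaml.safe_load(f)
        except FileNotFoundError:
            logger.debug("metadata file %s does not exist", file_path)
            return {}
        except yaml.YAMLError:
            logger.error("metadata file %s is not valid YAML", file_path)
            return {}
        # An empty or non-mapping file is treated the same as a missing one.
        return meta if isinstance(meta, dict) else {}
```

Falling back to an empty dict keeps the server usable when metadata.yaml is missing or malformed, so any default metadata can still apply.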
Closing