Newer pandas version will cause issue with pickle and seldon model loading

Open rivamarco opened this issue 3 years ago • 1 comments

Describe the bug

Deploying an MLflow model that loads a pickled pandas DataFrame in its load_context function fails when the pickle was created with pandas==1.4.4. Seldon returns this error:

seldon_core.wrapper:handle_generic_exception:53 - ERROR: {'status': {'status': 1, 'info': 'Model not loaded yet', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}

We obtained a more precise log by temporarily moving the pickle loading into the predict function:

Can't get attribute '_unpickle_block' on <module 'pandas._libs.internals' from '/opt/conda/envs/mlflow/lib/python3.9/site-packages/pandas/_libs/internals.cpython-39-x86_64-linux-gnu.so'
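
An AttributeError of this shape is what pickle raises when the environment that reads a pickle lacks a helper that the environment that wrote it had: functions are stored by module path and name, not by value, and are looked up again on load. Here is a minimal, self-contained sketch of that mechanism. The module fake_lib and the helper _rebuild are hypothetical stand-ins (this is not pandas code), and the "older version" is simulated by deleting the helper before unpickling.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module (NOT pandas itself).
fake_lib = types.ModuleType("fake_lib")


def _rebuild(values):
    """Helper that the 'newer' library version uses to rebuild objects."""
    return list(values)


# Make pickle see the helper as fake_lib._rebuild.
_rebuild.__module__ = "fake_lib"
_rebuild.__qualname__ = "_rebuild"
fake_lib._rebuild = _rebuild
sys.modules["fake_lib"] = fake_lib


class Block:
    def __init__(self, values):
        self.values = values

    def __reduce__(self):
        # The pickle stores only a reference to fake_lib._rebuild
        # plus the arguments to call it with.
        return (fake_lib._rebuild, (self.values,))


def load_or_describe_error(data):
    """Unpickle data, or return the AttributeError text if lookup fails."""
    try:
        return pickle.loads(data)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# "Newer" environment: the helper exists, so the round trip works.
data = pickle.dumps(Block([1, 2, 3]))
loaded_with_helper = load_or_describe_error(data)

# Simulated "older" environment: the helper is missing on load.
del fake_lib._rebuild
loaded_without_helper = load_or_describe_error(data)

print(loaded_with_helper)
print(loaded_without_helper)
```

The second load fails even though the pickle bytes are unchanged, which matches the behaviour reported here if the runtime ships a pandas version that does not define _unpickle_block.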

The model container also logs: seldon_core.gunicorn_utils:load:109 - DEBUG: No load method in user model.

The same model works correctly when served with plain MLflow. It also works in both Seldon and plain MLflow when the pickle is created and loaded with pandas==1.2.3.

To reproduce

  • Create a virtual environment with mlflow==1.28.0 and pandas==1.4.4. We used Python 3.9.9, but other versions are also affected.
  • Create the dummy pickle by running save_pickle.py from the attached zip file.
  • Export the MLflow model by running export_model.py from the attached zip file.
  • This creates a folder, not_working_pickle_pandas, containing the model files.
  • Deploy the model with Seldon and send a prediction request (the request body does not matter).
  • The expected response is {"value":1200}, but the error above is returned instead.
  • If you serve the model with mlflow models serve -m working_pickle_pandas and send it a request with cURL, it works.

To make it work

Following the steps above with pandas==1.2.3 instead makes everything work correctly.

Environment

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"darwin/amd64"}

Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.7-gke.1400", GitCommit:"3cdaae9a00a0ebd5b6fe15279d5da23ced7d85ba", GitTreeState:"clean", BuildDate:"2022-06-14T09:26:54Z", GoVersion:"go1.17.10b7", Compiler:"gc", Platform:"linux/amd64"}

Seldon Images:

{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"seldon","app.kubernetes.io/instance":"seldon-core","app.kubernetes.io/name":"seldon-core-operator","app.kubernetes.io/version":"1.14.0","control-plane":"seldon-controller-manager"},"name":"seldon-controller-manager","namespace":"seldon-system"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"seldon","app.kubernetes.io/instance":"seldon1","app.kubernetes.io/name":"seldon","app.kubernetes.io/version":"v0.5","control-plane":"seldon-controller-manager"}},"template":{"metadata":{"annotations":{"prometheus.io/port":"8080","prometheus.io/scrape":"true","sidecar.istio.io/inject":"false"},"labels":{"app":"seldon","app.kubernetes.io/instance":"seldon1","app.kubernetes.io/name":"seldon","app.kubernetes.io/version":"v0.5","control-plane":"seldon-controller-manager"}},"spec":{"containers":[{"args":["--enable-leader-election","--webhook-port=4443","--create-resources=$(MANAGER_CREATE_RESOURCES)","--log-level=$(MANAGER_LOG_LEVEL)","--leader-election-id=$(MANAGER_LEADER_ELECTION_ID)",""],"command":["/manager"],"env":[{"name":"MANAGER_LEADER_ELECTION_ID","value":"a33bd623.machinelearning.seldon.io"},{"name":"MANAGER_LOG_LEVEL","value":"INFO"},{"name":"WATCH_NAMESPACE","value":""},{"name":"RELATED_IMAGE_EXECUTOR","value":""},{"name":"RELATED_IMAGE_STORAGE_INITIALIZER","value":""},{"name":"RELATED_IMAGE_SKLEARNSERVER","value":""},{"name":"RELATED_IMAGE_XGBOOSTSERVER","value":""},{"name":"RELATED_IMAGE_MLFLOWSERVER","value":""},{"name":"RELATED_IMAGE_TFPROXY","value":""},{"name":"RELATED_IMAGE_TENSORFLOW","value":""},{"name":"RELATED_IMAGE_EXPLAINER","value":""},{"name":"RELATED_IMAGE_MOCK_CLASSIFIER","value":""},{"name":"MANAGER_CREATE_RESOURCES","value":"true"},{"name":"POD_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}},{"name":"CONTROLLER_ID","value":""},{"name":"AMBASSADOR_ENABLED","value":"true"},{"name":"AMBASSADOR_SINGLE_NAMESPACE","value":"false"},{"name":"PREDICTIVE_UNIT_HTTP_SERVICE_PORT","value":"9000"},{"name":"PREDICTIVE_UNIT_GRPC_SERVICE_PORT","value":"9500"},{"name":"PREDICTIVE_UNIT_DEFAULT_ENV_SECRET_REF_NAME","value":""},{"name":"PREDICTIVE_UNIT_METRICS_PORT_NAME","value":"metrics"},{"name":"ISTIO_ENABLED","value":"false"},{"name":"KEDA_ENABLED","value":"false"},{"name":"ISTIO_GATEWAY","value":"istio-system/seldon-gateway"},{"name":"ISTIO_TLS_MODE","value":""},{"name":"USE_EXECUTOR","value":"true"},{"name":"EXECUTOR_CONTAINER_IMAGE_AND_VERSION","value":"docker.io/seldonio/seldon-core-executor:1.14.0"},{"name":"EXECUTOR_CONTAINER_IMAGE_PULL_POLICY","value":"IfNotPresent"},{"name":"EXECUTOR_PROMETHEUS_PATH","value":"/prometheus"},{"name":"EXECUTOR_SERVER_PORT","value":"8000"},{"name":"EXECUTOR_CONTAINER_USER","value":"8888"},{"name":"EXECUTOR_CONTAINER_SERVICE_ACCOUNT_NAME","value":"default"},{"name":"EXECUTOR_SERVER_METRICS_PORT_NAME","value":"metrics"},{"name":"EXECUTOR_REQUEST_LOGGER_DEFAULT_ENDPOINT","value":"http://default-broker"},{"name":"EXECUTOR_REQUEST_LOGGER_WORK_QUEUE_SIZE","value":"10000"},{"name":"EXECUTOR_REQUEST_LOGGER_WRITE_TIMEOUT_MS","value":"2000"},{"name":"DEFAULT_USER_ID","value":"8888"},{"name":"EXECUTOR_DEFAULT_CPU_REQUEST","value":"500m"},{"name":"EXECUTOR_DEFAULT_MEMORY_REQUEST","value":"512Mi"},{"name":"EXECUTOR_DEFAULT_CPU_LIMIT","value":"500m"},{"name":"EXECUTOR_DEFAULT_MEMORY_LIMIT","value":"512Mi"},{"name":"DEPLOYMENT_NAME_AS_PREFIX","value":"false"},{"name":"EXECUTOR_FULL_HEALTH_CHECKS","value":"false"}],"image":"docker.io/seldonio/seldon-core-operator:1.14.0","imagePullPolicy":"IfNotPresent","name":"manager","ports":[{"containerPort":4443,"name":"webhook-server","protocol":"TCP"},{"containerPort":8080,"name":"metrics","protocol":"TCP"}],"resources":{"limits":{"cpu":"500m","memory":"300Mi"},"requests":{"cpu":"100m","memory":"200Mi"}}}],"hostNetwork":false,"priorityClassName":"","securityContext":{"runAsUser":8888},"serviceAccountName":"seldon-manager","terminationGracePeriodSeconds":10}}}}
    value: docker.io/seldonio/seldon-core-executor:1.14.0
  image: docker.io/seldonio/seldon-core-operator:1.14.0

files.zip

rivamarco, Sep 01 '22 10:09

Probably related to this: https://github.com/SeldonIO/seldon-core/issues/4332

rivamarco, Sep 15 '22 07:09

This is correct, and it applies to all prepackaged servers that use Python. Unfortunately, this is a limitation of the way pickle works: you have to ensure full parity between the local environment that creates the artifact and the runtime environment. The preferred option is to use the same version of pandas in your local environment as in the runtime. The alternative is to build your own custom version of the MLflow image, with your desired dependencies, and use that one instead: https://docs.seldon.io/projects/seldon-core/en/latest/servers/custom.html

axsaucedo, Oct 19 '22 06:10
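
One way to surface the environment mismatch discussed in this thread earlier, instead of an opaque "Model not loaded yet", is to record the library versions alongside the pickle and compare them at load time. The sketch below is only an illustration under assumptions: the function names dump_with_versions and load_with_version_check and the envelope layout are hypothetical, not part of Seldon, MLflow, or pandas. The payload is pickled separately inside the envelope so that the version metadata can still be read when the payload itself would fail to load.

```python
import pickle
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def dump_with_versions(obj, path, packages=("pandas",)):
    """Write obj to path inside an envelope that records package versions."""
    envelope = {
        "versions": installed_versions(packages),
        # Nested pickle: the envelope stays readable even if this payload
        # cannot be unpickled in the target environment.
        "payload": pickle.dumps(obj),
    }
    with open(path, "wb") as f:
        pickle.dump(envelope, f)


def load_with_version_check(path):
    """Load the envelope and refuse to unpickle on a version mismatch."""
    with open(path, "rb") as f:
        envelope = pickle.load(f)
    recorded = envelope["versions"]
    current = installed_versions(recorded)
    mismatches = [
        f"{name}: created with {recorded[name]}, running {current[name]}"
        for name in recorded
        if recorded[name] != current[name]
    ]
    if mismatches:
        raise RuntimeError("Pickle/runtime mismatch: " + "; ".join(mismatches))
    return pickle.loads(envelope["payload"])
```

Under these assumptions, the model's load_context would call load_with_version_check instead of unpickling directly, so a mismatch produces an error that names the two pandas versions. This only reports the problem; the fix is still to align the versions or build a custom image as described above.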