seldon-core
Service Orchestration not working with multiple HuggingFace models
Describe the bug
I created a SeldonDeployment with the following configuration:

```yaml
---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: test-deployment
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      modelUri: s3://xxxxxxxxxx
      implementation: HUGGINGFACE_SERVER
      envSecretRefName: seldon-init-container-secret
      logger:
        mode: all
    name: default
    replicas: 1
```
The S3 bucket contains 2 models, and both are correctly loaded by the main container. When I try to reach one of the models, I get this error:

```shell
curl -s http://localhost:60253/seldon/default/francesco/v2/models/distilgpt2
{"status":{"code":500,"info":"Failed to find model distilgpt2","status":"FAILURE"}}
```
To reproduce
- Deploy a SeldonDeployment like the one above, with `modelUri` pointing at a location that contains multiple model definitions
- Query one of the models through the v2 endpoint
Expected behavior
The model should be reachable through the service orchestrator. If I add the annotation `seldon.io/no-engine: "true"` to the deployment, bypassing the orchestrator, it works:

```shell
curl -s http://localhost:60253/seldon/default/francesco/v2/models/distilgpt2
{"name":"distilgpt2","versions":[],"platform":"","inputs":[],"outputs":[],"parameters":{"content_type":null,"headers":null}}
```
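For completeness, this is where the workaround annotation goes in the manifest; a minimal sketch reusing the fields from the deployment above (only the `annotations` block is new):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: test-deployment
  annotations:
    # Workaround: skip the service orchestrator (engine) so requests
    # go straight to the HuggingFace server container.
    seldon.io/no-engine: "true"
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      modelUri: s3://xxxxxxxxxx
      implementation: HUGGINGFACE_SERVER
      envSecretRefName: seldon-init-container-secret
    name: default
    replicas: 1
```

With the annotation applied, both models in the bucket respond on their `/v2/models/<name>` endpoints, which suggests the failure is in the orchestrator's model-name routing rather than in model loading.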