Intermediate models metadata endpoints are not accessible in seldondeployments with separate pods

Open saeid93 opened this issue 3 years ago • 2 comments

Describe the bug

This is a follow-up to https://github.com/SeldonIO/seldon-core/issues/4092 and may be a bug. It seems that the metadata endpoints of intermediate models are not accessible in an inference graph when the nodes are deployed in separate pods. I have made three dummy nodes: 1. MODEL -> node-one, 2. MODEL -> node-two, and 3. COMBINER -> node-combiner, each with some metadata returned by its init_metadata function. I have deployed the graph in two formats:

  1. All containers in a single pod
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: combiner-single-pod
spec:
  name: combiner-single-pod
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: sdghafouri/templates:combiner
          name: node-combiner
          imagePullPolicy: Always
        - image: sdghafouri/templates:model1
          name: node-one
          imagePullPolicy: Always
        - image: sdghafouri/templates:model2
          name: node-two
          imagePullPolicy: Always
    graph:
      name: node-combiner
      type: COMBINER
      children:
      - name: node-one
        type: MODEL
        children: []   
      - name: node-two
        type: MODEL
        children: []   
    name: example
    labels:
      sidecar.istio.io/inject: "true"
    replicas: 1
  2. Containers in separate pods
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: combiner-separate-pods
spec:
  name: combiner-separate-pods
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: sdghafouri/templates:combiner
          name: node-combiner
          imagePullPolicy: Always
    - spec:
        containers:
        - image: sdghafouri/templates:model1
          name: node-one
          imagePullPolicy: Always
    - spec:
        containers: 
        - image: sdghafouri/templates:model2
          name: node-two
          imagePullPolicy: Always
    graph:
      name: node-combiner
      type: COMBINER
      children:
      - name: node-one
        type: MODEL
        children: []   
      - name: node-two
        type: MODEL
        children: []   
    name: example
    labels:
      sidecar.istio.io/inject: "true"
    replicas: 1
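
For context, a minimal sketch of what one of these dummy nodes might look like with the Seldon Core 1.x Python wrapper. This is not the code inside the sdghafouri/templates images, which is not shown in this issue; the class name, the versions value, and the input names and shapes are assumptions chosen for illustration. Only the "custom" author value and the tensor message type are taken from the curl output in this issue.

```python
# Hypothetical dummy node; the real image contents are not shown in the issue.
class NodeOne:
    def predict(self, X, features_names=None):
        # Dummy model: return the input unchanged.
        return X

    def init_metadata(self):
        # Metadata dictionary describing this node. The field values below
        # are placeholders, apart from "custom" which mirrors the curl output.
        return {
            "name": "node-one",
            "versions": ["v1"],  # assumed value
            "platform": "seldon",
            "inputs": [
                {
                    "messagetype": "tensor",
                    "schema": {"names": ["a", "b"], "shape": [2]},  # assumed
                }
            ],
            "outputs": [
                {"messagetype": "tensor", "schema": {"shape": [2]}}  # assumed
            ],
            "custom": {"author": "seldon-dev"},
        }
```

The dictionary returned by init_metadata is what the /api/v1.0/metadata/<node name> calls in this issue are expected to return for each node.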

The metadata endpoints of the combiner and of the model nodes in the first case are accessible at both node and graph level, as expected. For the combiner:

curl http://localhost:32000/seldon/saeid/combiner-single-pod/api/v1.0/metadata/node-combiner | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   506  100   506    0     0  16322      0 --:--:-- --:--:-- --:--:-- 16866
{
  "custom": {
    "author": "seldon-dev"
  },
  "inputs": [
    {
      "messagetype": "tensor",
      "schema": {
        "names": [
...

For the models:

curl http://localhost:32000/seldon/saeid/combiner-single-pod/api/v1.0/metadata/node-one | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   506  100   506    0     0  25300      0 --:--:-- --:--:-- --:--:-- 25300
{
  "custom": {
    "author": "seldon-dev"
  },
  "inputs": [
    {
      "messagetype": "tensor",
...

However, for the separate-pod case only the combiner node metadata is available:

curl http://localhost:32000/seldon/saeid/combiner-separate-pods/api/v1.0/metadata/node-combiner | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   506  100   506    0     0  24095      0 --:--:-- --:--:-- --:--:-- 25300
{
  "custom": {
    "author": "seldon-dev"
  },
  "inputs": [
    {
      "messagetype": "tensor",
...

But the model metadata endpoints are not accessible (the same happens with node-two):

curl http://localhost:32000/seldon/saeid/combiner-separate-pods/api/v1.0/metadata/node-one | jq     
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    91  100    91    0     0      3      0  0:00:30  0:00:30 --:--:--    23
parse error: Invalid numeric literal at line 1, column 9
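
The jq parse error only says that the 91-byte response body is not valid JSON, so the actual message is hidden. A minimal sketch for inspecting the raw response per node, using only the Python standard library, could look like the following. The base URL is copied from the curl calls above; the function names are hypothetical helpers, not part of Seldon Core.

```python
import json
import urllib.error
import urllib.request

# Ingress address and namespace, taken from the curl calls in this issue.
BASE = "http://localhost:32000/seldon/saeid"


def metadata_url(deployment, node, base=BASE):
    """Build the per-node metadata URL used in the curl calls."""
    return f"{base}/{deployment}/api/v1.0/metadata/{node}"


def describe_body(status, body):
    """Summarise a response: JSON keys if it parses, raw text otherwise."""
    try:
        parsed = json.loads(body)
    except ValueError:
        return f"HTTP {status}, non-JSON body: {body!r}"
    if isinstance(parsed, dict):
        return f"HTTP {status}, JSON keys: {sorted(parsed)}"
    return f"HTTP {status}, JSON value: {parsed!r}"


def probe(deployment, node, timeout=60):
    """Fetch one node's metadata and return a one-line summary."""
    url = metadata_url(deployment, node)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read().decode("utf-8", "replace")
            return describe_body(resp.status, text)
    except urllib.error.HTTPError as err:
        # Error responses still carry a body worth reading.
        text = err.read().decode("utf-8", "replace")
        return describe_body(err.code, text)
    except OSError as err:
        # Covers connection failures and timeouts.
        return f"request failed: {err}"
```

Calling probe("combiner-separate-pods", name) for each of node-combiner, node-one and node-two and printing the result should show the HTTP status and the raw error text for the failing nodes, which curl piped into jq does not.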

I also tried both cases with the seldon.io/engine-separate-pod: "true" annotation, which runs the orchestrator in a separate pod, but the results were the same.

To reproduce

  1. The images are public, so deploying the YAML files above will reproduce the graphs.

Expected behaviour

Access to the metadata endpoints should be the same in both separate-pod and single-pod deployments.

Environment

  • Cloud Provider: MicroK8s
  • Kubernetes Cluster Version: v1.22.9
  • Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.13.1
image: docker.io/seldonio/seldon-core-operator:1.13.1

saeid93 • May 23 '22 13:05

Not sure whether this is related to correctly exposing the Services (SVCs), as started in https://github.com/SeldonIO/seldon-core/pull/4043

ukclivecox • Jun 24 '22 06:06

This issue is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] • Jul 30 '22 02:07

In v2 you can call metadata directly on each model.

ukclivecox • Dec 19 '22 11:12
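
As a rough illustration of the last comment, and assuming "v2" refers to Seldon Core v2 serving models over the Open Inference Protocol (V2), per-model metadata is requested at /v2/models/<model name>, optionally followed by /versions/<version>. The sketch below only builds that URL; the host value and the function name are placeholders, and how the request is routed to the model depends on the installation.

```python
from urllib.parse import quote


def v2_model_metadata_url(host, model_name, version=None):
    """Build an Open Inference Protocol (V2) model metadata URL."""
    url = f"http://{host}/v2/models/{quote(model_name, safe='')}"
    if version is not None:
        # The version segment is optional in the protocol.
        url += f"/versions/{quote(str(version), safe='')}"
    return url
```

For example, v2_model_metadata_url("localhost:9000", "node-one") gives http://localhost:9000/v2/models/node-one, where localhost:9000 is a made-up address.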