
Repeated ERROR log 404 Not Found: The requested URL was not found on the server

Open ramanNarasimhan77 opened this issue 3 years ago • 11 comments

We recently upgraded to Seldon-core 1.12.0 and we observe that the model serving pod has this error logged repeatedly:

2022-03-08 05:11:35,696 - seldon_core.gunicorn_utils:load:103 - INFO: Tracing not active
2022-03-08 05:11:36,940 - seldon_core.wrapper:handle_generic_exception:53 - ERROR: {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}
2022-03-08 05:11:37,938 - seldon_core.wrapper:handle_generic_exception:53 - ERROR: {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}
2022-03-08 05:11:38,937 - seldon_core.wrapper:handle_generic_exception:53 - ERROR: {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}

We are not sure which URL it is complaining about, and we were able to reproduce this even with the sklearn_iris example.

Please provide us with some guidance to fix this problem.

ramanNarasimhan77 avatar Mar 08 '22 05:03 ramanNarasimhan77

Seldon Deployment used :

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: test-service
spec:
  predictors:
    - name: main
      annotations:
        predictor_version: v1
      replicas: 1
      graph:
        name: model
        type: MODEL
        implementation: UNKNOWN_IMPLEMENTATION
        endpoint:
          type: REST
        children: []
      componentSpecs:
        - spec:
            containers:
              - image: <docker-registry>/sklearn-iris:1.0.0
                name: model
                imagePullPolicy: Always

The model was wrapped into docker image using steps in python_wrapping_docker

ramanNarasimhan77 avatar Mar 08 '22 05:03 ramanNarasimhan77

@ramanNarasimhan77 in order to use the latest Seldon, you also need to update the base wrapper (I would suggest 1.13.1). This means you will have to rebuild your models with 1.13.1. If you want to re-run this test, use the more recent sklearn-iris image (i.e. 1.12.0 / 1.13.1). If you can try that again and report your findings, we'll be able to tell whether it is indeed an issue. Just to confirm: we run a set of integration tests as part of every release that exercises all example models like this one to validate that they work correctly.

axsaucedo avatar Mar 08 '22 06:03 axsaucedo

@axsaucedo I am building the image myself from the source code in examples/models/sklearn_iris, using Docker for packaging. My Dockerfile already has seldon-core==1.12.0. The tag 1.0.0 was created by me for testing, and I have pushed the image to a private registry accessible from my Kubernetes cluster.

Also, regarding this comment: "If you want to re-run this test you will have to use it with the more recent sklearn-iris image (aka 1.12.0 / 1.13.1)."

I am only able to see 2 versions of sklearn-iris in the seldonio Docker Hub repository. The most recent one was published on 2020-11-01 and appears to have been packaged with the s2i builder image seldonio/seldon-core-s2i-python3:1.5.0-dev. Is there a newer version you would like me to use?

❯ skopeo list-tags docker://seldonio/sklearn-iris
{
    "Repository": "docker.io/seldonio/sklearn-iris",
    "Tags": [
        "0.1",
        "0.2"
    ]
}

❯ docker pull seldonio/sklearn-iris:0.2

❯ docker inspect seldonio/sklearn-iris:0.2 | grep Created
        "Created": "2020-11-01T15:31:04.984220054Z",
        
❯ docker inspect seldonio/sklearn-iris:0.2 | grep seldon
            "seldonio/sklearn-iris:0.2"
            "seldonio/sklearn-iris@sha256:73b45e38449a363d50ff0a19e0072b4a8e80f78177d64dec528aad87b6b714d5"
            "Image": "seldonio/seldon-core-s2i-python3:1.5.0-dev",
                "io.k8s.display-name": "seldonio/sklearn-iris:0.2",
                "io.openshift.s2i.build.image": "seldonio/seldon-core-s2i-python3:1.5.0-dev",

ramanNarasimhan77 avatar Mar 08 '22 07:03 ramanNarasimhan77

@axsaucedo I also wish to clarify that model serving works fine: I am able to send requests to the serving endpoint and get results, so there is no impact on serving functionality.

However, the ERROR message is being logged continuously. I think it originates from this error handler, but from the logs I am not able to tell which URL is being requested and returning a 404.
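As an illustration of why the URL never appears (this is a hand-rolled sketch, not the actual seldon_core wrapper code): a generic Flask 404 handler only receives the exception object, so unless it explicitly logs `request.path`, the offending URL is lost. A handler like the one below would surface it:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
seen_paths = []  # illustrative only: record which paths triggered a 404


@app.errorhandler(404)
def handle_not_found(error):
    # Unlike a generic handler that only serializes the exception,
    # include request.path so the log tells you which URL was requested.
    seen_paths.append(request.path)
    payload = {
        "status": {
            "status": 1,
            "info": f"{error} (path: {request.path})",
            "code": -1,
            "reason": "MICROSERVICE_INTERNAL_ERROR",
        }
    }
    return jsonify(payload), 404


# Exercise the handler without starting a real server
client = app.test_client()
resp = client.get("/no-such-route")
print(resp.get_json()["status"]["info"])  # the offending path now appears
```

The same idea applies when diagnosing this issue: whichever component is polling the wrapper, logging the request path in the handler (or enabling request logging, as a later comment does with FLASK_DEBUG) identifies the caller.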

ramanNarasimhan77 avatar Mar 08 '22 07:03 ramanNarasimhan77

@ramanNarasimhan77 yes, you would need to package it with the latest version, as that one actually reduces the logging by default. Those images are quite old, so you will have to use the latest image to wrap the container. Can you try it with the top-level image at version 1.13.1 or 1.12.0?
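For anyone following along, the rebuild amounts to bumping the pinned wrapper version in the Dockerfile. A minimal sketch, assuming the layout from the python_wrapping_docker guide (the model module name and dependency list here are placeholders; check that guide for your version):

```dockerfile
# Hypothetical sketch following the python_wrapping_docker layout;
# IrisClassifier and the dependency list are illustrative.
FROM python:3.8-slim
WORKDIR /app

# Bump the wrapper from 1.12.0 to the suggested 1.13.1
RUN pip install seldon-core==1.13.1 scikit-learn

COPY . .

ENV MODEL_NAME IrisClassifier
ENV SERVICE_TYPE MODEL

CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE
```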

axsaucedo avatar Mar 17 '22 10:03 axsaucedo

Hi. I am running into the same error: 2022-05-16 17:12:17,102 - seldon_core.wrapper:handle_generic_exception:53 - ERROR: {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}

I tried using the latest mlflow images (1.13.1 and 1.14.0-dev). The error only occurs when I use MLFLOW_SERVER; it works fine with SKLEARN_SERVER.

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: {model_name}
  namespace: {namespace}
  # annotations:
  #   sidecar.istio.io/inject: "true"  
spec:
  # protocol: kfserving  # Activate v2 protocol
  name: {model_name}
  predictors:
  - componentSpecs:
    - spec:
        # We are setting high failureThreshold as installing conda dependencies
        # can take long time and we want to avoid k8s killing the container prematurely
        containers:
        - name: classifier
          image: seldonio/mlflowserver:1.14.0-dev
          livenessProbe:
            initialDelaySeconds: 120
            failureThreshold: 200
            periodSeconds: 30
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
          readinessProbe:
            initialDelaySeconds: 120
            failureThreshold: 200
            periodSeconds: 30
            successThreshold: 1
            httpGet:
              path: /health/pingl
              port: http
              scheme: HTTP
    graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: pvc://main{model_path}
      name: classifier
    name: default
    replicas: {replicas}
    svcOrchSpec: 
      env: 
        - name: SELDON_LOG_LEVEL 
          value: DEBUG

Any pointers on how to resolve this?

prashanthharshangi avatar May 16 '22 17:05 prashanthharshangi


Nevermind. My httpGet path was incorrect.
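For anyone hitting the same thing: the typo is in the readinessProbe of the manifest above, where /health/pingl should match the livenessProbe path:

```yaml
readinessProbe:
  httpGet:
    path: /health/ping   # was /health/pingl, which 404s on every probe
    port: http
    scheme: HTTP
```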

prashanthharshangi avatar May 17 '22 00:05 prashanthharshangi

I upgraded from seldon-1.6.0 to 1.14.0 and started seeing errors like this too:

ERROR:seldon_core.wrapper:{'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}

There is no other code change, and I'm only using gRPC, so I'm not sure what this error is about...

zyxue avatar Jun 29 '22 20:06 zyxue

I confirmed it comes from port 9000 in this container spec:

                "ports": [
                    {
                        "containerPort": 6000,
                        "name": "metrics",
                        "protocol": "TCP"
                    },
                    {
                        "containerPort": 9000,
                        "name": "http",
                        "protocol": "TCP"
                    },
                    {
                        "containerPort": 9500,
                        "name": "grpc",
                        "protocol": "TCP"
                    }
                ],

as seen in the log:

INFO:seldon_core.microservice:REST gunicorn microservice running on port 9000
INFO:seldon_core.microservice:REST metrics microservice running on port 6000

But I'm only receiving gRPC requests (9500) and NOT sending any REST requests. Is seldon-executor sending REST requests internally somehow?

zyxue avatar Jun 29 '22 20:06 zyxue

I suspect the requests may be going to a /ready or /live kind of endpoint: https://github.com/SeldonIO/seldon-core/blob/v1.14.0/executor/api/rest/server.go#L142-L143.

How can I disable REST completely, and better still, not expose port 9000 at all? Perhaps by setting --workers=0, similar to https://docs.seldon.io/projects/seldon-core/en/latest/python/python_server.html#running-only-rest-server-by-disabling-grpc-server?
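If the wrapper surfaces --workers as an environment variable the way the linked doc describes for the gRPC side, the idea might be wired into the deployment like this. This is a hypothetical sketch: GUNICORN_WORKERS as the env name for --workers is an assumption, so verify it against the python server docs for your seldon-core version before relying on it.

```yaml
# Hypothetical sketch: drop the REST worker count to 0, mirroring the
# GRPC_WORKERS=0 trick in the linked docs. GUNICORN_WORKERS is an
# assumed name; check how your seldon-core version exposes --workers.
componentSpecs:
  - spec:
      containers:
        - name: model
          env:
            - name: GUNICORN_WORKERS
              value: "0"
```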

zyxue avatar Jun 30 '22 16:06 zyxue

Is there any update here? I tried the Xgboost and Sklearn servers; both have the same issue.

mingchen-ai-code avatar Aug 19 '22 05:08 mingchen-ai-code

Any updates on this issue?

schn-tgai-spock avatar Sep 30 '22 11:09 schn-tgai-spock

Can you try with the latest 1.15 or with v2?

ukclivecox avatar Dec 19 '22 11:12 ukclivecox

You can try adding the FLASK_DEBUG flag to your deployment, so you can see the route that is causing the trouble.

In my case the deployment looks something like this:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-name
  namespace: namespace-name
spec:
  name: default
  predictors:
    - componentSpecs:
        - spec:
            containers:
              - env:
                  - name: SELDON_LOG_LEVEL
                    value: DEBUG
                  - name: SELDON_DEBUG
                    value: 'True'
                  - name: FLASK_DEBUG
                    value: 'True'
                image: docker_image
                name: model-name
      graph:
        envSecretRefName: k8s-secret
        implementation: CUSTOM_WRAPPER
        logger:
          mode: all
        modelUri: 's3:bucker/path/'
        name: model-name
        type: MODEL
      name: default
      replicas: 1

After I deployed my service, I was able to see that the cause of the errors was the metrics API:

2022-12-22 05:37:52,634 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:37:52] "GET /prometheus HTTP/1.1" 200 -
2022-12-22 05:37:59,953 - seldon_core.wrapper:Metrics:185 - DEBUG:  REST Metrics Request
2022-12-22 05:37:59,954 - seldon_core.metrics:collect:151 - DEBUG:  SeldonMetrics.collect called
2022-12-22 05:37:59,955 - seldon_core.metrics:collect:154 - DEBUG:  Read current metrics data from shared memory
2022-12-22 05:37:59,956 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:37:59] "GET /prometheus HTTP/1.1" 200 -
2022-12-22 05:38:07,580 - seldon_core.wrapper:handle_generic_exception:53 - ERROR:  {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}
2022-12-22 05:38:07,580 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:38:07] "GET /prometheus HTTP/1.1" 500 -
2022-12-22 05:38:14,953 - seldon_core.wrapper:handle_generic_exception:53 - ERROR:  {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}
2022-12-22 05:38:14,954 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:38:14] "GET /prometheus HTTP/1.1" 500 -
2022-12-22 05:38:22,587 - seldon_core.wrapper:Metrics:185 - DEBUG:  REST Metrics Request
2022-12-22 05:38:22,588 - seldon_core.metrics:collect:151 - DEBUG:  SeldonMetrics.collect called
2022-12-22 05:38:22,589 - seldon_core.metrics:collect:154 - DEBUG:  Read current metrics data from shared memory
2022-12-22 05:38:22,590 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:38:22] "GET /prometheus HTTP/1.1" 200 -
2022-12-22 05:38:29,953 - seldon_core.wrapper:Metrics:185 - DEBUG:  REST Metrics Request
2022-12-22 05:38:29,953 - seldon_core.metrics:collect:151 - DEBUG:  SeldonMetrics.collect called
2022-12-22 05:38:29,955 - seldon_core.metrics:collect:154 - DEBUG:  Read current metrics data from shared memory
2022-12-22 05:38:29,956 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:38:29] "GET /prometheus HTTP/1.1" 200 -
2022-12-22 05:38:37,632 - seldon_core.wrapper:handle_generic_exception:53 - ERROR:  {'status': {'status': 1, 'info': '404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.', 'code': -1, 'reason': 'MICROSERVICE_INTERNAL_ERROR'}}
2022-12-22 05:38:37,633 - werkzeug:_log:224 - INFO:  127.0.0.1 - - [22/Dec/2022 05:38:37] "GET /prometheus HTTP/1.1" 500 -

I'm using the python:3.9-slim-buster Docker image and the seldon-core==1.15.0 Python package.

Hope this information is helpful.

bardyshev avatar Dec 22 '22 06:12 bardyshev

Closing this. Please update if still an issue with latest release.

ukclivecox avatar Mar 04 '23 10:03 ukclivecox

Still an issue with the latest release for me.

espoirMur avatar Mar 05 '23 19:03 espoirMur