BentoML
bug: HTTP server does not work with PyTorch Lightning without 'del sys.modules["prometheus_client"]'
Describe the bug
bentoml serve {my_service}.py:svc --port 3001 starts, and stdout shows the log:
Prometheus metrics for HTTP BentoServer from "{my_service}.py:svc" can be accessed at http://localhost:3001/metrics
2023-04-25T10:49:47+0900 [INFO] [cli] Starting development HTTP BentoServer from "{my_service}.py:svc" listening on http://0.0.0.0:3001 (Press CTRL+C to quit)
However, when I access http://0.0.0.0:3001, the page loads forever without any warning or error message.
When I add 'del sys.modules["prometheus_client"]', the issue is resolved and I can access the HTTP server from the browser on localhost.
I guess it is caused when we import a third-party library that uses the Prometheus client (e.g. pytorch-lightning).
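A quick way to check this guess in a clean Python session (just a sketch; it only verifies whether the import pulls in prometheus_client as a side effect):

import sys

# A fresh interpreter should not have prometheus_client loaded yet.
print("before:", sys.modules.get("prometheus_client"))

import pytorch_lightning as pl  # noqa: F401

# If this prints a module object instead of None, pytorch_lightning (or one of
# its dependencies) imported prometheus_client as a side effect.
print("after:", sys.modules.get("prometheus_client"))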
To reproduce
I found that just adding a single line
import torch
import pytorch_lightning as pl
...
in your example of 'custom_runner/torch_hub_yolov5/service.py' triggers this bug.
(please run bentoml serve service.py:svc)
However, adding this single line resolves the issue:
import sys
import torch
import pytorch_lightning as pl
del sys.modules["prometheus_client"]
...
Expected behavior
No response
Environment
bentoml: 1.0.18
python: 3.8.9
pytorch-lightning: 2.0.0
That's strange. @aarnphm any ideas?
Any updates, or have you found any insights? I think it might be related to your custom Prometheus client object checking sys.modules separately for edge-case handling.
I couldn't find a reference to prometheus_client within pytorch_lightning.
Can you send your service.py definition here?
If any of the other modules used in your service.py imports the default prometheus_client, the metrics won't work, since the client will not have been set up in multiprocess mode. Read more about multiprocess mode: https://github.com/prometheus/client_python#multiprocess-mode-eg-gunicorn
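For context, here is a minimal sketch of how prometheus_client's multiprocess mode is typically configured (the directory path and metric name below are just examples, not BentoML's actual configuration). The important constraint is that the environment variable has to be set before prometheus_client is imported, because the client picks its value implementation at import time; that is why an early, indirect import of the default client breaks BentoML's setup.

import os

# The multiprocess directory must be set *before* prometheus_client is imported.
os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/tmp/prom_metrics")  # example path
os.makedirs(os.environ["PROMETHEUS_MULTIPROC_DIR"], exist_ok=True)

from prometheus_client import CollectorRegistry, Counter, multiprocess

# Each worker process writes metric values to files in that directory;
# a MultiProcessCollector aggregates them when /metrics is scraped.
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)

requests_total = Counter("example_requests_total", "Example counter")
requests_total.inc()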
@aarnphm Unfortunately, I cannot share my service for the moment, since it lives in a private repository of my organization (I need to take some steps before sharing).
Instead, I already shared how it can be reproduced with the examples from this repository (custom_runner/torch_hub_yolov5/service.py). Can you not reproduce it in your environment?
import sys
import torch
import pytorch_lightning as pl
del sys.modules["prometheus_client"]
...
I think this can come up frequently if we use LightningModule instead of nn.Module and save with bentoml.pytorch.save_model instead of bentoml.pytorch_lightning.save_model, i.e. without TorchScript. (Currently there are many cases where we cannot use native TorchScript with pretrained Hugging Face modules.)
No, it works perfectly fine for me.
Can you send the output of bentoml env -o bash? It would also be helpful if you could send the output of bentoml serve --debug.
I am having the same problem when I start a service using a pytorch runner like:
import bentoml
from bentoml import Service

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
When opening localhost:3000, the following error is shown:
File "/path/.venv/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
If I add the del sys.modules["prometheus_client"] statement, everything works well:
import sys

import bentoml
from bentoml import Service

del sys.modules["prometheus_client"]

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
Do you know why this is happening?
I cannot reproduce this with the given code, so I cannot help figure out where "prometheus_client" gets imported.
Due to the limitations of the "prometheus_client" library, if it has been imported directly or indirectly within the service.py file, BentoML cannot function properly. You may need to identify where it is being imported.
Using "del sys.modules["prometheus_client"]" might make it appear to work, but the exposed "/metrics" endpoint will no longer be trustworthy.
@agranadosb You may add print(sys.modules.get("prometheus_client")) at different positions to see where it is imported in the script, like the following:
import sys

print("before script:", sys.modules.get("prometheus_client"))  # 1

import bentoml
from bentoml import Service

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()
print("after runner:", sys.modules.get("prometheus_client"))  # 2

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
print("after service:", sys.modules.get("prometheus_client"))  # 3
Since I can't reproduce the issue either, positions 2 and 3 should be fine; I suspect it is already imported at position 1.
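If the prints show it is already imported before your own code runs, a temporary import hook (just a debugging sketch, not part of BentoML) placed at the very top of service.py can reveal the exact import chain:

import builtins
import sys
import traceback

_original_import = builtins.__import__

def _tracing_import(name, *args, **kwargs):
    # Print a stack trace the first time prometheus_client is imported,
    # so the module that triggers the import shows up in the output.
    if name == "prometheus_client" and "prometheus_client" not in sys.modules:
        print("--- prometheus_client is being imported here: ---")
        traceback.print_stack()
    return _original_import(name, *args, **kwargs)

builtins.__import__ = _tracing_import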
Hi @aarnphm. I recently tested this more on macOS 12.6.3 (Apple M1 Pro) with the most recent version (bentoml==1.0.23) and found the issue is still not resolved.
I added a simple import before bentoml in your example (import pytorch_lightning as pl), like this:
import sys
import torch
import pytorch_lightning as pl

import bentoml
from bentoml.io import Image
from bentoml.io import PandasDataFrame


class Yolov5Runnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self.model = torch.hub.load("ultralytics/yolov5", "yolov5s")

        if torch.cuda.is_available():
            self.model.cuda()
        else:
            self.model.cpu()

        # Config inference settings
        self.inference_size = 320

        # Optional configs
        # self.model.conf = 0.25  # NMS confidence threshold
        # self.model.iou = 0.45  # NMS IoU threshold
        # self.model.agnostic = False  # NMS class-agnostic
        # self.model.multi_label = False  # NMS multiple labels per box
        # self.model.classes = None  # (optional list) filter by class, i.e. = [0, 15, 16] for COCO persons, cats and dogs
        # self.model.max_det = 1000  # maximum number of detections per image
        # self.model.amp = False  # Automatic Mixed Precision (AMP) inference

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def inference(self, input_imgs):
        # Return predictions only
        results = self.model(input_imgs, size=self.inference_size)
        return results.pandas().xyxy

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def render(self, input_imgs):
        # Return images with boxes and labels
        return self.model(input_imgs, size=self.inference_size).render()


yolo_v5_runner = bentoml.Runner(Yolov5Runnable, max_batch_size=30)

svc = bentoml.Service("yolo_v5_demo", runners=[yolo_v5_runner])


@svc.api(input=Image(), output=PandasDataFrame())
async def invocation(input_img):
    batch_ret = await yolo_v5_runner.inference.async_run([input_img])
    return batch_ret[0]


@svc.api(input=Image(), output=Image())
async def render(input_img):
    batch_ret = await yolo_v5_runner.render.async_run([input_img])
    return batch_ret[0]
and it shows this error log:
2023-07-04T16:24:43+0900 [WARNING] [cli] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:43+0900 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
2023-07-04T16:24:43+0900 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-07-04T16:24:44+0900 [INFO] [cli] Starting production HTTP BentoServer from "service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-07-04T16:24:49+0900 [WARNING] [api_server:4] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:49+0900 [WARNING] [api_server:6] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [runner:yolov5runnable:1] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:10] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:5] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:1] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:2] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:3] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:9] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:8] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:7] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:55+0900 [ERROR] [api_server:6] Exception in ASGI application
Traceback (most recent call last):
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 86, in __call__
raise exc from None
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 82, in __call__
await self.app(scope, inner_receive, inner_send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/applications.py", line 118, in __call__
await self.middleware_stack(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 82, in __call__
self._setup()
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/simple_di/__init__.py", line 139, in _
return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 43, in _setup
self.metrics_request_duration = metrics_client.Histogram(
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 149, in Histogram
return partial(self.prometheus_client.Histogram, registry=self.registry)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
2023-07-04T16:24:55+0900 [ERROR] [api_server:8] Exception in ASGI application
Traceback (most recent call last):
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 86, in __call__
raise exc from None
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 82, in __call__
await self.app(scope, inner_receive, inner_send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/applications.py", line 118, in __call__
await self.middleware_stack(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 82, in __call__
self._setup()
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/simple_di/__init__.py", line 139, in _
return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 43, in _setup
self.metrics_request_duration = metrics_client.Histogram(
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 149, in Histogram
return partial(self.prometheus_client.Histogram, registry=self.registry)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
And the server shows an internal server error.
I'm not sure, since the assertion logic is in an internal file and might be related to other production concerns, but simply removing the assertion that checks for the prometheus import in BentoML's custom Prometheus client resolves this issue.
(File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client)
Hey, is there any chance that your model doesn't contain a signature?
Can you send the output of cat $(bentoml models get model:tag -o path)/model.yaml?
Hey!
Sorry for the delay, here is the result:
>>> cat $(bentoml models get 8-puzzle:1.0 -o path)/model.yaml
name: 8-puzzle
version: '1.0'
module: bentoml.pytorch
labels: {}
options:
  partial_kwargs: {}
metadata: {}
context:
  framework_name: torch
  framework_versions:
    torch: 1.13.1+cu116
  bentoml_version: 1.0.23
  python_version: 3.8.10
signatures:
  __call__:
    batchable: true
    batch_dim:
    - 0
    - 0
api_version: v1
creation_time: '2023-07-01T12:01:44.893496+00:00'
Wait, I removed the venv I was using and now everything seems to work properly.
Hey all, apologies for bumping this issue, but I recently encountered it myself and thought I would share my experience. We use wandb extensively, including as part of our BentoML deployment. It turns out that a newer version of the library (or one of its dependencies) was importing prometheus_client; reverting to wandb version 0.13.1 solved it for me.
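If you are unsure which of your dependencies drags in prometheus_client, one option (a sketch using only the standard library; it only catches packages that declare it as a requirement, not ones that import it silently) is to scan the installed distributions:

from importlib import metadata

# List installed packages that declare prometheus_client / prometheus-client
# in their requirements, to narrow down the indirect importer.
for dist in metadata.distributions():
    requires = dist.requires or []
    hits = [r for r in requires if "prometheus" in r.lower()]
    if hits:
        print(dist.metadata["Name"], "->", hits)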
If you have a custom runnable, please check there aren't any bugs there as well, FYI.
For the yolov5 issue, please check that you have all required dependencies. You can try importing the model in a Python session first:
import torch
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
Here is a solution. TL;DR: add the imports of pytorch_lightning and torch after bentoml:
import sys

import bentoml
from bentoml.io import Image
from bentoml.io import PandasDataFrame


class Yolov5Runnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        import torch  # Here instead!
        import pytorch_lightning as pl

        self.model = torch.hub.load("ultralytics/yolov5", "yolov5s")
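Presumably this works because the heavy third-party imports (and the prometheus_client they pull in) now happen only inside the runner process when the runnable is instantiated, so the API server workers that set up BentoML's multiprocess metrics client never see a pre-imported prometheus_client.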