BentoML
bug: HTTP server does not work with PyTorch Lightning without 'del sys.modules["prometheus_client"]'
Describe the bug
bentoml serve {my_service}.py:svc --port 3001 starts, and stdout shows the log:
Prometheus metrics for HTTP BentoServer from "{my_service}.py:svc" can be accessed at http://localhost:3001/metrics
2023-04-25T10:49:47+0900 [INFO] [cli] Starting development HTTP BentoServer from "{my_service}.py:svc" listening on http://0.0.0.0:3001 (Press CTRL+C to quit)
However, when I access http://0.0.0.0:3001, the page loads forever without any warning or error message.
When I add 'del sys.modules["prometheus_client"]', the issue is resolved and I can access the HTTP server from the browser on localhost.
I guess it is caused when we import a third-party library that uses the Prometheus client (e.g. pytorch-lightning).
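A quick way to check this guess in a clean Python session (just a sketch; it only verifies whether the import pulls in prometheus_client as a side effect):

import sys

# A fresh interpreter should not have prometheus_client loaded yet.
print("before:", sys.modules.get("prometheus_client"))

import pytorch_lightning as pl  # noqa: F401

# If this prints a module object instead of None, pytorch_lightning (or one of
# its dependencies) imported prometheus_client as a side effect.
print("after:", sys.modules.get("prometheus_client"))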
To reproduce
I found that just adding a single line
import torch
import pytorch_lightning as pl
...
in your example of 'custom_runner/torch_hub_yolov5/service.py' triggers this bug.
(please run bentoml serve service.py:svc)
However, adding this single line resolves the issue:
import sys
import torch
import pytorch_lightning as pl
del sys.modules["prometheus_client"]
...
Expected behavior
No response
Environment
bentoml: 1.0.18
python: 3.8.9
pytorch-lightning: 2.0.0
That's strange. @aarnphm any ideas?
Any updates, or have you found any insights? I think it might be related to your custom Prometheus client object checking sys.modules separately for edge-case handling.
I couldn't find a reference to prometheus_client within pytorch_lightning.
Can you send your service.py definition here?
If any of the other modules used in your service.py imports the default prometheus_client, the metrics won't work, since the client will not have been set up in multiprocess mode. Read more about multiprocess mode: https://github.com/prometheus/client_python#multiprocess-mode-eg-gunicorn
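For context, here is a minimal sketch of how prometheus_client's multiprocess mode is typically configured (the directory path and metric name below are just examples, not BentoML's actual configuration). The important constraint is that the environment variable has to be set before prometheus_client is imported, because the client picks its value implementation at import time; that is why an early, indirect import of the default client breaks BentoML's setup.

import os

# The multiprocess directory must be set *before* prometheus_client is imported.
os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/tmp/prom_metrics")  # example path
os.makedirs(os.environ["PROMETHEUS_MULTIPROC_DIR"], exist_ok=True)

from prometheus_client import CollectorRegistry, Counter, multiprocess

# Each worker process writes metric values to files in that directory;
# a MultiProcessCollector aggregates them when /metrics is scraped.
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)

requests_total = Counter("example_requests_total", "Example counter")
requests_total.inc()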
@aarnphm Unfortunately, I cannot share my service for the moment, since it lives in a private repository of my organization (I need to take some steps before sharing).
Instead, I already shared how it can be reproduced with the examples from this repository (custom_runner/torch_hub_yolov5/service.py). Can you not reproduce it in your environment?
import sys
import torch
import pytorch_lightning as pl
del sys.modules["prometheus_client"]
...
I think this can come up frequently if we use LightningModule instead of nn.Module and save with bentoml.pytorch.save_model instead of bentoml.pytorch_lightning.save_model, i.e. without TorchScript. (Currently there are many cases where we cannot use native TorchScript with pretrained Hugging Face modules.)
No, it works perfectly fine for me.
Can you send the output of bentoml env -o bash? It would also be helpful if you could send the output of bentoml serve --debug.
I am having the same problem when I start a service using a pytorch runner like:
import bentoml
from bentoml import Service

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
When opening localhost:3000, the following error is shown:
File "/path/.venv/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
If I add the del sys.modules["prometheus_client"] statement, everything works well:
import sys

import bentoml
from bentoml import Service

del sys.modules["prometheus_client"]

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
Do you know why this is happening?
I cannot reproduce this with the given code, so I cannot help figure out where "prometheus_client" gets imported.
Due to the limitations of the "prometheus_client" library, if it has been imported directly or indirectly within the service.py file, BentoML cannot function properly. You may need to identify where it is being imported.
Using "del sys.modules["prometheus_client"]" might make it appear to work, but the exposed "/metrics" endpoint will no longer be trustworthy.
@agranadosb You may add print(sys.modules.get("prometheus_client")) at different positions to see where it is imported in the script, like the following:
import sys

print("before script:", sys.modules.get("prometheus_client"))  # 1

import bentoml
from bentoml import Service

puzzle_runner = bentoml.pytorch.get("8-puzzle:latest").to_runner()
print("after runner:", sys.modules.get("prometheus_client"))  # 2

svc = Service(
    "Test",
    runners=[puzzle_runner],
)
print("after service:", sys.modules.get("prometheus_client"))  # 3
Since I can't reproduce the issue either, positions 2 and 3 should be fine; I suspect it is already imported at position 1.
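If the prints show it is already imported before your own code runs, a temporary import hook (just a debugging sketch, not part of BentoML) placed at the very top of service.py can reveal the exact import chain:

import builtins
import sys
import traceback

_original_import = builtins.__import__

def _tracing_import(name, *args, **kwargs):
    # Print a stack trace the first time prometheus_client is imported,
    # so the module that triggers the import shows up in the output.
    if name == "prometheus_client" and "prometheus_client" not in sys.modules:
        print("--- prometheus_client is being imported here: ---")
        traceback.print_stack()
    return _original_import(name, *args, **kwargs)

builtins.__import__ = _tracing_import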
Hi @aarnphm. I recently tested this more on macOS 12.6.3 (Apple M1 Pro) with the most recent version (bentoml==1.0.23) and found the issue is still not resolved.
I added a simple import before bentoml in your example (import pytorch_lightning as pl), like this:
import sys
import torch
import pytorch_lightning as pl

import bentoml
from bentoml.io import Image
from bentoml.io import PandasDataFrame


class Yolov5Runnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self.model = torch.hub.load("ultralytics/yolov5", "yolov5s")

        if torch.cuda.is_available():
            self.model.cuda()
        else:
            self.model.cpu()

        # Config inference settings
        self.inference_size = 320

        # Optional configs
        # self.model.conf = 0.25  # NMS confidence threshold
        # self.model.iou = 0.45  # NMS IoU threshold
        # self.model.agnostic = False  # NMS class-agnostic
        # self.model.multi_label = False  # NMS multiple labels per box
        # self.model.classes = None  # (optional list) filter by class, i.e. = [0, 15, 16] for COCO persons, cats and dogs
        # self.model.max_det = 1000  # maximum number of detections per image
        # self.model.amp = False  # Automatic Mixed Precision (AMP) inference

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def inference(self, input_imgs):
        # Return predictions only
        results = self.model(input_imgs, size=self.inference_size)
        return results.pandas().xyxy

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def render(self, input_imgs):
        # Return images with boxes and labels
        return self.model(input_imgs, size=self.inference_size).render()


yolo_v5_runner = bentoml.Runner(Yolov5Runnable, max_batch_size=30)

svc = bentoml.Service("yolo_v5_demo", runners=[yolo_v5_runner])


@svc.api(input=Image(), output=PandasDataFrame())
async def invocation(input_img):
    batch_ret = await yolo_v5_runner.inference.async_run([input_img])
    return batch_ret[0]


@svc.api(input=Image(), output=Image())
async def render(input_img):
    batch_ret = await yolo_v5_runner.render.async_run([input_img])
    return batch_ret[0]
and it shows this error log:
2023-07-04T16:24:43+0900 [WARNING] [cli] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:43+0900 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
2023-07-04T16:24:43+0900 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-07-04T16:24:44+0900 [INFO] [cli] Starting production HTTP BentoServer from "service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-07-04T16:24:49+0900 [WARNING] [api_server:4] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:49+0900 [WARNING] [api_server:6] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [runner:yolov5runnable:1] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:10] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:5] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:1] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:2] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:3] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:9] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:8] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:50+0900 [WARNING] [api_server:7] Using lowercased runnable class name 'yolov5runnable' for runner.
2023-07-04T16:24:55+0900 [ERROR] [api_server:6] Exception in ASGI application
Traceback (most recent call last):
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 86, in __call__
raise exc from None
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 82, in __call__
await self.app(scope, inner_receive, inner_send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/applications.py", line 118, in __call__
await self.middleware_stack(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 82, in __call__
self._setup()
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/simple_di/__init__.py", line 139, in _
return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 43, in _setup
self.metrics_request_duration = metrics_client.Histogram(
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 149, in Histogram
return partial(self.prometheus_client.Histogram, registry=self.registry)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
2023-07-04T16:24:55+0900 [ERROR] [api_server:8] Exception in ASGI application
Traceback (most recent call last):
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 86, in __call__
raise exc from None
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py", line 82, in __call__
await self.app(scope, inner_receive, inner_send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/applications.py", line 118, in __call__
await self.middleware_stack(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 82, in __call__
self._setup()
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/simple_di/__init__.py", line 139, in _
return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 43, in _setup
self.metrics_request_duration = metrics_client.Histogram(
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 149, in Histogram
return partial(self.prometheus_client.Histogram, registry=self.registry)
File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client
assert (
AssertionError: prometheus_client is already imported, multiprocessing will not work properly
And the server shows an internal server error.
I'm not sure, since the assertion logic is in an internal file and might be related to other production concerns, but simply removing the assertion that checks for the prometheus import in BentoML's custom Prometheus client resolves this issue.
(File "/opt/anaconda3/envs/data/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 50, in prometheus_client)
Hey, is there any chance that your model doesn't contain a signature?
Can you send the output of cat $(bentoml models get model:tag -o path)/model.yaml?
Hey!
Sorry for the delay, here is the result:
>>> cat $(bentoml models get 8-puzzle:1.0 -o path)/model.yaml
name: 8-puzzle
version: '1.0'
module: bentoml.pytorch
labels: {}
options:
  partial_kwargs: {}
metadata: {}
context:
  framework_name: torch
  framework_versions:
    torch: 1.13.1+cu116
  bentoml_version: 1.0.23
  python_version: 3.8.10
signatures:
  __call__:
    batchable: true
    batch_dim:
    - 0
    - 0
api_version: v1
creation_time: '2023-07-01T12:01:44.893496+00:00'
Wait, I removed the venv I was using and now everything seems to work properly.
Hey all, apologies for bumping this issue, but I recently encountered it myself and thought I would share my experience. We use wandb extensively, including as part of our BentoML deployment. It turns out that a newer version of the library (or one of its dependencies) was importing prometheus_client; reverting to wandb version 0.13.1 solved it for me.
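If you are unsure which of your dependencies drags in prometheus_client, one option (a sketch using only the standard library; it only catches packages that declare it as a requirement, not ones that import it silently) is to scan the installed distributions:

from importlib import metadata

# List installed packages that declare prometheus_client / prometheus-client
# in their requirements, to narrow down the indirect importer.
for dist in metadata.distributions():
    requires = dist.requires or []
    hits = [r for r in requires if "prometheus" in r.lower()]
    if hits:
        print(dist.metadata["Name"], "->", hits)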
If you have a custom runnable, please check there aren't any bugs there as well, FYI.
For the yolov5 issue, please check that you have all required dependencies. You can try importing the model in a Python session first:
import torch
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
Here is a solution. TL;DR: add the imports of pytorch_lightning and torch after bentoml:
import sys

import bentoml
from bentoml.io import Image
from bentoml.io import PandasDataFrame


class Yolov5Runnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        import torch  # Here instead!
        import pytorch_lightning as pl

        self.model = torch.hub.load("ultralytics/yolov5", "yolov5s")
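Presumably this works because the heavy third-party imports (and the prometheus_client they pull in) now happen only inside the runner process when the runnable is instantiated, so the API server workers that set up BentoML's multiprocess metrics client never see a pre-imported prometheus_client.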