bug: Async Return Latency Issues with BentoML Image IO API
Describe the bug
Hello,
I'm facing an issue with BentoML API serving where significant delays occur during the async return of images.
Here’s the simplified code:
from io import BytesIO
import time

from bentoml.io import Multipart, File, Image
from PIL import Image as PILImage

@service.api(
    input=Multipart(data=File()),
    output=Image(mime_type="image/png"),
    route="/inpaint/test/wrinkles",
)
async def api_test(data: BytesIO):
    start_time = time.time()
    image = PILImage.open(BytesIO(data.read())).convert("RGB")
    print("Image loaded in", time.time() - start_time, "seconds")
    return image
Loading the image is efficient (0.08-0.1 seconds), but returning it asynchronously incurs a delay of up to 10 seconds. Attempting to resolve this with a runner leads to a format error:
Traceback (most recent call last):
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/server/http_app.py", line 334, in api_func
    output = await api.func(**input_data)
  File "/root/snowflake/backend/python/firm/services/inpaint/service.py", line 215, in api
    return await post_process_runner.forward.async_run(source=image, target=output_image)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 56, in async_run
    return await self.runner._runner_handle.async_run_method(self, *args, **kwargs)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 201, in async_run_method
    payload_params = Params[Payload](*args, **kwargs).map(
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/utils.py", line 65, in map
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/utils.py", line 65, in <dictcomp>
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/container.py", line 700, in to_payload
    return container_cls.to_payload(batch, batch_dim)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/container.py", line 490, in to_payload
    batch.save(buffer, format=batch.format)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/PIL/Image.py", line 2546, in save
    raise ValueError(msg) from e
ValueError: unknown file extension:
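The last frames of the traceback call `batch.save(buffer, format=batch.format)`. One plausible reading is that the image passed to the runner came from `.convert("RGB")`, and Pillow sets `format` to `None` on images it creates itself, so `save()` has neither a format nor a filename to infer one from. The following minimal sketch reproduces that behaviour with Pillow alone. It does not use BentoML, and the variable names are illustrative only:

```python
from io import BytesIO

from PIL import Image as PILImage

# Build a small PNG in memory so the example needs no files on disk.
src = BytesIO()
PILImage.new("RGBA", (8, 8)).save(src, format="PNG")
src.seek(0)

opened = PILImage.open(src)        # decoded from a file: format is set
converted = opened.convert("RGB")  # created by Pillow: format is None

print("opened.format:", opened.format)
print("converted.format:", converted.format)

# Saving to an in-memory buffer with format=None gives Pillow no
# extension to infer the format from, so it raises ValueError.
try:
    converted.save(BytesIO(), format=converted.format)
    error = None
except ValueError as exc:
    error = exc
print("save with format=None:", repr(error))

# Passing an explicit format avoids the failure.
out = BytesIO()
converted.save(out, format="PNG")
print("explicit PNG save wrote", len(out.getvalue()), "bytes")
```

If this is indeed the cause, a possible workaround is to pass the runner an image whose `format` attribute is set, or to pass already-encoded bytes instead. Whether either fits the runner signature in the actual service is not something the report confirms.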
How can I properly handle async returns with bentoml.io.Image() to avoid these delays?
Thank you for your assistance.
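To narrow down where the delay comes from, it may help to time the PNG encoding separately from the handler. The sketch below assumes (this is not confirmed in the report) that most of the time is spent serializing the returned PIL image to PNG after the handler returns, which is CPU-bound work that would block the event loop if run inline. `encode_png` and `encode_png_async` are hypothetical helpers, not BentoML APIs. They use Pillow's `compress_level` option for PNG and `asyncio.to_thread` from the standard library:

```python
import asyncio
import time
from io import BytesIO

from PIL import Image as PILImage


def encode_png(image, compress_level=6):
    """Encode a PIL image to PNG bytes (hypothetical helper)."""
    buffer = BytesIO()
    image.save(buffer, format="PNG", compress_level=compress_level)
    return buffer.getvalue()


async def encode_png_async(image, compress_level=6):
    # Run the CPU-bound encode in a worker thread so the event loop
    # can keep serving other requests in the meantime.
    return await asyncio.to_thread(encode_png, image, compress_level)


async def main():
    # Stand-in image; substitute a real request payload when profiling.
    image = PILImage.new("RGB", (1024, 1024), color=(120, 80, 40))
    timings = {}
    for level in (6, 1):
        start = time.perf_counter()
        payload = await encode_png_async(image, compress_level=level)
        timings[level] = (time.perf_counter() - start, len(payload))
    return timings


timings = asyncio.run(main())
for level, (seconds, size) in timings.items():
    print(f"compress_level={level}: {seconds:.3f}s, {size} bytes")
```

If the encode time measured on a real input is close to the observed delay, a lower `compress_level` or returning pre-encoded bytes could be worth trying. If it is far smaller, the delay probably lies elsewhere in the request path.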
To reproduce
No response
Expected behavior
No response
Environment
Environment variable
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
System information
bentoml: 1.2.16
python: 3.10.14
platform: Linux-4.18.0-425.19.2.el8_7.x86_64-x86_64-with-glibc2.31
uid_gid: 0:0
conda: 23.5.0
in_conda_env: True
conda_packages
name: firm
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h5eee18b_6
- ca-certificates=2024.3.11=h06a4308_0
- ld_impl_linux-64=2.38=h1181459_1
- libffi=3.4.4=h6a678d5_1
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.14=h5eee18b_0
- pip=24.0=py310h06a4308_0
- python=3.10.14=h955ad1f_1
- readline=8.2=h5eee18b_0
- setuptools=69.5.1=py310h06a4308_0
- sqlite=3.45.3=h5eee18b_0
- tk=8.6.14=h39e8969_0
- wheel=0.43.0=py310h06a4308_0
- xz=5.4.6=h5eee18b_1
- zlib=1.2.13=h5eee18b_1
- pip:
- absl-py==2.1.0
- aenum==3.1.15
- aiofiles==24.1.0
- aiohttp==3.9.5
- aioresponses==0.7.6
- aiosignal==1.3.1
- annotated-types==0.7.0
- anyio==4.4.0
- appdirs==1.4.4
- apscheduler==3.10.1
- asgiref==3.8.1
- asttokens==2.4.1
- async-timeout==4.0.3
- attrs==23.2.0
- awscli==1.29.54
- backoff==2.2.1
- bentoml==1.2.16
- blendmodes==2024.1
- boto3==1.28.23
- botocore==1.31.54
- build==1.2.1
- cachetools==5.3.3
- cattrs==23.1.2
- certifi==2024.7.4
- cffi==1.16.0
- charset-normalizer==3.3.2
- circus==0.18.0
- click==8.1.7
- click-option-group==0.5.6
- cloudpickle==3.0.0
- cmake==3.30.0
- colorama==0.4.4
- coloredlogs==15.0.1
- comm==0.2.2
- contourpy==1.2.1
- coverage==7.5.1
- cryptography==42.0.8
- cycler==0.12.1
- cython==3.0.0
- dataclasses-json==0.6.7
- decorator==5.1.1
- deepmerge==1.1.1
- defusedxml==0.7.1
- deprecated==1.2.14
- distro==1.9.0
- docker==6.1.3
- docutils==0.16
- exceptiongroup==1.2.1
- executing==2.0.1
- fastapi==0.110.3
- filelock==3.15.4
- flatbuffers==24.3.25
- fonttools==4.53.1
- frozenlist==1.4.1
- fs==2.4.16
- fsspec==2024.6.1
- google-api-core==2.19.1
- google-api-python-client==2.136.0
- google-auth==2.31.0
- google-auth-httplib2==0.2.0
- google-cloud-core==2.4.1
- google-cloud-storage==2.17.0
- google-crc32c==1.5.0
- google-resumable-media==2.7.1
- googleapis-common-protos==1.63.2
- gputil==1.4.0
- h11==0.14.0
- httpcore==1.0.5
- httplib2==0.22.0
- httpx==0.27.0
- huggingface-hub==0.23.4
- humanfriendly==10.0
- idna==3.7
- imagehash==4.3.1
- imageio==2.34.2
- importlib-metadata==6.11.0
- inference-gpu==0.12.0
- inflection==0.5.1
- iniconfig==2.0.0
- ipython==8.26.0
- ipywidgets==8.1.3
- jax==0.4.30
- jaxlib==0.4.30
- jedi==0.19.1
- jinja2==3.1.4
- jmespath==1.0.1
- jsonschema==4.22.0
- jsonschema-specifications==2023.12.1
- jupyterlab-widgets==3.0.11
- kiwisolver==1.4.5
- lazy-loader==0.4
- line-profiler==4.1.3
- lit==18.1.8
- markdown-it-py==3.0.0
- markupsafe==2.1.5
- marshmallow==3.21.3
- matplotlib==3.9.1
- matplotlib-inline==0.1.7
- mdurl==0.1.2
- mediapipe==0.10.14
- ml-dtypes==0.4.0
- mpmath==1.3.0
- multidict==6.0.5
- mypy-extensions==1.0.0
- natsort==8.4.0
- networkx==3.3
- numpy==1.26.4
- nvidia-cublas-cu11==11.10.3.66
- nvidia-cuda-cupti-cu11==11.7.101
- nvidia-cuda-nvrtc-cu11==11.7.99
- nvidia-cuda-runtime-cu11==11.7.99
- nvidia-cudnn-cu11==8.5.0.96
- nvidia-cufft-cu11==10.9.0.58
- nvidia-curand-cu11==10.2.10.91
- nvidia-cusolver-cu11==11.4.0.1
- nvidia-cusparse-cu11==11.7.4.91
- nvidia-ml-py==11.525.150
- nvidia-nccl-cu11==2.14.3
- nvidia-nvtx-cu11==11.7.91
- onnxruntime-gpu==1.15.1
- openai==1.35.10
- opencv-contrib-python==4.10.0.84
- opencv-python==4.8.0.76
- opencv-python-headless==4.10.0.84
- opentelemetry-api==1.20.0
- opentelemetry-instrumentation==0.41b0
- opentelemetry-instrumentation-aiohttp-client==0.41b0
- opentelemetry-instrumentation-asgi==0.41b0
- opentelemetry-sdk==1.20.0
- opentelemetry-semantic-conventions==0.41b0
- opentelemetry-util-http==0.41b0
- opt-einsum==3.3.0
- packaging==24.1
- pandas==2.2.2
- parso==0.8.4
- pathspec==0.12.1
- pendulum==3.0.0
- pexpect==4.9.0
- piexif==1.1.3
- pillow==10.4.0
- pillow-heif==0.14.0
- pip-requirements-parser==32.0.1
- pip-tools==7.4.1
- pluggy==1.5.0
- prettytable==3.10.0
- prometheus-client==0.20.0
- prometheus-fastapi-instrumentator==6.0.0
- prompt-toolkit==3.0.47
- proto-plus==1.24.0
- protobuf==4.25.3
- psutil==6.0.0
- ptyprocess==0.7.0
- pulp==2.8.0
- pure-eval==0.2.2
- py-cpuinfo==9.0.0
- pyasn1==0.6.0
- pyasn1-modules==0.4.0
- pybase64==1.3.2
- pycparser==2.22
- pydantic==2.8.2
- pydantic-core==2.20.1
- pydot==2.0.0
- pyfacer==0.0.4
- pygments==2.18.0
- pyparsing==3.1.2
- pyproject-hooks==1.1.0
- pytest==8.2.2
- pytest-asyncio==0.21.1
- python-dateutil==2.9.0.post0
- python-dotenv==1.0.1
- python-json-logger==2.0.7
- python-multipart==0.0.9
- pytz==2024.1
- pywavelets==1.6.0
- pyyaml==6.0.1
- pyzmq==26.0.3
- redis==5.0.7
- referencing==0.35.1
- requests==2.31.0
- requests-toolbelt==1.0.0
- rich==13.5.2
- rpds-py==0.18.1
- rsa==4.7.2
- s3transfer==0.6.2
- safetensors==0.4.3
- schema==0.7.7
- scikit-image==0.24.0
- scipy==1.14.0
- seaborn==0.13.2
- shapely==2.0.1
- simple-di==0.1.5
- six==1.16.0
- skypilot==0.5.0
- sniffio==1.3.1
- sounddevice==0.4.7
- stack-data==0.6.3
- starlette==0.37.2
- structlog==24.2.0
- supervision==0.21.0
- sympy==1.12.1
- tabulate==0.9.0
- thop==0.1.1-2209072238
- tifffile==2024.7.2
- time-machine==2.14.2
- timm==1.0.3
- tomli==2.0.1
- tomli-w==1.0.0
- torch==2.0.1
- torchaudio==2.0.2
- torchvision==0.15.2
- tornado==6.4.1
- tqdm==4.66.4
- traitlets==5.14.3
- triton==2.0.0
- typer==0.9.0
- typing-extensions==4.12.2
- typing-inspect==0.9.0
- tzdata==2024.1
- tzlocal==5.2
- ultralytics==8.2.18
- uritemplate==4.1.1
- urllib3==1.26.19
- uvicorn==0.30.1
- validators==0.30.0
- watchfiles==0.22.0
- wcwidth==0.2.13
- websocket-client==1.8.0
- widgetsnbextension==4.0.11
- wrapt==1.16.0
- yarl==1.9.4
- zipp==3.19.2
- zxing-cpp==2.2.0
prefix: /root/miniconda3/envs/firm
pip_packages
absl-py==2.1.0
aenum==3.1.15
aiofiles==24.1.0
aiohttp==3.9.5
aioresponses==0.7.6
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
appdirs==1.4.4
APScheduler==3.10.1
asgiref==3.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
awscli==1.29.54
backoff==2.2.1
bentoml==1.2.16
blendmodes==2024.1
boto3==1.28.23
botocore==1.31.54
build==1.2.1
cachetools==5.3.3
cattrs==23.1.2
certifi==2024.7.4
cffi==1.16.0
charset-normalizer==3.3.2
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==3.0.0
cmake==3.30.0
colorama==0.4.4
coloredlogs==15.0.1
comm==0.2.2
contourpy==1.2.1
coverage==7.5.1
cryptography==42.0.8
cycler==0.12.1
Cython==3.0.0
dataclasses-json==0.6.7
decorator==5.1.1
deepmerge==1.1.1
defusedxml==0.7.1
Deprecated==1.2.14
distro==1.9.0
docker==6.1.3
docutils==0.16
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.110.3
filelock==3.15.4
flatbuffers==24.3.25
fonttools==4.53.1
frozenlist==1.4.1
fs==2.4.16
fsspec==2024.6.1
google-api-core==2.19.1
google-api-python-client==2.136.0
google-auth==2.31.0
google-auth-httplib2==0.2.0
google-cloud-core==2.4.1
google-cloud-storage==2.17.0
google-crc32c==1.5.0
google-resumable-media==2.7.1
googleapis-common-protos==1.63.2
GPUtil==1.4.0
h11==0.14.0
httpcore==1.0.5
httplib2==0.22.0
httpx==0.27.0
huggingface-hub==0.23.4
humanfriendly==10.0
idna==3.7
ImageHash==4.3.1
imageio==2.34.2
importlib-metadata==6.11.0
inference-gpu==0.12.0
inflection==0.5.1
iniconfig==2.0.0
ipython==8.26.0
ipywidgets==8.1.3
jax==0.4.30
jaxlib==0.4.30
jedi==0.19.1
Jinja2==3.1.4
jmespath==1.0.1
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
jupyterlab_widgets==3.0.11
kiwisolver==1.4.5
lazy_loader==0.4
line_profiler==4.1.3
lit==18.1.8
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdurl==0.1.2
mediapipe==0.10.14
ml-dtypes==0.4.0
mpmath==1.3.0
multidict==6.0.5
mypy-extensions==1.0.0
natsort==8.4.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-ml-py==11.525.150
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
onnxruntime-gpu==1.15.1
openai==1.35.10
opencv-contrib-python==4.10.0.84
opencv-python==4.8.0.76
opencv-python-headless==4.10.0.84
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
opt-einsum==3.3.0
packaging==24.1
pandas==2.2.2
parso==0.8.4
pathspec==0.12.1
pendulum==3.0.0
pexpect==4.9.0
piexif==1.1.3
pillow==10.4.0
pillow-heif==0.14.0
pip-requirements-parser==32.0.1
pip-tools==7.4.1
pluggy==1.5.0
prettytable==3.10.0
prometheus-fastapi-instrumentator==6.0.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
proto-plus==1.24.0
protobuf==4.25.3
psutil==6.0.0
ptyprocess==0.7.0
PuLP==2.8.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pybase64==1.3.2
pycparser==2.22
pydantic==2.8.2
pydantic_core==2.20.1
pydot==2.0.0
pyfacer==0.0.4
Pygments==2.18.0
pyparsing==3.1.2
pyproject_hooks==1.1.0
pytest==8.2.2
pytest-asyncio==0.21.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
python-multipart==0.0.9
pytz==2024.1
PyWavelets==1.6.0
PyYAML==6.0.1
pyzmq==26.0.3
redis==5.0.7
referencing==0.35.1
requests==2.31.0
requests-toolbelt==1.0.0
rich==13.5.2
rpds-py==0.18.1
rsa==4.7.2
s3transfer==0.6.2
safetensors==0.4.3
schema==0.7.7
scikit-image==0.24.0
scipy==1.14.0
seaborn==0.13.2
shapely==2.0.1
simple-di==0.1.5
six==1.16.0
skypilot==0.5.0
sniffio==1.3.1
sounddevice==0.4.7
stack-data==0.6.3
starlette==0.37.2
structlog==24.2.0
supervision==0.21.0
sympy==1.12.1
tabulate==0.9.0
thop==0.1.1.post2209072238
tifffile==2024.7.2
time-machine==2.14.2
timm==1.0.3
tomli==2.0.1
tomli_w==1.0.0
torch==2.0.1
torchaudio==2.0.2
torchvision==0.15.2
tornado==6.4.1
tqdm==4.66.4
traitlets==5.14.3
triton==2.0.0
typer==0.9.0
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
ultralytics==8.2.18
uritemplate==4.1.1
urllib3==1.26.19
uvicorn==0.30.1
validators==0.30.0
watchfiles==0.22.0
wcwidth==0.2.13
websocket-client==1.8.0
widgetsnbextension==4.0.11
wrapt==1.16.0
yarl==1.9.4
zipp==3.19.2
zxing-cpp==2.2.0