bug: Async Return Latency Issues with BentoML Image IO API

Open takhyun12 opened this issue 1 year ago • 0 comments

Describe the bug

Hello,

I'm facing an issue with BentoML API serving: an async endpoint that returns an image takes several seconds to respond, even though the work inside the function finishes quickly.

Here’s the simplified code:

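from io import BytesIO
import time

import bentoml
from bentoml.io import Multipart, File, Image
from PIL import Image as PILImage

# The service definition was omitted from the simplified snippet;
# the name "inpaint" is a placeholder.
service = bentoml.Service("inpaint")


@service.api(
    input=Multipart(data=File()),
    output=Image(mime_type="image/png"),
    route="/inpaint/test/wrinkles",
)
async def api_test(data: BytesIO):
    start_time = time.perf_counter()
    # Decode the uploaded file and normalize it to RGB.
    image = PILImage.open(BytesIO(data.read())).convert("RGB")
    print("Image loaded in", time.perf_counter() - start_time, "seconds")
    # The returned PIL image is encoded by the Image output descriptor.
    return image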

Loading the image is fast (0.08-0.1 seconds), but returning it from the async endpoint adds a delay of up to 10 seconds. When I tried to work around this by passing the image through a runner, the call failed with a format error:

Traceback (most recent call last):
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/server/http_app.py", line 334, in api_func
    output = await api.func(**input_data)
  File "/root/snowflake/backend/python/firm/services/inpaint/service.py", line 215, in api
    return await post_process_runner.forward.async_run(source=image, target=output_image)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 56, in async_run
    return await self.runner._runner_handle.async_run_method(self, *args, **kwargs)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 201, in async_run_method
    payload_params = Params[Payload](*args, **kwargs).map(
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/utils.py", line 65, in map
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/utils.py", line 65, in <dictcomp>
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/container.py", line 700, in to_payload
    return container_cls.to_payload(batch, batch_dim)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/bentoml/_internal/runner/container.py", line 490, in to_payload
    batch.save(buffer, format=batch.format)
  File "/root/miniconda3/envs/firm/lib/python3.10/site-packages/PIL/Image.py", line 2546, in save
    raise ValueError(msg) from e
ValueError: unknown file extension:
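
The last frames show the runner container calling batch.save(buffer, format=batch.format). Below is a minimal sketch, using Pillow alone, of what may be happening: an image returned by convert() has its format attribute set to None, and saving to an in-memory buffer with format=None raises the same ValueError. Setting the format attribute before handing the image to a runner is only an assumed workaround that I have not verified against BentoML itself.

```python
from io import BytesIO

from PIL import Image as PILImage

# Build a small in-memory PNG so the example needs no file on disk.
source = BytesIO()
PILImage.new("RGB", (8, 8), "white").save(source, format="PNG")
source.seek(0)

opened = PILImage.open(source)
converted = opened.convert("RGB")

# open() records the container format; convert() returns a new image without it.
print(opened.format)     # PNG
print(converted.format)  # None

# Mirrors the call in the traceback: save(buffer, format=batch.format).
try:
    converted.save(BytesIO(), format=converted.format)
    save_error = None
except ValueError as exc:
    save_error = str(exc)
print(save_error)

# Assumed workaround: give the converted image an explicit format first.
converted.format = "PNG"
buffer = BytesIO()
converted.save(buffer, format=converted.format)
encoded = buffer.getvalue()
```

If this is the cause, the empty text after "unknown file extension:" in the traceback would be explained by the buffer having no filename to infer a format from.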

How can I properly handle async returns with bentoml.io.Image() to avoid these delays?
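
To narrow down where the time goes, the encoding step can be timed separately from loading. This is a minimal sketch, assuming (not confirmed) that the delay comes from PNG encoding of a large image when the response is serialized. The helper name time_png_encode, the noise test image, and the chosen compress levels are illustrative, not part of my service.

```python
import time
from io import BytesIO

from PIL import Image as PILImage


def time_png_encode(image: PILImage.Image, compress_level: int) -> tuple[float, int]:
    """Return (seconds, encoded size in bytes) for one PNG encode."""
    buffer = BytesIO()
    start = time.perf_counter()
    image.save(buffer, format="PNG", compress_level=compress_level)
    elapsed = time.perf_counter() - start
    return elapsed, len(buffer.getvalue())


# Hypothetical test image; replace it with a real request payload.
image = PILImage.effect_noise((1024, 1024), 64).convert("RGB")

# Pillow's default PNG compress_level is 6; 1 is fastest, 9 is smallest.
for level in (1, 6, 9):
    seconds, size = time_png_encode(image, level)
    print(f"compress_level={level}: {seconds:.3f} s, {size} bytes")
```

If the encode time on a real payload is close to the observed delay, the bottleneck would be serialization rather than the async return itself.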

Thank you for your assistance.

To reproduce

No response

Expected behavior

No response

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.2.16
python: 3.10.14
platform: Linux-4.18.0-425.19.2.el8_7.x86_64-x86_64-with-glibc2.31
uid_gid: 0:0
conda: 23.5.0
in_conda_env: True

conda_packages

name: firm
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h5eee18b_6
  - ca-certificates=2024.3.11=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_1
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.14=h5eee18b_0
  - pip=24.0=py310h06a4308_0
  - python=3.10.14=h955ad1f_1
  - readline=8.2=h5eee18b_0
  - setuptools=69.5.1=py310h06a4308_0
  - sqlite=3.45.3=h5eee18b_0
  - tk=8.6.14=h39e8969_0
  - wheel=0.43.0=py310h06a4308_0
  - xz=5.4.6=h5eee18b_1
  - zlib=1.2.13=h5eee18b_1
  - pip:
      - absl-py==2.1.0
      - aenum==3.1.15
      - aiofiles==24.1.0
      - aiohttp==3.9.5
      - aioresponses==0.7.6
      - aiosignal==1.3.1
      - annotated-types==0.7.0
      - anyio==4.4.0
      - appdirs==1.4.4
      - apscheduler==3.10.1
      - asgiref==3.8.1
      - asttokens==2.4.1
      - async-timeout==4.0.3
      - attrs==23.2.0
      - awscli==1.29.54
      - backoff==2.2.1
      - bentoml==1.2.16
      - blendmodes==2024.1
      - boto3==1.28.23
      - botocore==1.31.54
      - build==1.2.1
      - cachetools==5.3.3
      - cattrs==23.1.2
      - certifi==2024.7.4
      - cffi==1.16.0
      - charset-normalizer==3.3.2
      - circus==0.18.0
      - click==8.1.7
      - click-option-group==0.5.6
      - cloudpickle==3.0.0
      - cmake==3.30.0
      - colorama==0.4.4
      - coloredlogs==15.0.1
      - comm==0.2.2
      - contourpy==1.2.1
      - coverage==7.5.1
      - cryptography==42.0.8
      - cycler==0.12.1
      - cython==3.0.0
      - dataclasses-json==0.6.7
      - decorator==5.1.1
      - deepmerge==1.1.1
      - defusedxml==0.7.1
      - deprecated==1.2.14
      - distro==1.9.0
      - docker==6.1.3
      - docutils==0.16
      - exceptiongroup==1.2.1
      - executing==2.0.1
      - fastapi==0.110.3
      - filelock==3.15.4
      - flatbuffers==24.3.25
      - fonttools==4.53.1
      - frozenlist==1.4.1
      - fs==2.4.16
      - fsspec==2024.6.1
      - google-api-core==2.19.1
      - google-api-python-client==2.136.0
      - google-auth==2.31.0
      - google-auth-httplib2==0.2.0
      - google-cloud-core==2.4.1
      - google-cloud-storage==2.17.0
      - google-crc32c==1.5.0
      - google-resumable-media==2.7.1
      - googleapis-common-protos==1.63.2
      - gputil==1.4.0
      - h11==0.14.0
      - httpcore==1.0.5
      - httplib2==0.22.0
      - httpx==0.27.0
      - huggingface-hub==0.23.4
      - humanfriendly==10.0
      - idna==3.7
      - imagehash==4.3.1
      - imageio==2.34.2
      - importlib-metadata==6.11.0
      - inference-gpu==0.12.0
      - inflection==0.5.1
      - iniconfig==2.0.0
      - ipython==8.26.0
      - ipywidgets==8.1.3
      - jax==0.4.30
      - jaxlib==0.4.30
      - jedi==0.19.1
      - jinja2==3.1.4
      - jmespath==1.0.1
      - jsonschema==4.22.0
      - jsonschema-specifications==2023.12.1
      - jupyterlab-widgets==3.0.11
      - kiwisolver==1.4.5
      - lazy-loader==0.4
      - line-profiler==4.1.3
      - lit==18.1.8
      - markdown-it-py==3.0.0
      - markupsafe==2.1.5
      - marshmallow==3.21.3
      - matplotlib==3.9.1
      - matplotlib-inline==0.1.7
      - mdurl==0.1.2
      - mediapipe==0.10.14
      - ml-dtypes==0.4.0
      - mpmath==1.3.0
      - multidict==6.0.5
      - mypy-extensions==1.0.0
      - natsort==8.4.0
      - networkx==3.3
      - numpy==1.26.4
      - nvidia-cublas-cu11==11.10.3.66
      - nvidia-cuda-cupti-cu11==11.7.101
      - nvidia-cuda-nvrtc-cu11==11.7.99
      - nvidia-cuda-runtime-cu11==11.7.99
      - nvidia-cudnn-cu11==8.5.0.96
      - nvidia-cufft-cu11==10.9.0.58
      - nvidia-curand-cu11==10.2.10.91
      - nvidia-cusolver-cu11==11.4.0.1
      - nvidia-cusparse-cu11==11.7.4.91
      - nvidia-ml-py==11.525.150
      - nvidia-nccl-cu11==2.14.3
      - nvidia-nvtx-cu11==11.7.91
      - onnxruntime-gpu==1.15.1
      - openai==1.35.10
      - opencv-contrib-python==4.10.0.84
      - opencv-python==4.8.0.76
      - opencv-python-headless==4.10.0.84
      - opentelemetry-api==1.20.0
      - opentelemetry-instrumentation==0.41b0
      - opentelemetry-instrumentation-aiohttp-client==0.41b0
      - opentelemetry-instrumentation-asgi==0.41b0
      - opentelemetry-sdk==1.20.0
      - opentelemetry-semantic-conventions==0.41b0
      - opentelemetry-util-http==0.41b0
      - opt-einsum==3.3.0
      - packaging==24.1
      - pandas==2.2.2
      - parso==0.8.4
      - pathspec==0.12.1
      - pendulum==3.0.0
      - pexpect==4.9.0
      - piexif==1.1.3
      - pillow==10.4.0
      - pillow-heif==0.14.0
      - pip-requirements-parser==32.0.1
      - pip-tools==7.4.1
      - pluggy==1.5.0
      - prettytable==3.10.0
      - prometheus-client==0.20.0
      - prometheus-fastapi-instrumentator==6.0.0
      - prompt-toolkit==3.0.47
      - proto-plus==1.24.0
      - protobuf==4.25.3
      - psutil==6.0.0
      - ptyprocess==0.7.0
      - pulp==2.8.0
      - pure-eval==0.2.2
      - py-cpuinfo==9.0.0
      - pyasn1==0.6.0
      - pyasn1-modules==0.4.0
      - pybase64==1.3.2
      - pycparser==2.22
      - pydantic==2.8.2
      - pydantic-core==2.20.1
      - pydot==2.0.0
      - pyfacer==0.0.4
      - pygments==2.18.0
      - pyparsing==3.1.2
      - pyproject-hooks==1.1.0
      - pytest==8.2.2
      - pytest-asyncio==0.21.1
      - python-dateutil==2.9.0.post0
      - python-dotenv==1.0.1
      - python-json-logger==2.0.7
      - python-multipart==0.0.9
      - pytz==2024.1
      - pywavelets==1.6.0
      - pyyaml==6.0.1
      - pyzmq==26.0.3
      - redis==5.0.7
      - referencing==0.35.1
      - requests==2.31.0
      - requests-toolbelt==1.0.0
      - rich==13.5.2
      - rpds-py==0.18.1
      - rsa==4.7.2
      - s3transfer==0.6.2
      - safetensors==0.4.3
      - schema==0.7.7
      - scikit-image==0.24.0
      - scipy==1.14.0
      - seaborn==0.13.2
      - shapely==2.0.1
      - simple-di==0.1.5
      - six==1.16.0
      - skypilot==0.5.0
      - sniffio==1.3.1
      - sounddevice==0.4.7
      - stack-data==0.6.3
      - starlette==0.37.2
      - structlog==24.2.0
      - supervision==0.21.0
      - sympy==1.12.1
      - tabulate==0.9.0
      - thop==0.1.1-2209072238
      - tifffile==2024.7.2
      - time-machine==2.14.2
      - timm==1.0.3
      - tomli==2.0.1
      - tomli-w==1.0.0
      - torch==2.0.1
      - torchaudio==2.0.2
      - torchvision==0.15.2
      - tornado==6.4.1
      - tqdm==4.66.4
      - traitlets==5.14.3
      - triton==2.0.0
      - typer==0.9.0
      - typing-extensions==4.12.2
      - typing-inspect==0.9.0
      - tzdata==2024.1
      - tzlocal==5.2
      - ultralytics==8.2.18
      - uritemplate==4.1.1
      - urllib3==1.26.19
      - uvicorn==0.30.1
      - validators==0.30.0
      - watchfiles==0.22.0
      - wcwidth==0.2.13
      - websocket-client==1.8.0
      - widgetsnbextension==4.0.11
      - wrapt==1.16.0
      - yarl==1.9.4
      - zipp==3.19.2
      - zxing-cpp==2.2.0
prefix: /root/miniconda3/envs/firm

pip_packages

absl-py==2.1.0
aenum==3.1.15
aiofiles==24.1.0
aiohttp==3.9.5
aioresponses==0.7.6
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
appdirs==1.4.4
APScheduler==3.10.1
asgiref==3.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
awscli==1.29.54
backoff==2.2.1
bentoml==1.2.16
blendmodes==2024.1
boto3==1.28.23
botocore==1.31.54
build==1.2.1
cachetools==5.3.3
cattrs==23.1.2
certifi==2024.7.4
cffi==1.16.0
charset-normalizer==3.3.2
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==3.0.0
cmake==3.30.0
colorama==0.4.4
coloredlogs==15.0.1
comm==0.2.2
contourpy==1.2.1
coverage==7.5.1
cryptography==42.0.8
cycler==0.12.1
Cython==3.0.0
dataclasses-json==0.6.7
decorator==5.1.1
deepmerge==1.1.1
defusedxml==0.7.1
Deprecated==1.2.14
distro==1.9.0
docker==6.1.3
docutils==0.16
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.110.3
filelock==3.15.4
flatbuffers==24.3.25
fonttools==4.53.1
frozenlist==1.4.1
fs==2.4.16
fsspec==2024.6.1
google-api-core==2.19.1
google-api-python-client==2.136.0
google-auth==2.31.0
google-auth-httplib2==0.2.0
google-cloud-core==2.4.1
google-cloud-storage==2.17.0
google-crc32c==1.5.0
google-resumable-media==2.7.1
googleapis-common-protos==1.63.2
GPUtil==1.4.0
h11==0.14.0
httpcore==1.0.5
httplib2==0.22.0
httpx==0.27.0
huggingface-hub==0.23.4
humanfriendly==10.0
idna==3.7
ImageHash==4.3.1
imageio==2.34.2
importlib-metadata==6.11.0
inference-gpu==0.12.0
inflection==0.5.1
iniconfig==2.0.0
ipython==8.26.0
ipywidgets==8.1.3
jax==0.4.30
jaxlib==0.4.30
jedi==0.19.1
Jinja2==3.1.4
jmespath==1.0.1
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
jupyterlab_widgets==3.0.11
kiwisolver==1.4.5
lazy_loader==0.4
line_profiler==4.1.3
lit==18.1.8
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.1
matplotlib-inline==0.1.7
mdurl==0.1.2
mediapipe==0.10.14
ml-dtypes==0.4.0
mpmath==1.3.0
multidict==6.0.5
mypy-extensions==1.0.0
natsort==8.4.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-ml-py==11.525.150
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
onnxruntime-gpu==1.15.1
openai==1.35.10
opencv-contrib-python==4.10.0.84
opencv-python==4.8.0.76
opencv-python-headless==4.10.0.84
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
opt-einsum==3.3.0
packaging==24.1
pandas==2.2.2
parso==0.8.4
pathspec==0.12.1
pendulum==3.0.0
pexpect==4.9.0
piexif==1.1.3
pillow==10.4.0
pillow-heif==0.14.0
pip-requirements-parser==32.0.1
pip-tools==7.4.1
pluggy==1.5.0
prettytable==3.10.0
prometheus-fastapi-instrumentator==6.0.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
proto-plus==1.24.0
protobuf==4.25.3
psutil==6.0.0
ptyprocess==0.7.0
PuLP==2.8.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pybase64==1.3.2
pycparser==2.22
pydantic==2.8.2
pydantic_core==2.20.1
pydot==2.0.0
pyfacer==0.0.4
Pygments==2.18.0
pyparsing==3.1.2
pyproject_hooks==1.1.0
pytest==8.2.2
pytest-asyncio==0.21.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
python-multipart==0.0.9
pytz==2024.1
PyWavelets==1.6.0
PyYAML==6.0.1
pyzmq==26.0.3
redis==5.0.7
referencing==0.35.1
requests==2.31.0
requests-toolbelt==1.0.0
rich==13.5.2
rpds-py==0.18.1
rsa==4.7.2
s3transfer==0.6.2
safetensors==0.4.3
schema==0.7.7
scikit-image==0.24.0
scipy==1.14.0
seaborn==0.13.2
shapely==2.0.1
simple-di==0.1.5
six==1.16.0
skypilot==0.5.0
sniffio==1.3.1
sounddevice==0.4.7
stack-data==0.6.3
starlette==0.37.2
structlog==24.2.0
supervision==0.21.0
sympy==1.12.1
tabulate==0.9.0
thop==0.1.1.post2209072238
tifffile==2024.7.2
time-machine==2.14.2
timm==1.0.3
tomli==2.0.1
tomli_w==1.0.0
torch==2.0.1
torchaudio==2.0.2
torchvision==0.15.2
tornado==6.4.1
tqdm==4.66.4
traitlets==5.14.3
triton==2.0.0
typer==0.9.0
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
ultralytics==8.2.18
uritemplate==4.1.1
urllib3==1.26.19
uvicorn==0.30.1
validators==0.30.0
watchfiles==0.22.0
wcwidth==0.2.13
websocket-client==1.8.0
widgetsnbextension==4.0.11
wrapt==1.16.0
yarl==1.9.4
zipp==3.19.2
zxing-cpp==2.2.0

takhyun12, Jul 16 '24 08:07