bug: Win 11, CUDA, bentoml serve. API service significantly slows SD generation.
Describe the bug
SD generation is incredibly slow with pytorch 1.13.1 + CUDA 11.7 when I run `BENTOML_CONFIG=configuration.yaml PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 bentoml serve service:svc --production` locally.
When I submit the request through Postman, generation is roughly 100x slower while the API service is processing the request than if I simply cancel it. Once the request is cancelled, generation time drops from about 10 minutes to about 10-20 seconds.
I can confirm that `self.device` is `"cuda"`.
To reproduce
- BENTOML_CONFIG=configuration.yaml PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 bentoml serve service:svc --production
- Submit a request to txt2img through Postman
- Observe that generation takes a very long time (about 10 minutes)
- Cancel the request
- Observe that SD generation now completes much faster (about 10-20 seconds)
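The submit-then-cancel pattern above can also be reproduced without Postman. The sketch below builds the HTTP request with the standard library; the `/txt2img` endpoint path and the JSON payload shape are assumptions about the service API, not taken from the actual `service:svc` code, so adjust them to match:

```python
import json
import urllib.request

# Endpoint path and payload shape are assumptions about a typical
# txt2img BentoML service; adjust to match the real service:svc API.
def build_txt2img_request(base_url: str, prompt: str) -> urllib.request.Request:
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/txt2img",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def submit(req: urllib.request.Request, timeout_s: float = 30.0) -> bytes:
    # A short client timeout mimics cancelling the Postman request:
    # the client gives up while the server continues on its own.
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read()

req = build_txt2img_request("http://127.0.0.1:3000", "a red fox in snow")
```

Calling `submit(req, timeout_s=10)` against the running server and letting it time out reproduces the "cancelled request" case programmatically.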
I added `image.save('./img.jpg')` before the response so that I can inspect the generated image, and the image is generated correctly.
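For reference, a minimal sketch of that debug check (the helper name and placeholder image are hypothetical, not the actual service code):

```python
from PIL import Image

def debug_save(image: Image.Image, path: str = "./img.jpg") -> str:
    # Persist the generated image before returning the response, so the
    # output can be inspected even when the HTTP response never arrives.
    image.convert("RGB").save(path, format="JPEG")
    return path

# Placeholder image standing in for the Stable Diffusion output:
placeholder = Image.new("RGB", (64, 64), color="gray")
saved_path = debug_save(placeholder, "img.jpg")
```

Since the saved file looks correct, the slowdown appears to be in serving the response rather than in generation itself.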
Expected behavior
SD should generate an image in under 15 seconds on this hardware.
Environment
Environment variable
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
System information
bentoml: 1.0.13
python: 3.10.5
platform: Windows-10-10.0.22621-SP0
is_window_admin: False
pip_packages
accelerate==0.15.0
aiohttp==3.8.3
aiosignal==1.3.1
anyio==3.6.2
appdirs==1.4.4
asgiref==3.6.0
async-timeout==4.0.2
attrs==22.2.0
backoff==2.2.1
bentoml==1.0.13
build==0.10.0
cattrs==22.2.0
certifi==2022.12.7
charset-normalizer==2.1.1
circus==0.18.0
click==8.1.3
click-option-group==0.5.5
cloudpickle==2.2.1
colorama==0.4.6
contextlib2==21.6.0
deepmerge==1.1.0
Deprecated==1.2.13
diffusers==0.11.1
exceptiongroup==1.1.0
fastapi==0.89.1
filelock==3.9.0
frozenlist==1.3.3
fs==2.4.16
ftfy==6.1.1
googleapis-common-protos==1.58.0
h11==0.14.0
huggingface-hub==0.11.1
idna==3.4
importlib-metadata==6.0.0
Jinja2==3.1.2
markdown-it-py==2.1.0
MarkupSafe==2.1.2
mdurl==0.1.2
multidict==6.0.4
numpy==1.24.1
opentelemetry-api==1.14.0
opentelemetry-exporter-otlp-proto-http==1.14.0
opentelemetry-instrumentation==0.35b0
opentelemetry-instrumentation-aiohttp-client==0.35b0
opentelemetry-instrumentation-asgi==0.35b0
opentelemetry-proto==1.14.0
opentelemetry-sdk==1.14.0
opentelemetry-semantic-conventions==0.35b0
opentelemetry-util-http==0.35b0
packaging==21.3
pathspec==0.10.3
Pillow==9.4.0
pip-requirements-parser==32.0.1
pip-tools==6.12.1
prometheus-client==0.15.0
protobuf==3.20.3
psutil==5.9.4
pydantic==1.10.4
Pygments==2.14.0
pynvml==11.4.1
pyparsing==3.0.9
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.4
python-multipart==0.0.5
PyYAML==6.0
pyzmq==25.0.0
regex==2022.10.31
requests==2.28.2
rich==13.2.0
schema==0.7.5
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.22.0
tokenizers==0.13.2
tomli==2.0.1
torch==1.13.1+cu117
torchaudio==0.13.1+cu117
torchvision==0.14.1+cu117
tornado==6.2
tqdm==4.64.1
transformers==4.25.1
typing_extensions==4.4.0
urllib3==1.26.14
uvicorn==0.20.0
watchfiles==0.18.1
wcwidth==0.2.6
wrapt==1.14.1
yarl==1.8.2
zipp==3.11.0
Can you try it on the latest version of BentoML? Thanks!