BentoML icon indicating copy to clipboard operation
BentoML copied to clipboard

fix: memory issue when push large bentos

Open xianml opened this issue 1 year ago • 6 comments

What does this PR address?

supporting limit max memory usage when pushing models

image

bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254  --maxmemory 2

Test case 1: pushing bento google--flan-t5-large-service, model size 2.92 GiB

  • no limit
  1. time consumed: 3min 58s
  2. memory usage: ~ 3GB
  • maxmemory = 1
  1. time consumed:4min 25s
  2. memory usage: <1G

Test case 2: pushing bento google--flan-t5-large-service, model size 12.55 GiB

  • maxmemory = 3 image
  1. time consumed:4min 48s
  2. memory usage: max ~ 4G

Fixes #(issue)

Before submitting:

xianml avatar Sep 25 '23 10:09 xianml

Hello @xianml! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2023-09-26 02:14:35 UTC

pep8speaks avatar Sep 25 '23 10:09 pep8speaks

What are the bugs within the current implementation?

jianshen92 avatar Sep 29 '23 06:09 jianshen92

What are the bugs within the current implementation?

context in https://bentoml-team.slack.com/archives/C02QLC8RB5W/p1695088745009929

as a summary, currently it will take too much memory when do a bentoml push since it use io.BytesIO. So this fix is to use SpooledTemporaryFile instead to cap the memory usage

xianml avatar Oct 18 '23 09:10 xianml

Looks like unit tests are failing, maybe because of the requests change...?

Should we just use a SpooledTemporaryFile for everything?

sauyon avatar Oct 20 '23 14:10 sauyon

@sauyon you need to change the tests patch requests to httpx. Let's open a separately PR to fix the test?

aarnphm avatar Oct 20 '23 17:10 aarnphm

Looks like unit tests are failing, maybe because of the requests change...?

Should we just use a SpooledTemporaryFile for everything?

  • checked the ut failed logs, seems our test case is out of date
  • now, i just use SpooledTemporaryFile for pushing models. Shall we replaced it one by one ? i am not very confident with a big code change.

xianml avatar Oct 24 '23 02:10 xianml