fix: memory issue when pushing large bentos

Open xianml opened this issue 10 months ago • 6 comments

What does this PR address?

Adds support for limiting the maximum memory usage when pushing models.

bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254  --maxmemory 2
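
The description does not show how the limit is enforced. One common way to cap memory during a large upload is to stream the file in fixed-size chunks instead of reading it whole. The sketch below illustrates that idea only; it is not BentoML's implementation. `chunk_size_for_budget`, `iter_chunks`, and the `in_flight` parameter are hypothetical names, and treating the `--maxmemory` value as GiB is an assumption.

```python
import io
from typing import BinaryIO, Iterator

GIB = 1024 ** 3


def chunk_size_for_budget(max_memory_gib: float, in_flight: int = 4) -> int:
    """Hypothetical helper: split a memory budget across the chunks
    that may be held in memory at the same time."""
    if max_memory_gib <= 0 or in_flight <= 0:
        raise ValueError("budget and in_flight must be positive")
    return max(1, int(max_memory_gib * GIB) // in_flight)


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunks so only one chunk is read at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Usage: a 10-byte payload read in 4-byte chunks.
payload = io.BytesIO(b"x" * 10)
sizes = [len(chunk) for chunk in iter_chunks(payload, 4)]
print(sizes)  # [4, 4, 2]
```

With a budget of 2 and four chunks in flight, each chunk would be 512 MiB under these assumptions. Smaller chunks lower peak memory at the cost of more upload requests.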

Test case 1: pushing bento google--flan-t5-large-service, model size 2.92 GiB

  • no limit
  1. time consumed: 3min 58s
  2. memory usage: ~3 GB
  • maxmemory = 1
  1. time consumed: 4min 25s
  2. memory usage: < 1 GB
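
To put the trade-off in test case 1 in numbers, the timings reported above can be compared directly. This is plain arithmetic on the figures in this description, not an additional measurement.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'Xmin Ys' timing to seconds."""
    return minutes * 60 + seconds


baseline = to_seconds(3, 58)  # no limit
limited = to_seconds(4, 25)   # maxmemory = 1
overhead = (limited - baseline) / baseline

print(f"{limited - baseline}s slower ({overhead:.1%})")  # 27s slower (11.3%)
```

So the limited push takes about 11% longer while peak memory drops from roughly 3 GB to under 1 GB.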

Test case 2: pushing bento google--flan-t5-large-service, model size 12.55 GiB

  • maxmemory = 3
  1. time consumed: 4min 48s
  2. memory usage: peak ~4 GB

Fixes #(issue)


xianml · Sep 25 '23 10:09