BentoML fix: memory issue when push large bentos

Open xianml opened this issue 10 months ago • 6 comments

Adds support for limiting the maximum memory usage when pushing models, via a --maxmemory option:

bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254  --maxmemory 2
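
The general idea behind capping memory during an upload can be sketched as follows. This is a minimal illustration, not the code in this PR: `iter_chunks`, `push_in_chunks`, `max_chunk_bytes`, and the `send` callback are hypothetical names, and the unit of `--maxmemory` is not stated here, so the sketch simply takes a byte count. It reads the archive in fixed-size chunks so that only one chunk is held in memory at a time.

```python
import io
from typing import BinaryIO, Callable, Iterator


def iter_chunks(stream: BinaryIO, max_chunk_bytes: int) -> Iterator[bytes]:
    """Yield successive chunks of at most max_chunk_bytes from stream."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    while True:
        chunk = stream.read(max_chunk_bytes)
        if not chunk:  # empty bytes means end of stream
            break
        yield chunk


def push_in_chunks(
    stream: BinaryIO,
    max_chunk_bytes: int,
    send: Callable[[bytes], None],
) -> int:
    """Hypothetical uploader: pass each chunk to send, return total bytes sent."""
    total = 0
    for chunk in iter_chunks(stream, max_chunk_bytes):
        send(chunk)  # stand-in for an upload call; not a BentoML API
        total += len(chunk)
    return total


if __name__ == "__main__":
    received = []
    sent = push_in_chunks(io.BytesIO(b"x" * 10), 4, received.append)
    print(sent, [len(c) for c in received])
```

Because each chunk is released before the next one is read, peak memory stays near `max_chunk_bytes` regardless of the total size, provided `send` does not keep the chunks (the demo's `received.append` does, only to make the sizes visible).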

Test case 1: pushing bento google--flan-t5-large-service, model size 2.92 GiB

Test case 2: pushing bento google--flan-t5-large-service, model size 12.55 GiB

Fixes #(issue)

[x] Does the Pull Request follow the Conventional Commits specification for naming? Here is GitHub's guide on how to create a pull request.
[x] Does the code follow BentoML's code style, and has the pre-commit run -a script passed (instructions)?
[x] Did you read through the contribution guidelines and follow the development guidelines?
[ ] Did your changes require updates to the documentation? Have you updated those accordingly? Here are the documentation guidelines and tips on writing docs.
[ ] Did you write tests to cover your changes?

Sep 25 '23 10:09 xianml