
bug: openllm build creates 3 copies of the model weights in different places

Open jhostetler opened this issue 1 year ago • 1 comment

Describe the bug

When running openllm build with BENTOML_HOME=/foobar (for example):

  1. First, the model weights are downloaded to a directory under $HOME (in my case, under /root because this is running in a Docker container in a Kubernetes pod).
  2. Second, the weights are copied to a directory under /tmp.
  3. Finally, the weights are copied again to a directory under BENTOML_HOME (which is where we wanted them).

I'm guessing at least one of these copies is unnecessary. Ideally, the files would end up under BENTOML_HOME directly without any intermediate copies, but I'm not sure if that's feasible.

In any case, it would be helpful to document that the build process requires enough storage for the full model at all three locations. When building inside a Kubernetes pod, for example, one must mount volumes at both /root and /tmp that are big enough to hold the model; otherwise the build fails with an error saying the pod has exhausted its ephemeral-storage.
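
Until that is documented, a pre-flight check along these lines can catch the problem before the build starts. This is only a sketch: the three locations are the ones observed above, while the helper names and the 15 GiB model size are hypothetical placeholders to adjust for your model. It relies only on the standard library, not on any OpenLLM or BentoML API.

```python
import os
import shutil
import tempfile
from pathlib import Path


def nearest_existing(path):
    """Return `path` if it exists, else its closest existing ancestor."""
    p = Path(path).expanduser().resolve()
    while not p.exists():
        p = p.parent
    return p


def locations_short_on_space(paths, required_bytes):
    """Return the paths whose filesystem cannot hold every copy sent to it.

    Paths on the same filesystem (same st_dev) share free space, so the
    requirement is multiplied by the number of paths on that device.
    """
    by_device = {}
    for path in paths:
        existing = nearest_existing(path)
        by_device.setdefault(os.stat(existing).st_dev, []).append((path, existing))

    short = []
    for entries in by_device.values():
        free = shutil.disk_usage(entries[0][1]).free
        if free < required_bytes * len(entries):
            short.extend(path for path, _ in entries)
    return short


if __name__ == "__main__":
    model_bytes = 15 * 1024**3  # assumed model size; not taken from the report
    candidates = [
        str(Path.home()),                             # initial download location
        tempfile.gettempdir(),                        # intermediate copy
        os.environ.get("BENTOML_HOME", "/somewhere"),  # final destination
    ]
    for path in locations_short_on_space(candidates, model_bytes):
        print(f"not enough free space for the model at {path}")
```

Grouping by device matters in a pod where /root and /tmp both live on the same ephemeral volume, since that volume then needs room for two copies at once.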

To reproduce

Example Python code:

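import os
import subprocess

cmd = ["openllm", "build", "falcon", "--model-id", "tiiuae/falcon-7b"]
# Copy the current environment and point BENTOML_HOME at the target directory.
env = os.environ.copy()
env["BENTOML_HOME"] = "/somewhere"
# check=True raises CalledProcessError if the build exits with a non-zero status.
subprocess.run(cmd, env=env, check=True)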

I monitored disk usage with a background process that ran the following shell command every second:

for d in /*; do du -sh "$d"; done

Logs

No response

Environment

bentoml: 1.1.6

System information (Optional)

Running inside Docker container in Kubernetes pod

jhostetler, Sep 25 '23 03:09

This is intended, as we want the BentoML build to be atomic. I will probably transfer this issue to BentoML so we can track it there.
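
For readers unfamiliar with the idea, the general pattern behind an atomic build looks roughly like the sketch below. It is a generic illustration, not BentoML's actual implementation, and `atomic_build` and `populate` are hypothetical names: the output is assembled in a staging directory and only renamed into place once it is complete, so a failed build never leaves a half-written result at the final path.

```python
import os
import shutil
import tempfile


def atomic_build(final_dir, populate):
    """Run `populate(staging_dir)`, then publish the staging directory as `final_dir`.

    Staging is created next to final_dir so the rename stays on one
    filesystem; if populate raises, final_dir is never created.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    os.makedirs(parent, exist_ok=True)
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        populate(staging)
        # A rename within one filesystem is atomic on POSIX systems.
        os.rename(staging, final_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final_dir
```

Note that this sketch stages next to the destination. When the staging directory is on a different filesystem instead (for example under /tmp), a rename is not possible and the files have to be copied across, which is one way an extra copy of the weights can arise.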

aarnphm, Nov 07 '23 22:11