
How to limit memory usage?

Open · thaiminhpv opened this issue 8 months ago · 1 comment

When I run:

port=3000
model1=Salesforce/SFR-Embedding-Code-2B_R
volume=$PWD/data
docker run -it --gpus device=0 \
 -v $volume:/app/.cache \
 -p $port:$port \
 michaelf34/infinity:latest \
 v2 \
 --model-id $model1 \
 --port $port \
 --model-warmup \
 --batch-size 4

I get this message:

accelerate.utils.modeling INFO: We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).

How can I set `max_memory` so that, for example, the model uses only 30% of the memory on GPU 0?
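For reference, here is a minimal sketch of the kind of value I mean. It assumes `max_memory` takes the usual accelerate form, a mapping from device index to a size string such as "7372MiB". The helper name `build_max_memory` and the 24 GiB card are hypothetical, and I do not know whether the infinity `v2` CLI exposes a way to pass such a mapping through.

```python
def build_max_memory(total_bytes_per_gpu, fraction):
    """Build a {device_index: "<N>MiB"} mapping capped at `fraction` of each GPU.

    Hypothetical helper: `total_bytes_per_gpu` maps a device index to its total
    memory in bytes, and `fraction` is the share of that memory to allow.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    mib = 1024 ** 2
    return {
        device: f"{int(total * fraction) // mib}MiB"  # round down to whole MiB
        for device, total in total_bytes_per_gpu.items()
    }


# Assumed example: a single 24 GiB card at index 0, limited to 30%.
max_memory = build_max_memory({0: 24 * 1024 ** 3}, fraction=0.3)
print(max_memory)

# On real hardware the total could come from
# torch.cuda.get_device_properties(0).total_memory, and the mapping could be
# passed as `max_memory=` together with `device_map="auto"` when loading a
# model directly with transformers. Neither step is run here.
```

As far as I understand, this only caps how much memory is reserved for the model weights when the device map is computed. Activations and the CUDA context still need room on top of that.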

thaiminhpv, Apr 19 '25 08:04