transformers-bloom-inference
Fast Inference Solutions for BLOOM
I ran this script `deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8` and it gets stuck, just like in the picture. Log: ``` (base) raihanafiandi@instance-1:~/playground/transformers-bloom-inference$ deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name...
Hi, I am using the DeepSpeed framework to speed up inference of BLOOM-7B1, as shown below: `deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloom-7b1` But instead I got the following...
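For context, both reports above exercise the same code path in `bloom-inference-scripts/bloom-ds-inference.py`. The following is a minimal sketch of that setup, not the script itself: argument parsing, batching, and checkpoint preloading are omitted, and it assumes a DeepSpeed version with kernel injection support.

```python
# Minimal sketch of the DeepSpeed-Inference setup used by
# bloom-inference-scripts/bloom-ds-inference.py (simplified).
import os

import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# The deepspeed launcher (`deepspeed --num_gpus N ...`) spawns one process per
# GPU and sets WORLD_SIZE/LOCAL_RANK; init_inference shards the model across
# those processes and injects fused inference kernels.
ds_engine = deepspeed.init_inference(
    model,
    mp_size=int(os.getenv("WORLD_SIZE", "1")),
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)
model = ds_engine.module

inputs = tokenizer("DeepSpeed is", return_tensors="pt").to(
    torch.cuda.current_device()
)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Run it through the launcher rather than plain `python`, e.g. `deepspeed --num_gpus 1 script.py`, so that the distributed environment variables are set.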
Adds minor config changes to support int4 inference through DeepSpeed-Inference. Int4 support will be added to DeepSpeed through this [PR](https://github.com/microsoft/DeepSpeed/pull/2526). cc: @stas00
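For reference, the quantized path that DeepSpeed-Inference already exposes is selected via the `dtype` argument to `init_inference`; the int4 config changes would slot into the same call site. Below is a minimal sketch of the existing int8 pattern only; the int4 flag itself is not shown, since its final form depends on the linked PR.

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-7b1")

# Existing quantized inference path: DeepSpeed replaces the transformer
# layers with int8 kernels. Per the PR referenced above, int4 is expected
# to be enabled through a similar config change at this call site.
ds_engine = deepspeed.init_inference(
    model,
    mp_size=1,
    dtype=torch.int8,
    replace_with_kernel_inject=True,
)
model = ds_engine.module
```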
Hey! I'm using a custom version of this repo to run BLOOM-175B with DeepSpeed and it works great, thank you for this! I was thinking of exploring using large models...