
Fast Inference Solutions for BLOOM

24 transformers-bloom-inference issues

I ran the script `deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8` and it gets stuck, as shown in the picture. Log: ``` (base) raihanafiandi@instance-1:~/playground/transformers-bloom-inference$ deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name...

Hi, I am using the DeepSpeed framework to speed up inference for BLOOM-7B1, as shown below: `deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloom-7b1` But instead I got the following...

Adds some minor config changes to support int4 inference through DeepSpeed-Inference. Int4 support will be added to DeepSpeed through this [PR](https://github.com/microsoft/DeepSpeed/pull/2526). cc: @stas00

Hey! I'm using a custom version of this repo to run BLOOM-175B with DeepSpeed, and it works great — thank you for this! I was thinking of exploring using large models...