
Unable to Infer on Bloom Model-2b5 using Deepspeed

Open Ravisankar13 opened this issue 3 years ago • 2 comments

System Info

I was able to load the Bloom-2b5 model in my Colab notebook for text generation (inference). However, when I use DeepSpeed to load the model and try to run inference, there is not enough memory. I don't understand this, because with the help of DeepSpeed I should be able to load a larger model, or at least the same model.
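As a rough sanity check on the memory involved, the sketch below estimates the space needed for the model weights alone at different precisions. It assumes about 2.5 billion parameters (inferred from the "2b5" in the model name, so the exact count may differ) and ignores activations, the attention cache, and framework overhead, all of which add to the real footprint. The function name `weight_memory_gib` is made up for this illustration.

```python
# Back-of-the-envelope weight memory for a ~2.5B-parameter model.
# ASSUMPTION: the parameter count is inferred from the model name "2b5".
PARAMS = 2.5e9

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights take roughly 9.3 GiB in fp32 and 4.7 GiB in fp16, so the precision the model is loaded in can decide whether it fits on a single Colab GPU.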

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

  1. Load the 2b5 model- https://huggingface.co/bigscience/bloom-2b5
  2. Infer with and without deepspeed - https://huggingface.co/docs/transformers/main_classes/deepspeed

Expected behavior

CUDA error: insufficient memory

Ravisankar13, Aug 03 '22 13:08

Hey @Ravisankar13, could you please provide the script you have used, the deepspeed config, as well as the full stacktrace? It will be hard to help you with so little information.

cc @stas00
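For readers unsure what kind of DeepSpeed config is being asked for, here is a minimal sketch of one possible setup: ZeRO stage 3 with parameters offloaded to CPU. This is not the reporter's config, which was never posted. The values are illustrative assumptions, and the helper `dump_config` is made up for this example.

```python
import json

# Hypothetical DeepSpeed config: ZeRO stage 3 with CPU parameter offload.
# ASSUMPTION: all values are illustrative; this is NOT the reporter's config.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        # Keep parameters in CPU memory and move them to the GPU as needed.
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
}


def dump_config(config: dict) -> str:
    """Serialize the config to JSON so it can be saved as a config file."""
    return json.dumps(config, indent=2)


print(dump_config(ds_config))
```

Posting the config in this form, together with the script and the full stack trace, makes it possible to see whether any offloading or partitioning was enabled at all.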

LysandreJik, Aug 09 '22 07:08

@Ravisankar13, please see the work-in-progress here: https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/308

You have a variety of different working solutions there.

We will soon move those here.

stas00, Aug 09 '22 16:08

Thanks for your response. Let me try them and get back to you.

Ravisankar13, Aug 16 '22 12:08

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot], Sep 09 '22 15:09