
Bus error in parallelformers 1.2.7 for OPT model

Open • sindhuvahinis opened this issue 2 years ago • 1 comment

How to reproduce

from transformers import AutoModelForCausalLM, AutoTokenizer

if __name__ == '__main__':
    model_name = 'facebook/opt-30b'
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    from parallelformers import parallelize
    
    parallelize(model, num_gpus=8, fp16=True)

The following error was thrown by the parallelize call:

Bus error (core dumped)

With parallelformers 1.2.6 and transformers 4.21.11, this error was not thrown. It only occurs with parallelformers 1.2.7 and transformers 4.21.11.

Environment

  • OS : Ubuntu
  • Python version : 3.8.13
  • Transformers version : 4.21.11
  • Parallelformers version : 1.2.6
  • Whether to use Docker: yes
  • Misc.:
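
For anyone trying to reproduce this, the environment details above can be collected from inside the container with a short script. This is only a minimal sketch: the helper name collect_environment is hypothetical, and the /dev/shm check rests on an assumption (bus errors inside Docker are sometimes associated with a small shared-memory mount) that this report does not confirm.

```python
import os
import shutil
from importlib import metadata

# Packages named in this report; "torch" is the PyPI name for PyTorch.
PACKAGES = ("parallelformers", "transformers", "torch")


def collect_environment(packages=PACKAGES, shm_path="/dev/shm"):
    """Return installed package versions and the shared-memory mount size.

    Hypothetical helper for illustration; not part of parallelformers.
    """
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"

    # Assumption: the size of /dev/shm may be relevant when running in Docker.
    if os.path.isdir(shm_path):
        total_bytes = shutil.disk_usage(shm_path).total
        report["shm_total_gib"] = round(total_bytes / 2**30, 2)
    else:
        report["shm_total_gib"] = None
    return report


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running it under both parallelformers 1.2.6 and 1.2.7 in the same container would help confirm that the library version is the only difference between the working and failing setups.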

sindhuvahinis, Aug 24 '22 19:08

Same problem under:

  • parallelformers 1.2.7
  • transformers 4.24.0
  • python 3.8.16
  • pytorch 2.0.0
  • pytorch-cuda 11.7

The model works on 2 GPUs without parallelformers. I am trying to use more than 2 GPUs with parallelformers.

agabaldon, May 11 '23 16:05