transformers

llama model cannot run with accelerate setting

Open TeddLi opened this issue 1 year ago • 2 comments

System Info

transformers version 4.28.0.dev0

Error:

`loading file tokenizer_config.json
loading weights file ./llama1/pytorch_model.bin
Generate config GenerationConfig {
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 1,
  "transformers_version": "4.28.0.dev0"
}
[15:28:22] WARNING Sending process 854275 closing signal SIGTERM api.py:699
           WARNING Sending process 854276 closing signal SIGTERM api.py:699
           WARNING Sending process 854277 closing signal SIGTERM api.py:699
           WARNING Sending process 854279 closing signal SIGTERM api.py:699
           WARNING Sending process 854280 closing signal SIGTERM api.py:699
           WARNING Sending process 854281 closing signal SIGTERM api.py:699
           WARNING Sending process 854282 closing signal SIGTERM api.py:699
[15:28:25] ERROR failed (exitcode: -9) local_rank: 3 (pid: 854278) of binary: /usr/bin/python3`
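For reference when reading the last log line: a negative exit code usually means the process was terminated by a signal rather than exiting on its own. The snippet below is a minimal sketch that decodes such a code, assuming the launcher follows the POSIX/Python `subprocess` convention of reporting `-N` for a child killed by signal `N`. The helper name `describe_exit_code` is hypothetical and is not part of transformers or accelerate.

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Translate a process exit code into a short human-readable description.

    Assumption: a negative value -N means "terminated by signal N", as in
    Python's subprocess.Popen.returncode on POSIX systems.
    """
    if exit_code == 0:
        return "exited normally"
    if exit_code > 0:
        return f"exited with status {exit_code}"
    signal_number = -exit_code
    try:
        # signal.Signals maps a signal number to its symbolic name.
        return f"killed by {signal.Signals(signal_number).name}"
    except ValueError:
        # The number is not a known signal on this platform.
        return f"killed by signal {signal_number}"


# The failing rank in the log above reported exitcode -9.
print(describe_exit_code(-9))
```

On Linux this reports SIGKILL for rank 3. One common source of SIGKILL is the kernel's out-of-memory killer, which could be triggered if all 8 processes load `pytorch_model.bin` into host RAM at the same time, but the log alone does not confirm that this is what happened here.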

Who can help?

No response

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

We tried to train on the Pile using accelerate configured for 8 GPUs.

Expected behavior

I would expect it to load successfully.

TeddLi • Mar 24 '23 20:03

Please follow the template of the issues as there is nothing anyone can do to help with so little information.

sgugger • Mar 24 '23 20:03

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] • Apr 24 '23 15:04