codellama Please help: failed to create process

Hi,

Apologies if the solution is obvious, but I'm new to this. When I run the example infilling script:

torchrun --nproc_per_node 1 example_infilling.py --ckpt_dir CodeLlama-7b/ --tokenizer_path CodeLlama-7b/tokenizer.model --max_seq_len 192 --max_batch_size 4

I get the following error with no additional details: failed to create process

In fact, anything I run with torchrun returns the same error.

I tried:

Different versions of CUDA, as well as CPU-only PyTorch
Checking that tokenizer_path is correct and that nproc_per_node is set to the right MP value
Using python -m torch.distributed.run instead of torchrun, as a comment on another post suggested. I get a different error when I do this (happy to give more info)
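
For anyone following along, the module-based launch mentioned in the last item can be sketched as below. This is only an illustration, not a confirmed fix: the helper name build_module_command is hypothetical, and the script name and arguments are copied from the command in this post. It builds the argument list around sys.executable, so the launch goes through whichever Python interpreter is currently running rather than through the torchrun launcher.

```python
import shlex
import sys


def build_module_command(script, script_args, nproc_per_node=1):
    """Return an argv list that launches `script` via `python -m torch.distributed.run`.

    `build_module_command` is a hypothetical helper for illustration only.
    """
    return [
        sys.executable, "-m", "torch.distributed.run",
        "--nproc_per_node", str(nproc_per_node),
        script, *script_args,
    ]


if __name__ == "__main__":
    cmd = build_module_command(
        "example_infilling.py",
        [
            "--ckpt_dir", "CodeLlama-7b/",
            "--tokenizer_path", "CodeLlama-7b/tokenizer.model",
            "--max_seq_len", "192",
            "--max_batch_size", "4",
        ],
    )
    # shlex.join is used for display only; nothing is executed here.
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) to actually start the job, which should also surface the full traceback of whatever different error the module form produces.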

Any help would be greatly appreciated!

Oct 17 '23 16:10 dv347

I'm facing the same issue. I tried with Python 3.8 and 3.7. I'm just an enthusiast trying to play with AI; I have no experience with Python, but I would love a tip on how to solve this. I tried researching it and even asked ChatGPT, but no success so far.

I created my Conda env with the following command: conda create --name llama-code-p37 python=3.8 pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge. Am I missing something?

Oct 21 '23 11:10 antonioanerao

You need Python 3.10. I use Python 3.12; it has to be at least 3.10. Try that and see whether it solves the problem.
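
A quick way to check which Python version an environment is actually using is sketched below. The 3.10 minimum is an assumption taken from this reply, not from any official requirement, and the function name python_is_new_enough is hypothetical.

```python
import sys

# Assumption: the minimum comes from this reply, not from official documentation.
MIN_VERSION = (3, 10)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) part of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_new_enough():
        print(f"Python {current} meets the assumed minimum.")
    else:
        print(f"Python {current} is older than the assumed minimum of 3.10.")
```

Run it with the same interpreter that the Conda env uses, since a different Python on the PATH would report a different version.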

Feb 04 '24 08:02 allelive

Solved thanks to Stack Overflow! You can try this:

https://stackoverflow.com/questions/77425569/llama2-running-pytorch-produces-a-failed-to-create-process

Jun 21 '24 06:06 hiehie1234