
the error of streaming

Open sysusicily opened this issue 2 years ago • 0 comments

I followed the example in the readme file but encountered an error. How can I solve it?

(mpt) root@autodl-container-b369119e00-b5dabb5c:~/autodl-tmp/llm-foundry/scripts/train# composer train.py yamls/pretrain/mpt-125m.yaml train_loader.dataset.split=train_small eval_loader.dataset.split=val_small

Initializing model...
cfg.n_params=1.25e+08
Building train loader...
Traceback (most recent call last):
  File "/root/autodl-tmp/llm-foundry/scripts/train/train.py", line 254, in <module>
    main(cfg)
  File "/root/autodl-tmp/llm-foundry/scripts/train/train.py", line 150, in main
    train_loader = build_dataloader(
  File "/root/autodl-tmp/llm-foundry/scripts/train/train.py", line 72, in build_dataloader
    return build_text_dataloader(
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/text_data.py", line 253, in build_text_dataloader
    dataset = StreamingTextDataset(
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/text_data.py", line 110, in __init__
    super().__init__(
  File "/root/miniconda3/envs/mpt/lib/python3.10/site-packages/streaming/base/dataset.py", line 325, in __init__
    self._shm_prefix, self._locals_shm = get_shm_prefix(my_locals, world)
  File "/root/miniconda3/envs/mpt/lib/python3.10/site-packages/streaming/base/shared.py", line 340, in get_shm_prefix
    raise ValueError(f'Reused local directory: {sorted(my_locals_set)} vs ' +
ValueError: Reused local directory: ['/root/autodl-tmp/llm-foundry/scripts/train/my-copy-c4/train_small'] vs ['/root/autodl-tmp/llm-foundry/scripts/train/my-copy-c4/train_small']. Provide a different one.
ERROR:composer.cli.launcher:Rank 0 crashed with exit code 1.
Waiting up to 30 seconds for all training processes to terminate. Press Ctrl-C to exit immediately.
Global rank 0 (PID 7309) exited with code 1
ERROR:composer.cli.launcher:Global rank 0 (PID 7309) exited with code 1

sysusicily • May 16 '23 07:05