llm-foundry

Error when testing the Dataloader

Open sysusicily opened this issue 1 year ago • 2 comments

When I run "python ../../llmfoundry/data/text_data.py --local_path ./my-copy-c4 --split val_small", I get the following error:

(mpt-env) root@autodl-container-645911b4fa-161dd8b6:~/autodl-tmp/llm-foundry/scripts/train# python ../../llmfoundry/data/text_data.py --local_path ./my-copy-c4 --split val_small

Traceback (most recent call last):
  File "../../llmfoundry/data/text_data.py", line 308, in <module>
    from llmfoundry.utils.builders import build_tokenizer
  File "/root/autodl-tmp/llm-foundry/llmfoundry/__init__.py", line 8, in <module>
    from llmfoundry.data import (ConcatTokensDataset,
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/__init__.py", line 4, in <module>
    from llmfoundry.data.datasets import ConcatTokensDataset, NoConcatDataset
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/datasets.py", line 9, in <module>
    import datasets as hf_datasets
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/datasets.py", line 15, in <module>
    class NoConcatDataset(IterableDataset):
  File "/root/autodl-tmp/llm-foundry/llmfoundry/data/datasets.py", line 22, in NoConcatDataset
    hf_datasets.Dataset]):
AttributeError: partially initialized module 'datasets' has no attribute 'Dataset' (most likely due to a circular import)

sysusicily avatar May 15 '23 07:05 sysusicily

I ran into this as well. You can work around it by removing the type hints, though you'll probably run into more bugs after that. Let me know how that goes.
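
For anyone wondering why the hints matter here: the traceback shows datasets.py importing `datasets` at line 9 and landing back in datasets.py, which is what can happen when a script is run by path and its own directory goes first on `sys.path`. Below is a minimal sketch of that mechanism, not the fix from the repo. The module name `shadowlib`, the file `script.py`, the helper `run_script`, and the `Optional[...]` alias are all hypothetical stand-ins made up for illustration; the alias is used so the attribute lookup happens at import time on any Python version.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for a file named like the library it wants to import:
# `import shadowlib` finds this very file instead of a third-party package.
WITH_HINT = textwrap.dedent("""\
    from typing import Optional
    import shadowlib as lib  # meant to be an external library
    # Evaluated at import time, while `lib` is still half-initialized.
    DatasetLike = Optional[lib.Dataset]
""")

# Same file with the hint removed: nothing touches lib.Dataset during import.
WITHOUT_HINT = textwrap.dedent("""\
    import shadowlib as lib  # still resolves to this file
    print(lib.__file__)
""")

SCRIPT = "import shadowlib\n"


def run_script(module_source: str) -> subprocess.CompletedProcess:
    """Write shadowlib.py next to script.py, then run script.py by path."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "shadowlib.py").write_text(module_source)
        script = Path(tmp, "script.py")
        script.write_text(SCRIPT)
        # Running a script by path puts its directory first on sys.path,
        # so the sibling shadowlib.py wins the import lookup.
        return subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
        )


with_hint = run_script(WITH_HINT)
without_hint = run_script(WITHOUT_HINT)

# Last line of the failing run's traceback (an AttributeError on `Dataset`).
print(with_hint.stderr.strip().splitlines()[-1])
# Path that `lib` actually points to once the hint is gone.
print(without_hint.stdout.strip())
```

In this sketch the second run only succeeds because nothing reads `lib.Dataset` during import; `lib` is still the local file rather than the intended library, which would be consistent with hitting further errors once the hints are gone.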

tginart avatar May 15 '23 07:05 tginart

I'll take a look at this. Thanks for bringing it to our attention.

codestar12 avatar May 15 '23 18:05 codestar12

Hello, we pushed a fix in #175. Closing for now; please re-open if you run into the same issue again!

hanlint avatar May 19 '23 19:05 hanlint