nanoGPT
Issue with running prepare.py
I received the following error while running python prepare.py:
Traceback (most recent call last):
  File "C:\Users\fresh\AppData\Roaming\Python\Python39\site-packages\datasets\builder.py", line 1570, in _prepare_split_single
    for key, record in generator:
  File "C:\Users\fresh\.cache\huggingface\modules\datasets_modules\datasets\openwebtext\85b3ae7051d2d72e7c5fdf6dfb462603aaa26e9ed506202bf3a24d261c6c40a1\openwebtext.py", line 85, in _generate_examples
    with open(filepath, encoding="utf-8") as f:
  File "C:\Users\fresh\AppData\Roaming\Python\Python39\site-packages\datasets\streaming.py", line 69, in wrapper
    return function(*args, use_auth_token=use_auth_token, **kwargs)
  File "C:\Users\fresh\AppData\Roaming\Python\Python39\site-packages\datasets\download\streaming_download_manager.py", line 445, in xopen
    return open(main_hop, mode, *args, **kwargs)
OSError: [Errno 22] Invalid argument: 'C:\Users\fresh\.cache\huggingface\datasets\downloads\extracted\f03a89c11b1133c3973ac7aed71b6be5c62feb33c5ec06cffb06511974f7194e\001 5896-b1054262f7da52a0518521e29c8e352c.txt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\fresh\Downloads\nanoGPT\data\openwebtext\prepare.py", line 14, in
I ran into similar issues with prepare.py as well. I don't know if this applies to your case, but I solved it by setting num_proc to 1 on line 11. Hope this helps.
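For anyone trying this, the change is a one-line edit. The snippet below is only a sketch: it assumes that line 11 of data/openwebtext/prepare.py is the num_proc assignment, which may not hold for every checkout of the repo.

```python
# Assumption: this replaces the existing num_proc assignment near the top of
# data/openwebtext/prepare.py (line 11 in the commenter's copy).
# A value of 1 makes the dataset preparation use a single worker process.
num_proc = 1
```

A single worker is slower, but it rules out multiprocessing as the source of the failure.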
Unfortunately that didn't help, but I appreciate the suggestion.
I also got this error when running on Windows. I had to turn off Windows Defender and re-download the dataset to get past it.
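Before re-downloading everything, it may be worth checking which extracted files actually fail to open, since the OSError above comes from a plain open() call on one file in the cache. Here is a rough sketch using only the standard library. The helper name find_unopenable_files is made up for this example and is not part of nanoGPT or the datasets library.

```python
import os


def find_unopenable_files(root):
    """Walk `root` and return (path, error) pairs for files that cannot be opened.

    Hypothetical diagnostic helper, not part of nanoGPT or Hugging Face datasets.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Reading a single byte is enough to trigger the same kind of
                # OSError that open() raised in the traceback.
                with open(path, "rb") as f:
                    f.read(1)
            except OSError as exc:
                failures.append((path, exc))
    return failures
```

To use it, pass the extracted-downloads folder from the traceback (on this machine, the `extracted` directory under `C:\Users\fresh\.cache\huggingface\datasets\downloads`) and print the returned pairs. If only a handful of files show up, the extraction was probably interrupted or interfered with, which would fit the Windows Defender report above.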