dodrio
Bug when training the model
My datasets version is 1.4.1 and my transformers version is 3.3.1, both the same as yours. But when I run dodrio-data-gen.py to train the model, I get the following error:
Using device: cuda
Using the latest cached version of the module from /home/exp-10086/.cache/huggingface/modules/datasets_modules/datasets/glue/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad (last modified on Sat Dec 17 14:32:47 2022) since it couldn't be found locally at glue/glue.py or remotely (ConnectionError).
Reusing dataset glue (/home/exp-10086/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
- Training Model...
Using the latest cached version of the module from /home/exp-10086/.cache/huggingface/modules/datasets_modules/datasets/glue/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad (last modified on Sat Dec 17 14:32:47 2022) since it couldn't be found locally at glue/glue.py or remotely (ConnectionError).
Reusing dataset glue (/home/exp-10086/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
Traceback (most recent call last):
  File "./data-generation/dodrio-data-gen.py", line 889, in <module>
    dataset_vali = load_dataset('glue', 'sst2', split='train+[3%:6%]')
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/load.py", line 750, in load_dataset
    ds = builder_instance.as_dataset(split=split, ignore_verifications=ignore_verifications, in_memory=keep_in_memory)
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/builder.py", line 738, in as_dataset
    datasets = utils.map_nested(
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/utils/py_utils.py", line 195, in map_nested
    return function(data_struct)
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/builder.py", line 758, in _build_single_dataset
    split = Split(split)
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/splits.py", line 423, in __new__
    return NamedSplit(name)
  File "/home/exp-10086/miniconda3/envs/zebrapose/lib/python3.8/site-packages/datasets/splits.py", line 355, in __init__
    raise ValueError(f"Split name should match '{_split_re}'' but got '{split_name}'.")
ValueError: Split name should match '^\w+(.\w+)*$'' but got ''.
I think this means load_dataset is being passed an invalid split argument. How can I fix this? Thank you.
Alternatively, could you please share a link to your sst2.pt file?
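To check my guess about the split argument, here is a minimal sketch of how I assume the split string gets validated. This is not the datasets source code: the helper names split_names and invalid_names are made up for illustration, and the regex is my reading of the pattern in the error message (assuming the dot is escaped). It only uses the standard library.

```python
import re

# Pattern taken from the error message; the escaped dot is an assumption.
SPLIT_NAME_RE = re.compile(r"^\w+(\.\w+)*$")


def split_names(spec):
    """Return the name of each '+'-separated part, dropping any '[...]' slice."""
    return [part.split("[")[0] for part in spec.split("+")]


def invalid_names(spec):
    """Return the names in spec that do not match the split-name pattern."""
    return [name for name in split_names(spec) if not SPLIT_NAME_RE.match(name)]


print(split_names("train+[3%:6%]"))    # ['train', '']
print(invalid_names("train+[3%:6%]"))  # ['']
print(invalid_names("train[3%:6%]"))   # []
```

Under these assumptions, the part after the '+' has no name in front of the bracket, which would explain the empty string in "but got ''". A string like 'train[3%:6%]' passes this check, but I am not sure whether that is the slice the script is supposed to use.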