eda_nlp
ValueError: empty range for randrange() (0,0, 0)
I have processed the data according to the data format you described. Here are the command I ran and the resulting error:
python code/augment.py --input=train_50w.en --output=train_50w._augmented.txt --num_aug=1 --alpha_sr=0.05 --alpha_rd=0.05 --alpha_ri=0 --alpha_rs=0.05
Traceback (most recent call last):
  File "code/augment.py", line 75, in <module>
    gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
  File "code/augment.py", line 64, in gen_eda
    aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
  File "/home/tool/eda_nlp-master/code/eda.py", line 201, in eda
    a_words = random_swap(words, n_rs)
  File "/home/tool/eda_nlp-master/code/eda.py", line 130, in random_swap
    new_words = swap_word(new_words)
  File "/home/tool/eda_nlp-master/code/eda.py", line 134, in swap_word
    random_idx_1 = random.randint(0, len(new_words)-1)
  File "/home/miniconda3/envs/eda/lib/python3.6/random.py", line 221, in randint
    return self.randrange(a, b+1)
  File "/home/miniconda3/envs/eda/lib/python3.6/random.py", line 199, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,0, 0)
Same issue here:
python code/augment.py --input=data/train_original.txt --num_aug=15 --alpha_sr=0.1
Traceback (most recent call last):
  File "code/augment.py", line 75, in <module>
    gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
  File "code/augment.py", line 64, in gen_eda
    aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 194, in eda
    a_words = random_insertion(words, n_ri)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 153, in random_insertion
    add_word(new_words)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 160, in add_word
    random_word = new_words[random.randint(0, len(new_words)-1)]
  File "/Users/bernardogarcia/opt/anaconda3/envs/nlp-news_filter/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/Users/bernardogarcia/opt/anaconda3/envs/nlp-news_filter/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,0, 0)
In code/eda.py, the main function eda begins with the following call at line 175:
sentence = get_only_chars(sentence)
The get_only_chars function preprocesses the text by removing non-alphabetic characters. If the input consists only of non-alphabetic characters, nothing is left after this step, so the word list is empty and len(words)-1 becomes -1 at line 117 (and in the similar calls elsewhere in the file). random.randint then raises the error:
rand_int = random.randint(0, len(words)-1)
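The failure can be reproduced without the repository. The following is a minimal sketch, not the project's code: `safe_random_index` is a hypothetical helper (it does not exist in eda.py) that shows one way to guard the call when the word list is empty.

```python
import random


def safe_random_index(words):
    """Return a random valid index into words, or None if words is empty.

    Hypothetical helper for illustration; not part of eda.py.
    """
    if not words:
        return None
    return random.randint(0, len(words) - 1)


# An empty word list reproduces the reported failure:
# len([]) - 1 is -1, and random.randint(0, -1) has no valid values to pick from.
try:
    random.randint(0, len([]) - 1)
    raised = False
except ValueError:
    raised = True
```

A caller using such a guard would skip the swap, insertion, or deletion step whenever the helper returns None, rather than letting the exception stop the whole run.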
If your data contains text that consists only of non-alphabetic characters, you can avoid this error by modifying the character check in the get_only_chars function at line 45 so that digits and some punctuation are kept instead of removed, as follows:
if char in 'qwertyuiopasdfghjklzxcvbnm0123456789(). ':
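To see what widening the allowed character set changes, here is a simplified sketch. `filter_chars` and `to_words` are hypothetical stand-ins for the filtering step, not the actual get_only_chars, which does more cleanup. The extended set below is an assumption in the spirit of the line above, with every digit 0-9 included.

```python
import re

# Letters and space only, versus a set extended with digits and some punctuation.
ORIGINAL_ALLOWED = 'qwertyuiopasdfghjklzxcvbnm '
EXTENDED_ALLOWED = 'qwertyuiopasdfghjklzxcvbnm0123456789(). '


def filter_chars(line, allowed):
    """Lowercase the line and replace every character outside `allowed` with a space."""
    cleaned = ''.join(ch if ch in allowed else ' ' for ch in line.lower())
    # Collapse runs of spaces and trim the ends.
    return re.sub(' +', ' ', cleaned).strip()


def to_words(line, allowed):
    """Split the filtered line into words, dropping empty strings."""
    return [word for word in filter_chars(line, allowed).split(' ') if word != '']


digits_before = to_words('2019 (2020)', ORIGINAL_ALLOWED)  # no letters survive
digits_after = to_words('2019 (2020)', EXTENDED_ALLOWED)   # digits and parentheses kept
dashes_after = to_words('------', EXTENDED_ALLOWED)        # still nothing left
```

Note that a line such as "------" still produces an empty word list under the extended set, so widening the set reduces how often the error occurs but does not rule it out.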
I have the same problem. Most likely it is a problem with your data: the error occurs if a sentence is empty or consists only of a special token such as "------". Check your data.
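One way to check the data before running the augmentation is sketched below. It assumes each line is a label and a sentence separated by a tab, and that only lowercase letters count as content. `find_empty_sentences` is a hypothetical helper, not part of the repository.

```python
def find_empty_sentences(lines, letters='abcdefghijklmnopqrstuvwxyz'):
    """Return (line_number, raw_line) pairs whose sentence column contains no letters.

    Assumes tab-separated "label<TAB>sentence" lines; line numbers start at 1.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        stripped = raw.rstrip('\n')
        parts = stripped.split('\t')
        # A line without a second column is treated as having an empty sentence.
        sentence = parts[1] if len(parts) > 1 else ''
        if not any(ch in letters for ch in sentence.lower()):
            bad.append((number, stripped))
    return bad


sample = ['1\tgood movie\n', '0\t------\n', '1\t\n', '0\t2019\n']
problems = find_empty_sentences(sample)
```

Lines reported by such a check can be removed or repaired before the file is passed to augment.py.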
I also have this problem. Could it be related to the dataset? My dataset has three columns, but the dataset in this repository has two. Would you please help me?
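If the script reads the label from the first tab-separated field and the sentence from the second (an assumption here, based on the two-column data mentioned above), a three-column file could pass the wrong field as the sentence. If that field is, say, a numeric id, filtering would leave no words. A minimal sketch of reducing such a file to two columns follows. `select_columns` and the column positions are hypothetical and depend on your own data layout.

```python
def select_columns(line, label_index, text_index, sep='\t'):
    """Build a "label<TAB>sentence" line from a line with more columns.

    label_index and text_index are zero-based positions chosen by the user.
    """
    fields = line.rstrip('\n').split(sep)
    return fields[label_index] + '\t' + fields[text_index]


# Example layout (assumed): id, label, sentence.
converted = select_columns('id7\t1\tgreat film\n', label_index=1, text_index=2)
```

Applying this to every line and writing the results to a new file would give input in the two-column shape.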