nlpaug
Data augmentation for NLP
Fasttext with NLPAug: AttributeError: 'Word2VecKeyedVectors' object has no attribute 'index_to_key'
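For reference, a minimal sketch of the fasttext path of `WordEmbsAug` that typically triggers this report; the vector file name is a placeholder, and the explanation that the error reflects a gensim version mismatch (`index_to_key` only exists in gensim 4.0+) is an assumption based on the error message:

```python
import nlpaug.augmenter.word as naw

# Sketch: substitute words by fasttext embedding similarity.
# The path is a placeholder; the AttributeError above usually means the
# installed gensim predates the index_to_key API (added in gensim 4.0).
aug = naw.WordEmbsAug(
    model_type='fasttext',
    model_path='wiki-news-300d-1M.vec',  # placeholder .vec file
    action='substitute')
print(aug.augment('The quick brown fox jumps over the lazy dog .'))
```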
Hello, I am trying to apply nlpaug to a dataset; it worked perfectly with BERT/distilBERT, and it is a great way to augment data. However, when I try to use...
Hello! This repository helped me a lot! When using ContextualWordEmbsAug (e.g. with BERT), a word in the input text is replaced with [MASK] and then filled in by the masked language model. And I'm wondering...
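For context on the two reports above, a minimal sketch of contextual word augmentation; the model name and example sentence are placeholders:

```python
import nlpaug.augmenter.word as naw

# Sketch: mask a word and let a masked language model propose a
# replacement ('substitute'); 'insert' adds a new word instead.
aug = naw.ContextualWordEmbsAug(
    model_path='bert-base-uncased', action='substitute')
print(aug.augment('The quick brown fox jumps over the lazy dog .'))
```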
File "/workspace/system-paper/main_dsscc_hil.py", line 61, in main_dsscc_hil main_copy(p) File "/workspace/system-paper/main_copy_args.py", line 230, in main_copy contextual_augment(p, aug_percentage[p['dataset']]) File "/workspace/system-paper/contextual_augmenter.py", line 49, in contextual_augment aug_txt_2 = aug_2.augment(txt) File "/opt/conda/lib/python3.6/site-packages/nlpaug/base_augmenter.py", line 98, in augment...
When I input the following code:
```
import nlpaug.augmenter.word as naw

text = 'The quick brown fox jumps over the lazy dog .'
aug = naw.BackTranslationAug(
    from_model_name='facebook/wmt19.en-de',
    to_model_name='facebook/wmt19.de-en')
aug.augment(text)
```
...
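If the failure occurs at model loading, one possible cause is that the dotted names above do not resolve on the Hugging Face Hub, whereas the hyphenated IDs do exist there. A sketch under that assumption:

```python
import nlpaug.augmenter.word as naw

# Sketch assuming a Hugging Face backed BackTranslationAug; the hyphenated
# model IDs ('facebook/wmt19-en-de', 'facebook/wmt19-de-en') are published
# on the Hub, unlike the dotted names in the snippet above.
text = 'The quick brown fox jumps over the lazy dog .'
aug = naw.BackTranslationAug(
    from_model_name='facebook/wmt19-en-de',
    to_model_name='facebook/wmt19-de-en')
print(aug.augment(text))
```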
I am trying to use a word-embeddings-based augmenter with the following code: `aug = naw.WordEmbsAug(model_type='glove', model_path='/content/glove.6B.50d.txt', action='substitute')` I loaded the GloVe embeddings into the Google Colab working directory using...
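A quick way to rule out a corrupted or mis-formatted download before handing the file to nlpaug; the path matches the snippet above, and the expected layout (a token followed by space-separated floats on each line) is the standard GloVe text format:

```python
# Sanity-check the GloVe file before loading it with WordEmbsAug:
# each line should be a token followed by space-separated floats.
with open('/content/glove.6B.50d.txt', encoding='utf-8') as f:
    first = f.readline().split()
print(first[0], len(first) - 1)  # expect a word and 50 dimensions
```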
Hello. Reproduce:
```python
import nlpaug.augmenter.char as nac

print(nac.KeyboardAug().augment("Hello . Test ? Testing ! And this : Not this + "))
```
This will output the following:
```yaml
output: "Hello. TeZ5?...
```
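If the goal is to keep punctuation and digits out of the simulated typos, the constructor flags below may help; they exist in recent KeyboardAug releases, though their exact effect on standalone punctuation tokens is version-dependent:

```python
import nlpaug.augmenter.char as nac

# Sketch: disallow special characters and digits as replacement candidates.
aug = nac.KeyboardAug(include_special_char=False, include_numeric=False)
print(aug.augment("Hello . Test ? Testing ! And this : Not this + "))
```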
Hi, I encounter the following error when I try to supply a batch to `nas.ContextualWordEmbsForSentenceAug`. After checking another [post](https://github.com/makcedward/nlpaug/issues/146), I expected that supplying a list of batch_size texts would work, but...
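For reference, a sketch of passing a list of texts to the sentence augmenter; it assumes a recent nlpaug release where `augment()` accepts a list (older releases expect a single string), and the model name and texts are placeholders:

```python
import nlpaug.augmenter.sentence as nas

# Sketch: recent nlpaug releases accept a list of texts in augment();
# older releases expect a single string per call.
texts = [
    'The quick brown fox jumps over the lazy dog .',
    'It was a bright cold day in April .',
]
aug = nas.ContextualWordEmbsForSentenceAug(model_path='distilgpt2')
print(aug.augment(texts))
```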
Hi, I tried out the `nas.LambadaAug` augmentation method. I followed the script at `https://github.com/makcedward/nlpaug/blob/master/scripts/train_lambada.sh` for the datasets of my task. The dataset was based on a Twitter corpus collection and...
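For context, a sketch of how the trained LAMBADA model directory is typically plugged back into the augmenter; the directory name, label values, and the `augment(labels, n=...)` call pattern are assumptions based on the repository's examples and may differ across versions:

```python
import nlpaug.augmenter.sentence as nas

# Sketch: point LambadaAug at the output directory of train_lambada.sh and
# request n synthetic sentences per class label. Directory and labels are
# placeholders; check the version's docs for the exact signature.
aug = nas.LambadaAug(model_dir='lambada_model_dir', threshold=0.3, batch_size=4)
print(aug.augment(['LABEL_0', 'LABEL_1'], n=10))
```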