TextAttack
[WRONG augmentation] pos_list.append(token.annotation_layers["pos"][0]._value) ->KeyError: 'pos'
I ran the following code in Colab:
#pip3 install textattack[tensorflow]
from textattack.augmentation import CLAREAugmenter
augmenter = CLAREAugmenter(pct_words_to_swap=0.2, transformations_per_example=5)
s = "I'd love to go to Japan but the tickets are 500 dollars"
augmenter.augment(s)
and got the following error
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-9-7c7856df0718> in <module>()
3 augmenter = CLAREAugmenter(pct_words_to_swap=0.2, transformations_per_example=5)
4 s = "I'd love to go to Japan but the tickets are 500 dollars"
----> 5 augmenter.augment(s)
6 frames
/usr/local/lib/python3.7/dist-packages/textattack/shared/utils/strings.py in zip_flair_result(pred, tag_type)
235 word_list.append(token.text)
236 if "pos" in tag_type:
--> 237 pos_list.append(token.annotation_layers["pos"][0]._value)
238 elif tag_type == "ner":
239 pos_list.append(token.get_tag("ner"))
KeyError: 'pos'
System Information
Python==3.7.0
Name: textattack
Version: 0.3.8
Summary: A library for generating text adversarial examples
Home-page: https://github.com/QData/textattack
Author: QData Lab at the University of Virginia
Author-email: [email protected]
License: MIT
Location: d:\tools\anaconda3\envs\textattack\lib\site-packages
Requires: bert-score, click, datasets, editdistance, filelock, flair, jieba, language-tool-python, lemminflect, lru-dict, more-itertools, nltk, num2words, numpy, OpenHowNet, pandas, pinyin, pycld2, PySocks, scipy, terminaltables, torch, tqdm, transformers, word2number
In strings.py, change line 237 from "pos_list.append(token.annotation_layers["pos"][0]._value)" to "pos_list.append(token.annotation_layers["upos"][0]._value)", then restart the runtime and it should work.
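If you would rather not hard-code one key, a lookup that tries "pos" first and falls back to "upos" is one option. The sketch below is only an illustration: a plain dict stands in for token.annotation_layers, and first_tag_layer is a hypothetical helper name, not something that exists in TextAttack or flair.

```python
def first_tag_layer(annotation_layers, keys=("pos", "upos")):
    """Return the first annotation layer stored under any of `keys`.

    `annotation_layers` is assumed to behave like a dict that maps a
    layer name to a list of labels (hypothetical stand-in for flair's
    token.annotation_layers).
    """
    for key in keys:
        if key in annotation_layers:
            return annotation_layers[key]
    # Same exception type as the original failure, but with a clearer message.
    raise KeyError(f"none of {keys} found in {sorted(annotation_layers)}")


# Stand-in for a token whose tagger stored its labels under "upos".
layers = {"upos": ["NOUN"]}
print(first_tag_layer(layers)[0])
```

Whether the labels under the two keys belong to the same tag set depends on the tagger model, so the caller may still need to translate the tags afterwards.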
Thanks! The problem was successfully solved in this way.
As for the second issue raised in https://github.com/QData/TextAttack/issues/713#issue-1574652467,
I replaced self._enptb_to_universal with a dictionary that also contains the relevant universal POS tags, in the \textattack\transformations\word_swaps\word_swap_inflections.py file, as in the following code.
self._enptb_to_universal = {
#-----dictionary with new tags---------
"PUNCT": ".",
"CCONJ": "CONJ",
"SCONJ": "CONJ",
"PROPN": "NOUN",
"PART": "PRT",
"AUX": "VERB",
"SYM": "NOUN",
"INTJ": "X",
#----original dictionary below----------
"JJRJR": "ADJ",
"VBN": "VERB",
"VBP": "VERB",
"JJ": "ADJ",
"VBZ": "VERB",
"VBG": "VERB",
"NN": "NOUN",
"VBD": "VERB",
"NP": "NOUN",
"NNP": "NOUN",
"VB": "VERB",
"NNS": "NOUN",
"VP": "VERB",
"TO": "VERB",
"MD": "VERB",
"NNPS": "NOUN",
"JJS": "ADJ",
"JJR": "ADJ",
"RB": "ADJ",
}
(Mapping info: https://github.com/slavpetrov/universal-pos-tags and https://zhuanlan.zhihu.com/p/427520069.)
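To see what the added entries do in isolation, here is a small sketch that applies only the new universal-tag entries from the dictionary above. How TextAttack itself consumes the dictionary is not shown in this thread; the pass-through for tags missing from the table, and the names UPOS_TO_COARSE and to_coarse, are assumptions made for this example.

```python
# The new entries from the dictionary above, copied verbatim.
UPOS_TO_COARSE = {
    "PUNCT": ".",
    "CCONJ": "CONJ",
    "SCONJ": "CONJ",
    "PROPN": "NOUN",
    "PART": "PRT",
    "AUX": "VERB",
    "SYM": "NOUN",
    "INTJ": "X",
}


def to_coarse(tag, table=UPOS_TO_COARSE):
    """Map a tag through `table`; tags not in the table are returned unchanged.

    The pass-through is an assumption for this sketch: tags such as
    "NOUN", "VERB" and "ADJ" already match the coarse names.
    """
    return table.get(tag, tag)


tags = ["PROPN", "AUX", "NOUN", "PUNCT"]
print([to_coarse(t) for t in tags])
```

Note that entries such as "AUX" to "VERB" and "SYM" to "NOUN" merge distinct categories, which may matter to any code that relies on the finer distinction.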
With the code modified in this way, the PSO attack method runs, but its results are poor.