TextFooler
POS filter - why 'NOUN' and 'VERB' can be replaced by each other
I read the source code in criteria.py and found the function pos_filter. However, I don't understand why the case set([ori_pos, new_pos]) <= set(['NOUN', 'VERB']) is treated as same = True. Could anyone explain it? Thank you so much!
def pos_filter(ori_pos, new_pos_list):
    same = [True if ori_pos == new_pos or (set([ori_pos, new_pos]) <= set(['NOUN', 'VERB']))
            else False
            for new_pos in new_pos_list]
    return same
This is a good question. I have to admit that this expression is meant to bypass the filter for nouns and verbs, so that we do not filter out a candidate if both the original and the new POS tags belong to the set {NOUN, VERB}. This is because, at the time of the experiments, I did not have time to carefully design a fine-grained POS filtering rule, and there are several fine-grained subtypes of nouns and verbs. We would need more complex rules to avoid some false negatives. For example, a noun can be replaced by a VBG (a gerund or present participle).
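To make the effect of the relaxed rule concrete, here is a minimal sketch that reproduces the logic of pos_filter as quoted in the question and compares it with a strict variant. Note that strict_pos_filter is a hypothetical helper written only for this comparison; it is not part of the repository, and the tag names in the example are illustrative.

```python
def pos_filter(ori_pos, new_pos_list):
    # Same logic as the function quoted in the question:
    # a candidate passes if its tag matches the original tag,
    # or if both tags fall inside the {NOUN, VERB} set.
    interchangeable = {'NOUN', 'VERB'}
    return [ori_pos == new_pos or {ori_pos, new_pos} <= interchangeable
            for new_pos in new_pos_list]


def strict_pos_filter(ori_pos, new_pos_list):
    # Hypothetical variant for comparison only: requires an exact tag match.
    return [ori_pos == new_pos for new_pos in new_pos_list]


candidates = ['NOUN', 'VERB', 'ADJ']
relaxed = pos_filter('NOUN', candidates)
strict = strict_pos_filter('NOUN', candidates)
print(relaxed)  # [True, True, False]
print(strict)   # [True, False, False]
```

The only difference is the second candidate: the relaxed rule lets a VERB stand in for a NOUN (and vice versa), while every other tag still has to match exactly.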