
POS filter - why 'NOUN' and 'VERB' can be replaced by each other

Open sharon-gao opened this issue 4 years ago • 1 comment

I read the source code in criteria.py and found the function pos_filter. However, I don't understand why it is set up so that set([ori_pos, new_pos]) <= set(['NOUN', 'VERB']) counts as same = True. Could anyone explain it? Thank you so much!

    def pos_filter(ori_pos, new_pos_list):
        same = [True if ori_pos == new_pos or (set([ori_pos, new_pos]) <= set(['NOUN', 'VERB']))
                else False
                for new_pos in new_pos_list]
        return same

sharon-gao, Sep 13 '20 01:09

This is a good question. I have to admit that this expression bypasses the filter for nouns and verbs: we do not filter out a candidate if both the original and the new POS belong to the noun and verb sets. This is because, at the time of the experiments, I did not have time to carefully design a fine-grained POS filtering rule, and there are several fine-grained subtypes of noun and verb. We would need more complex rules to avoid some false negative examples. For example, a noun can be replaced by a VBG.

jind11, Sep 13 '20 17:09
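
For readers following the thread, here is a minimal sketch of what the quoted pos_filter does in practice, together with one possible shape for the stricter rule mentioned in the reply. The first function is behaviorally the same as the one quoted in the question. The second function, fine_pos_filter, and its ALLOWED_CROSS table are hypothetical and not part of TextFooler; they assume Penn Treebank style fine-grained tags and encode only the noun/VBG example given above.

```python
def pos_filter(ori_pos, new_pos_list):
    """Coarse filter as quoted in the question.

    Returns one boolean per candidate; True means the candidate is kept.
    A candidate is kept if its POS equals the original POS, or if both
    the original and the candidate POS are in {'NOUN', 'VERB'}.
    """
    interchangeable = {'NOUN', 'VERB'}
    return [ori_pos == new_pos or {ori_pos, new_pos} <= interchangeable
            for new_pos in new_pos_list]


# Hypothetical fine-grained rule (an assumption, not the author's code):
# only explicitly listed cross-category tag pairs are allowed.
ALLOWED_CROSS = {('NN', 'VBG'), ('VBG', 'NN')}


def fine_pos_filter(ori_tag, new_tag_list, allowed_cross=ALLOWED_CROSS):
    """Keep a candidate if the tags match or the pair is whitelisted."""
    return [ori_tag == new_tag or (ori_tag, new_tag) in allowed_cross
            for new_tag in new_tag_list]


if __name__ == '__main__':
    # NOUN and VERB are interchangeable; ADJ is not.
    print(pos_filter('NOUN', ['NOUN', 'VERB', 'ADJ']))   # [True, True, False]
    # Outside NOUN/VERB, only an exact match passes.
    print(pos_filter('ADJ', ['NOUN', 'ADJ']))            # [False, True]
    # The stricter sketch accepts NN -> VBG but rejects NN -> VBD.
    print(fine_pos_filter('NN', ['NN', 'VBG', 'VBD']))   # [True, True, False]
```

The whitelist approach makes each permitted cross-category substitution explicit, at the cost of having to enumerate the pairs by hand, which is the design effort the reply says was skipped at the time of the experiments.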