LexiconNER
Lexicon-based Named Entity Recognition
Hello, thanks for sharing this code. It is extremely easy to use and very readable. I wanted to know if there are some practical considerations for deciding upon the...
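Hello! I have obtained the original OntoNotes 4.0 dataset, but I don't know how to process it; the only tutorials online cover 5.0. Could you share the preprocessing pipeline for the 4.0 dataset?
Hello author, I tried to reproduce the PU algorithm from this paper on the Chinese dataset ResumeNER. By repeatedly tuning the loss weight parameter and the positive-example ratio parameter I got one entity type to work, but I could not reproduce the other types. Choosing these two parameters also feels like guesswork, so I would like to ask what the principle behind setting them is, and whether you have tried this on Chinese datasets. Many thanks!
How do I handle Chinese?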
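+ Hello author, I have a question that I hope you can answer.
+ For English, each word can be split into individual letters for the embedding. For Chinese, is word segmentation required? Segmentation boundaries may well be wrong.
+ Another point is adjacent entities: how do you handle them? For example, 北京上海都是一个国际大都市 ("Beijing and Shanghai are both international metropolises") produces [1,1,1,1,0,0,0,0,0,0,0,0]. Does the current approach simply treat 北京上海 as a single entity?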
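The adjacent-entity concern raised above can be made concrete with a small sketch. It assumes character-level binary labels (1 = inside an entity, 0 = outside), as in the example vector from the question, and contrasts them with a BIO scheme. The helper names `binary_spans` and `bio_spans` are hypothetical and are not taken from the repository, and nothing here claims to describe how the repository actually decodes its output.

```python
# Hypothetical helpers for illustration only; not part of LexiconNER.

def binary_spans(labels):
    """Return (start, end) spans of consecutive 1s in a 0/1 label list."""
    spans, start = [], None
    for i, lab in enumerate(labels):
        if lab == 1 and start is None:
            start = i                      # a run of 1s begins
        elif lab != 1 and start is not None:
            spans.append((start, i))       # the run ended at i (exclusive)
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


def bio_spans(tags):
    """Return (start, end) spans from a list of 'B' / 'I' / 'O' tags."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))   # a new B closes the previous span
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


sentence = "北京上海都是一个国际大都市"
rest = len(sentence) - 4

binary = [1, 1, 1, 1] + [0] * rest
bio = ["B", "I", "B", "I"] + ["O"] * rest

merged = [sentence[s:e] for s, e in binary_spans(binary)]
separate = [sentence[s:e] for s, e in bio_spans(bio)]
print(merged)    # binary labels cannot mark the boundary between the two names
print(separate)  # a B tag marks where the second entity starts
```

With binary labels alone, the boundary between two adjacent entities of the same type is not recoverable, which is exactly the situation the question describes. A scheme with an explicit begin tag keeps the boundary, at the cost of a larger label set.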
Running `python feature_pu_model.py --dataset conll2003 --flag PER` works for PER and ORG and produces results, but for LOC and MISC I always get Precision: 0, Recall: 0.0, F1: 0, so there is no usable result.
It is not immediately clear how to modify this repository for NER on an unlabeled data set with new classes. For example, the files `ada_dict_generation.py` and `adaptive_pu_model.py` both require model...
Thanks for the paper and code! The calculation of risk in the bnPU setup is a little confusing. In the paper, the non-negative makes the Risk = Pi * Prisk...
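For readers following the risk question above, here is a minimal sketch of the standard non-negative PU risk estimator (Kiryo et al., 2017), in which the estimated negative-class risk is clamped at zero. It is an assumption that this is the form the question refers to. The function name `nn_pu_risk`, the absolute-error loss, and the sample numbers are illustrative choices, not code or values from the repository.

```python
# Illustrative sketch of the generic non-negative PU risk; not the repo's code.

def nn_pu_risk(pos_probs, unl_probs, prior):
    """Non-negative PU risk with absolute-error loss.

    pos_probs: predicted P(y=1) for labeled positive examples
    unl_probs: predicted P(y=1) for unlabeled examples
    prior:     assumed class prior pi = P(y=1)
    """
    # Risk of positives when treated as positive (target 1).
    r_p_pos = sum(abs(1.0 - p) for p in pos_probs) / len(pos_probs)
    # Risk of positives when treated as negative (target 0).
    r_p_neg = sum(abs(p) for p in pos_probs) / len(pos_probs)
    # Risk of unlabeled data when treated as negative (target 0).
    r_u_neg = sum(abs(p) for p in unl_probs) / len(unl_probs)

    # Unbiased estimate of the negative-class risk; it can go below zero
    # on finite samples, so the non-negative variant clamps it at 0.
    neg_risk = r_u_neg - prior * r_p_neg
    return prior * r_p_pos + max(0.0, neg_risk)


pos = [0.9, 0.8]
unl = [0.1, 0.2, 0.3]
print(nn_pu_risk(pos, unl, prior=0.3))  # negative term is clamped here
print(nn_pu_risk(pos, unl, prior=0.1))  # negative term stays positive here
```

When the clamp is active, only the `prior * r_p_pos` term contributes to the risk, which may be the behavior the question is pointing at.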
When I run feature_pu_model.py I get the following error:
Traceback (most recent call last):
  File "feature_pu_model.py", line 11, in <module>
    from utils.data_utils import DataPrepare
ModuleNotFoundError: No module named 'utils.data_utils'
Could...
I merged the datasets of all entity types (i.e. all train.XXX.txt) and trained the vanilla BiLSTM+CRF directly on the merged set. The overall F1 exceeded 90.0 (seems unreasonably high,...