EntLM issues

MIT-Movie

14

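Hello, I noticed that the repository provides few-shot datasets and label words for conll2003 and ontonotes. Where can I find the few-shot data and label words for MIT-Movie?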

qianmuuq

Where does the distantly supervised data come from?

2

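I see that the get_topk code picks high-frequency words by counting them in the distantly labeled data. If I switch to a different dataset or a different sequence labeling task, I may not be able to find a weakly labeled dataset. How should I obtain the label set in that case?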

Godxia
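To make the frequency-based selection described in the get_topk question concrete, here is a minimal sketch of picking the most frequent words per label from weakly labeled sentences. It is not the repository's get_topk; the function name `top_k_label_words`, the (token, label) input format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_k_label_words(sentences, k=2):
    """Hypothetical helper: return the k most frequent tokens for each label.

    `sentences` is assumed to be a list of sentences, each a list of
    (token, label) pairs. Tokens tagged "O" are ignored.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, label in sentence:
            if label != "O":
                counts[label][token] += 1
    # most_common(k) sorts by descending frequency
    return {label: [word for word, _ in counter.most_common(k)]
            for label, counter in counts.items()}


# Toy weakly labeled data (invented for this example)
toy = [
    [("John", "I-PER"), ("visited", "O"), ("Germany", "I-LOC")],
    [("John", "I-PER"), ("met", "O"), ("Paul", "I-PER")],
    [("Germany", "I-LOC"), ("beat", "O"), ("France", "I-LOC")],
]

label_words = top_k_label_words(toy, k=1)
print(label_words)
```

The same counting could in principle be run over any labeled text, such as the few-shot training examples themselves, though with very few examples the counts would be much noisier than with a large distant corpus.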

Unable to reproduce the experimental results

Hello, when I run run_conll.sh, the resulting accuracy, recall, and F1 are all 0. What could be causing this?

scofield687

4.5 Efficiency Study

Hello, is the decoding time in 4.5 Efficiency Study reported in seconds or minutes?

bwl666

conll

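Hello, I trained on the conll2003 K=5 dataset using the original settings in the code, and sometimes the F1 score is very low. I am not sure what might be causing this.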

yaoyh116

Issue with the distant label data: the dataset/ontonotes/distant_data folder contains the correct ontonotes dataset.

I'm sorry, but I found that in your dataset/ontonotes/distant_data folder, train_dNUM.txt is different from train_dNUM.json: train_dNUM.json is the correct ontonotes training data rather than distant label data.

Yefeiyang-luis

About Label Mapping

1

Hello, based on my understanding of the paper, the labels for the dataset should be words like 'Michael' and 'John'. However, while debugging, I found that after line 389 of train_transformer.py, label_token_map changes from the following table:
'I-PER':['Michael', 'John', 'David', 'Thomas', 'Martin', 'Paul']
'I-ORG':['Corp', 'Inc', 'Commission', 'Union', 'Bank', 'Party']
'I-LOC':['England', 'Germany', 'Australia', 'France', 'Russia', 'Italy']
'I-MISC':['Palestinians', 'Russian', 'Chinese', 'Dutch', 'Russians', 'English']
into the following 4 new strings:
'I-PER':'I-PER' 'I-ORG':'I-ORG'...

huangjia2019

Questions about label_frac.json and distant_data

7

How to build label_frac.json and distant_data?

WenxiongLiao


Does it support Chinese NER?


Does it have an advantage in the full-data setting?
