Jingye Li

Results: 26 comments by Jingye Li

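Your understanding is correct. Treating an English word as a single character is also fine, as long as the word segmentation is correct.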

There was such a bug, but I believe it has been fixed. I haven't encountered it since I updated the code. Did you try the BERT version?

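len(datasets[0]) shows the number of examples in the training set. train_loader then groups the dataset into batches. Training takes one batch per loop iteration, and one pass over all batches counts as one epoch. len(train_loader) is therefore the number of batches, which is why it is smaller than len(datasets[0]). train_loader loads the training set and splits it into batches, so it is exactly what is used for training.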
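The relationship between the two lengths can be illustrated with a minimal pure-Python sketch. It does not use the repository's code: make_batches, train_set, and the sizes below are hypothetical stand-ins for train_loader and datasets[0], and it assumes the final, smaller batch is kept rather than dropped.

```python
import math


def make_batches(examples, batch_size):
    """Split examples into consecutive batches; the last one may be smaller."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


# Hypothetical stand-ins for datasets[0] and train_loader.
train_set = list(range(10))
batch_size = 4
train_batches = make_batches(train_set, batch_size)

num_examples = len(train_set)     # number of training examples
num_batches = len(train_batches)  # number of batches per epoch

# The batch count is the example count divided by the batch size, rounded up.
assert num_batches == math.ceil(num_examples / batch_size)

# One epoch is one full pass over all batches, so every example is seen once.
for epoch in range(2):
    seen = 0
    for batch in train_batches:
        seen += len(batch)
    assert seen == num_examples
```

The two lengths coincide only when the batch size is 1. For any larger batch size the batch count is smaller than the example count.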

Hi, our code works when the _ner_ value is an empty list in the test set, although the f1 will be 0. I have tested it on the resume dataset,...

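First, check whether the model has converged. You can also try removing the scheduler to speed up convergence. Second, a small batch size can make training unstable, so try truncating with a smaller len and increasing the batch size. You can also analyze the model's output on the test set: the unified model may be predicting entity types other than flat, which would lower precision.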

ADR means adverse drug reactions; you can find the detailed description in [Cadec: A corpus of adverse drug event annotations](https://www.sciencedirect.com/science/article/pii/S1532046415000532). The example is only for demonstrating the data structure, please replace...