LexiconNER

Are the `train.XXX.txt` files generated by dictionaries?

Open LorrinWWW opened this issue 5 years ago • 0 comments

I merged the datasets for all entity types (i.e. all `train.XXX.txt` files) and trained a vanilla BiLSTM+CRF directly on the merged data. The overall F1 exceeded 90.0, which seems unreasonably high considering the data was generated by dictionaries. Did I misunderstand anything? Many thanks!
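To make the merging step concrete, here is a minimal sketch of one way the per-type annotations could be combined for a single sentence. It is only an illustration: the function name `merge_tags`, the BIO-style tag strings, and the rule that the first non-`O` tag wins on a conflict are all assumptions, not taken from the repository or from the actual layout of the `train.XXX.txt` files.

```python
from typing import Dict, List


def merge_tags(per_type_tags: Dict[str, List[str]]) -> List[str]:
    """Merge per-entity-type tag sequences for one sentence into one sequence.

    Hypothetical helper: assumes every sequence covers the same tokens and
    uses "O" for tokens that are not part of an entity of that type.
    """
    lengths = {len(tags) for tags in per_type_tags.values()}
    if len(lengths) != 1:
        raise ValueError("all tag sequences must cover the same tokens")

    merged = []
    # Walk the sentence position by position across all entity types.
    for position_tags in zip(*per_type_tags.values()):
        non_o = [tag for tag in position_tags if tag != "O"]
        # Assumed conflict rule: the first type (in dict order) wins.
        merged.append(non_o[0] if non_o else "O")
    return merged


example = {
    "PER": ["B-PER", "O", "O"],
    "LOC": ["O", "O", "B-LOC"],
}
merged_example = merge_tags(example)
```

If two dictionaries label the same token, this sketch silently keeps the earlier type, so the order of the entity types changes the merged training data.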

I use 100-dimensional GloVe embeddings and 30-dimensional character embeddings (produced by an LSTM). The hidden dimension is 200 (i.e. 100 for each direction). The dropout rate is 0.5. The optimizer is SGD with a learning rate of 0.01. The batch size is 32.
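For anyone trying to reproduce this, the setup above can be written down as a small config object. This is a sketch only: the class name `BiLSTMCRFConfig` and its field names are hypothetical, and `lstm_input_dim` assumes the word and character embeddings are concatenated before the BiLSTM, which the description does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiLSTMCRFConfig:
    """Hypothetical container for the hyperparameters listed above."""

    word_emb_dim: int = 100      # GloVe embedding size
    char_emb_dim: int = 30       # character embedding size (from an LSTM)
    hidden_dim: int = 200        # total BiLSTM hidden size
    dropout: float = 0.5
    optimizer: str = "sgd"
    learning_rate: float = 0.01
    batch_size: int = 32

    @property
    def hidden_per_direction(self) -> int:
        # The BiLSTM splits the hidden size evenly across two directions.
        return self.hidden_dim // 2

    @property
    def lstm_input_dim(self) -> int:
        # Assumption: word and character embeddings are concatenated.
        return self.word_emb_dim + self.char_emb_dim


config = BiLSTMCRFConfig()
```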

LorrinWWW, Dec 24 '19 06:12