LexiconNER
Are the "train.XXX.txt" files generated by dictionaries?
I merged the datasets of all entity types (i.e. all the train.XXX.txt files) and trained a vanilla BiLSTM+CRF directly on the merged data. The overall F1 exceeded 90.0, which seems unreasonably high if the training data was generated by dictionaries. Did I misunderstand anything? Many thanks!
I use 100-dimensional GloVe embeddings and 30-dimensional character embeddings (produced by an LSTM). The hidden dimension is 200 (i.e. 100 for each direction). The dropout rate is 0.5. The optimizer is SGD with a learning rate of 0.01, and the batch size is 32.
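For reference, the setup can be written down as a small config sketch. This is only an illustration: the class and field names are hypothetical and do not come from the LexiconNER code, and it assumes the word and character embeddings are concatenated before the BiLSTM, which is the usual arrangement but is not stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiLSTMCRFConfig:
    """Hypothetical container for the hyperparameters listed in this post."""

    word_emb_dim: int = 100      # GloVe word embeddings
    char_emb_dim: int = 30       # character embeddings produced by an LSTM
    hidden_dim: int = 200        # total BiLSTM hidden size (both directions)
    dropout: float = 0.5
    optimizer: str = "sgd"
    learning_rate: float = 0.01
    batch_size: int = 32

    @property
    def lstm_input_dim(self) -> int:
        # Assumption: word and char embeddings are concatenated per token.
        return self.word_emb_dim + self.char_emb_dim

    @property
    def hidden_per_direction(self) -> int:
        # 200 total hidden units split evenly over forward and backward LSTMs.
        return self.hidden_dim // 2


config = BiLSTMCRFConfig()
print(config.lstm_input_dim, config.hidden_per_direction)
```

Under these assumptions each token enters the BiLSTM as a 130-dimensional vector, and each direction has 100 hidden units.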