Korean NER with PyTorch

Korean NER with CharCNN + BiLSTM + CRF on the Naver NLP Challenge dataset, implemented in PyTorch.

Model
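
As a rough illustration of the CharCNN + BiLSTM part of the architecture named above, here is a minimal sketch. The class name, layer sizes, and kernel size are assumptions chosen for the example, not values taken from this repository, and the CRF layer that would consume the emission scores is left out.

```python
import torch
import torch.nn as nn


class CharCNNBiLSTM(nn.Module):
    """Hypothetical encoder: char-CNN features + word embeddings -> BiLSTM -> tag scores."""

    def __init__(self, n_words, n_chars, n_tags,
                 word_dim=300, char_dim=30, n_filters=32, kernel_size=3, hidden=128):
        super().__init__()
        # All sizes here are illustrative assumptions.
        self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size,
                                  padding=kernel_size // 2)
        self.lstm = nn.LSTM(word_dim + n_filters, hidden,
                            batch_first=True, bidirectional=True)
        self.to_tags = nn.Linear(2 * hidden, n_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, word_len)
        b, t, w = char_ids.shape
        chars = self.char_emb(char_ids.reshape(b * t, w))   # (b*t, w, char_dim)
        chars = self.char_cnn(chars.transpose(1, 2))        # (b*t, n_filters, w)
        chars = chars.max(dim=2).values.reshape(b, t, -1)   # max-pool over characters
        x = torch.cat([self.word_emb(word_ids), chars], dim=-1)
        out, _ = self.lstm(x)                               # (b, t, 2 * hidden)
        return self.to_tags(out)                            # emission scores for a CRF


model = CharCNNBiLSTM(n_words=100, n_chars=50, n_tags=5)
scores = model(torch.randint(1, 100, (2, 7)), torch.randint(1, 50, (2, 7, 4)))
print(scores.shape)
```

In a full model, these per-token scores would be passed to a CRF layer, which scores whole tag sequences rather than individual tokens.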

Dataset

	Train	Test
# of examples	81,000	9,000

Naver NLP Challenge 2018 NER Dataset (GitHub link)
The original GitHub repository provides only a training set, so the test set was created by splitting the training set. (Data link)
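
A split like this could be produced along the following lines. This is only a sketch: the function name, the shuffling, the seed, and the 10% ratio (inferred from the 81,000 / 9,000 counts) are assumptions, not the procedure actually used for the released files.

```python
import random


def split_dataset(examples, test_ratio=0.1, seed=42):
    """Shuffle a copy of `examples` and hold out `test_ratio` of it as a test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for 90,000 annotated sentences.
train, test = split_dataset(range(90_000))
print(len(train), len(test))
```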

Pretrained Word Vectors

The model uses 300-dimensional Korean fastText vectors.
Loading the original vector file takes quite a long time, so I extracted only the vectors for words that appear in the word vocabulary.
The extracted vectors are downloaded automatically when you run main.py.
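
The vocabulary filtering described above might look roughly like this. It assumes the fastText text format (.vec), where a header line gives the word count and dimension and every following line is a word followed by its values. The function name and the tiny 3-dimensional sample are hypothetical.

```python
def filter_vectors(lines, vocab):
    """Keep only the rows of a fastText-style .vec text file whose word is in `vocab`."""
    it = iter(lines)
    _, dim = next(it).split()                  # header: "<num_words> <dim>"
    dim = int(dim)
    kept = {}
    for line in it:
        word, *values = line.rstrip().split(" ")
        if word in vocab and len(values) == dim:
            kept[word] = [float(v) for v in values]
    return kept


# Toy stand-in for the real 300-dimensional file.
sample = ["3 3", "서울 0.1 0.2 0.3", "사과 0.4 0.5 0.6", "바다 0.7 0.8 0.9"]
small = filter_vectors(sample, vocab={"서울", "바다"})
print(sorted(small))
```

Because the input is consumed line by line, the same function can be given an open file handle, so the full vector file never has to be held in memory.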

Usage

$ python3 main.py --do_train --do_eval

With the --write_pred option, evaluation predictions are saved to the preds directory.

Results

Model	Slot F1 (%)
CNN+BiLSTM+CRF	73.65
CNN+BiLSTM+CRF (+fastText)	74.57
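
For readers unfamiliar with the metric, the sketch below shows one common way to compute a span-level ("slot") F1. It assumes that Slot F1 here means micro-averaged F1 over exactly matching entity spans, and it assumes standard BIO tags such as B-PER / I-PER / O. The dataset's own tag format and the evaluation code in this repository may differ, and both function names are hypothetical.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in one BIO tag sequence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):   # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def slot_f1(gold, pred):
    """Micro-averaged span-level F1 over parallel lists of tag sequences."""
    n_gold = n_pred = n_hit = 0
    for g, p in zip(gold, pred):
        gs, ps = extract_spans(g), extract_spans(p)
        n_gold += len(gs)
        n_pred += len(ps)
        n_hit += len(gs & ps)                      # span counts only if boundaries and type match
    if n_hit == 0:
        return 0.0
    precision, recall = n_hit / n_pred, n_hit / n_gold
    return 2 * precision * recall / (precision + recall)


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print(round(slot_f1(gold, pred) * 100, 2))
```

In this toy case the prediction finds one of the two gold entities with no false positives, so precision is 1.0 and recall is 0.5.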