
Semi-supervised Multitask Learning for Sequence Labeling


Metadata

  • Authors: Marek Rei
  • Organization: University of Cambridge
  • Conference: ACL 2017
  • Link: https://goo.gl/h6p29c


Summary

This paper proposes an auxiliary bidirectional language modeling objective for neural sequence labeling, and evaluates it on error detection in learner texts, named entity recognition (NER), chunking, and part-of-speech (POS) tagging.

(Figure 1 in the paper illustrates the model architecture.)

Notes on the name of this paper:

  • "Semi-supervised": Utilize the data without label, which is context words.
  • "Multitask Learning": Jointly training on labeling and predicting context words.

Neural Sequence Labeling Model

  • Bidirectional LSTM over token representations (word embeddings combined with character-based embeddings via dynamic weighting).
  • The forward and backward hidden states are concatenated into an output hidden state at each time step.
  • Each output hidden state is fed to a 1-layer feedforward net with tanh activation, followed by a softmax (or CRF) output layer (see the sketch after this list).
  • Code: https://github.com/marekrei/sequence-labeler
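
To make the architecture concrete, here is a minimal PyTorch sketch of the labeling path. This is not the author's implementation (see the repo link above); the character-level component and the CRF variant are omitted, and all layer names and sizes are illustrative.

```python
# Minimal sketch of the labeling path, assuming PyTorch.
# Layer names and sizes are illustrative; char embeddings and CRF are omitted.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_labels, emb_dim=300, hidden_dim=200, proj_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM over the token embeddings.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # 1-layer feedforward projection with tanh, then a softmax output layer.
        self.proj = nn.Linear(2 * hidden_dim, proj_dim)
        self.out = nn.Linear(proj_dim, num_labels)

    def forward(self, token_ids):
        emb = self.embed(token_ids)        # (batch, seq, emb_dim)
        hidden, _ = self.lstm(emb)         # (batch, seq, 2*hidden_dim): [forward ; backward]
        d = torch.tanh(self.proj(hidden))  # (batch, seq, proj_dim)
        return self.out(d)                 # per-token label logits (softmax applied in the loss)
```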

Language Modeling Objective

Since a bidirectional LSTM has access to the full context on each side of the target token, predicting the token's neighbors from the combined state would be trivial. Instead, the next word is predicted only from the forward-moving hidden state and the previous word only from the backward-moving hidden state. Each directional hidden state is passed through its own 1-layer tanh projection before a softmax over the context-word vocabulary, and the two language modeling losses are added to the labeling loss, scaled by a weighting hyperparameter γ.
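
Below is a hedged sketch of how the two directional LM heads and the joint loss could be wired on top of the BiLSTM sketch above; the forward/backward hidden states are the two halves of the bidirectional LSTM output, and `gamma` as well as `lm_proj_dim` are illustrative values, not the paper's tuned settings.

```python
# Hedged sketch of the auxiliary language modeling heads and the joint loss,
# continuing the BiLSTMTagger sketch above. fwd_hidden / bwd_hidden are the
# two halves of the BiLSTM output (hidden[..., :hidden_dim] and hidden[..., hidden_dim:]).
# gamma and lm_proj_dim are illustrative, not the paper's tuned settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LMHeads(nn.Module):
    def __init__(self, hidden_dim, vocab_size, lm_proj_dim=50):
        super().__init__()
        # Separate tanh projections and softmax layers for each direction.
        self.fwd_proj = nn.Linear(hidden_dim, lm_proj_dim)
        self.bwd_proj = nn.Linear(hidden_dim, lm_proj_dim)
        self.fwd_out = nn.Linear(lm_proj_dim, vocab_size)
        self.bwd_out = nn.Linear(lm_proj_dim, vocab_size)

    def forward(self, fwd_hidden, bwd_hidden):
        fwd_logits = self.fwd_out(torch.tanh(self.fwd_proj(fwd_hidden)))
        bwd_logits = self.bwd_out(torch.tanh(self.bwd_proj(bwd_hidden)))
        return fwd_logits, bwd_logits

def joint_loss(label_logits, labels, fwd_logits, bwd_logits, token_ids, gamma=0.1):
    # Supervised sequence labeling loss.
    e_label = F.cross_entropy(label_logits.transpose(1, 2), labels)
    # Forward LM: the forward hidden state at position t predicts the word at t+1.
    e_fwd = F.cross_entropy(fwd_logits[:, :-1].transpose(1, 2), token_ids[:, 1:])
    # Backward LM: the backward hidden state at position t predicts the word at t-1.
    e_bwd = F.cross_entropy(bwd_logits[:, 1:].transpose(1, 2), token_ids[:, :-1])
    # Joint objective: labeling loss plus the weighted bidirectional LM losses.
    return e_label + gamma * (e_fwd + e_bwd)
```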


Experimental Results

(Tables 1–3 in the paper report results on error detection, NER, chunking, and POS tagging.)
