joelxiangnanchen

1 comment by joelxiangnanchen

@fawazsammani Hi, in the original paper, the authors fed the linearly transformed current hidden state, the previous word embedding, and the current context vector into the prediction layer, as in the equation above. The LSTM's input is the last time step's...