sru
DrQA task doesn't perform well
Hi!
I am working on DrQA from Hitvoice/DrQA.
When I run it with SRU instead of LSTM, my performance is poor compared to the results reported in taolei87/DrQA:
dev EM: 57.152317880794705 F1: 68.38075809892332
I only changed this part of Hitvoice/DrQA:
parser.add_argument('-e', '--epochs', type=int, default=60)
parser.add_argument('-lr', '--learning_rate', type=float, default=0.001)
The runtime per epoch dropped from 8 minutes to 5 minutes. But what could be preventing me from getting a good F1/EM?
Hi @anis016
I don't know whether there have been significant changes to Hitvoice/DrQA since I forked the repo and added SRU support.
There are a couple of changes I made so the model gets trained as expected:
(1) comment out the dropout (see here) in StackedBRNN._forward_unpadded(), because SRUCell already applies dropout on hidden states;
(2) make sure the training opts are the same as in https://github.com/taolei87/sru/blob/v2/DrQA/train.py#L21-L86; in particular, --concat_rnn_layers is False, since SRUCell uses skip connections instead of concatenation at the end, and rnn_dropout is 0.2 in my experiments;
(3) (optional) you can switch to the v2 branch; this version has improved performance for deeper models.
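As a rough sketch of point (2), the options could be sanity-checked before training, something like the snippet below. The helper names (str2bool, build_parser, check_sru_opts) are hypothetical, and the exact flag spelling for the RNN dropout (--rnn_dropout here) is an assumption, so please compare against the linked train.py rather than relying on this.

```python
import argparse


def str2bool(value):
    # Hypothetical helper: parse a boolean flag given as text.
    return value.lower() in ('yes', 'true', 't', '1', 'y')


def build_parser():
    # Only the options discussed in this thread; the real train.py has many more.
    parser = argparse.ArgumentParser(description='SRU option sketch')
    parser.add_argument('-e', '--epochs', type=int, default=60)
    parser.add_argument('-lr', '--learning_rate', type=float, default=0.001)
    # SRUCell uses skip connections, so layer outputs are not concatenated.
    parser.add_argument('--concat_rnn_layers', type=str2bool, default=False)
    # Assumed flag name; 0.2 is the value mentioned in this thread.
    parser.add_argument('--rnn_dropout', type=float, default=0.2)
    return parser


def check_sru_opts(args):
    # Return a list of warnings for settings that differ from the ones above.
    warnings = []
    if args.concat_rnn_layers:
        warnings.append('concat_rnn_layers should be False when using SRU')
    if abs(args.rnn_dropout - 0.2) > 1e-9:
        warnings.append('rnn_dropout differs from 0.2')
    return warnings


if __name__ == '__main__':
    for message in check_sru_opts(build_parser().parse_args()):
        print('warning:', message)
```

This only compares the two settings called out above; it does not cover the rest of the opts in the linked file.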
Let me know if you have other questions.