sru DrQA tasks doesn't perform good

Open • anis016 opened this issue 5 years ago • 1 comment

Hi!

I am working on DrQA from Hitvoice/DrQA.

When I run it with SRU instead of LSTM, my performance is poor compared to the results reported at taolei87/DrQA.

dev EM: 57.152317880794705 F1: 68.38075809892332

I only changed this part of Hitvoice/DrQA:

parser.add_argument('-e', '--epochs', type=int, default=60)
parser.add_argument('-lr', '--learning_rate', type=float, default=0.001)

The runtime per epoch dropped from 8 minutes to 5 minutes. But what could be preventing the model from reaching a good F1/EM?

Aug 16 '18 09:08 anis016

Hi @anis016

I don't know if there have been significant changes to Hitvoice/DrQA since I forked the repo and added SRU support.

There are a couple of changes I made so the model gets trained as expected:

(1) comment out the dropout (see here) in StackedBRNN._forward_unpadded(), because SRUCell already applies dropout on hidden states;

(2) make sure the training options match those in train.py (https://github.com/taolei87/sru/blob/v2/DrQA/train.py#L21-L86); in particular, --concat_rnn_layers should be False, since SRUCell uses skip connections instead of concatenation at the end, and rnn_dropout is 0.2 in my experiments;

(3) (optional) you can switch to the v2 branch; this version has improved performance for deeper models.
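
As a rough illustration of points (1) and (2), here is a minimal sketch, not code taken from either repository. The option names `--rnn_dropout` and `--rnn_type`, the `str2bool` helper, and the `outer_dropout_rate` function are assumptions made for this example; check the linked train.py for the actual option names and defaults. The epochs and learning-rate defaults are simply the values from the question above.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings so '--concat_rnn_layers False' works."""
    return str(value).lower() in ("1", "true", "yes", "y")


def build_parser():
    # Hypothetical subset of the training options discussed in this thread.
    parser = argparse.ArgumentParser(description="Sketch of DrQA training options for SRU")
    parser.add_argument('-e', '--epochs', type=int, default=60)
    parser.add_argument('-lr', '--learning_rate', type=float, default=0.001)
    # SRU uses skip connections, so layer outputs are not concatenated.
    parser.add_argument('--concat_rnn_layers', type=str2bool, default=False)
    # Dropout rate handed to the recurrent cell (0.2 per the reply above).
    parser.add_argument('--rnn_dropout', type=float, default=0.2)
    # Assumed switch between 'sru' and 'lstm'.
    parser.add_argument('--rnn_type', default='sru')
    return parser


def outer_dropout_rate(opt):
    """Dropout the stacked-RNN wrapper should apply itself between layers.

    With SRU the cell already applies dropout on hidden states, so the
    wrapper applies none; otherwise it uses the configured rate.
    """
    return 0.0 if opt.rnn_type == 'sru' else opt.rnn_dropout


if __name__ == "__main__":
    opt = build_parser().parse_args()
    print(vars(opt))
    print("wrapper dropout:", outer_dropout_rate(opt))
```

The point of `outer_dropout_rate` is only to make the "do not apply dropout twice" rule explicit; in the real model the same effect comes from commenting out the dropout call in StackedBRNN._forward_unpadded().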

Let me know if you have other questions.

Aug 16 '18 15:08 taolei87
