
It seems the "top1" loss function does not work in this TF implementation?

Open ownership-xyz opened this issue 7 years ago • 5 comments

I followed the parameter settings of the Theano implementation. However, I only get results of 0.48 and 0.17 (the original is 0.59 and 0.23).

I am using TensorFlow 1.2.
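
For reference, here is a minimal sketch of what the TOP1 loss from the GRU4Rec paper computes, assuming the usual setup where the other items of the mini-batch serve as negative samples. The function name and shapes are illustrative, not necessarily what this repo uses.

```python
import tensorflow as tf

def top1_loss(logits):
    """TOP1 loss from the GRU4Rec paper, sketched for a TF 1.x graph.

    `logits` is a [batch_size, batch_size] score matrix: row i scores all
    items of the mini-batch for session i, and the diagonal holds the score
    of the true next item. The remaining items in the batch act as sampled
    negatives.
    """
    positive = tf.reshape(tf.diag_part(logits), [-1, 1])
    # Paper formula, averaged over the negatives of each row:
    #   sigmoid(r_neg - r_pos) + sigmoid(r_neg ** 2)
    # For simplicity the row-wise mean below also includes the positive item
    # itself; that only adds a constant sigmoid(0) plus a small extra
    # regularization term on the positive score.
    loss = tf.reduce_mean(
        tf.sigmoid(logits - positive) + tf.sigmoid(logits ** 2), axis=1)
    return tf.reduce_mean(loss)
```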

ownership-xyz avatar Nov 14 '17 04:11 ownership-xyz

Hi @IcyLiGit, sorry that you can't reproduce the results. First, I didn't test the code with the "top1" loss, so I have no idea how to set "good" parameters either. Second, did you get the numbers 0.59 and 0.23 by running the Theano implementation? BTW, I use the Adam optimizer by default; you may try RMSProp (the default optimizer in the Theano implementation) or others. Tuning the initial learning rate and dropout rate would also help. Good luck!
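
To make the suggestion concrete, swapping the optimizer in a TF 1.x graph is a one-line change when building the train op. This is a minimal sketch with a dummy loss; the variable names are illustrative, not the repo's actual code.

```python
import tensorflow as tf

# Illustrative only: a dummy loss standing in for the model's training loss.
w = tf.get_variable("w", shape=[10], initializer=tf.zeros_initializer())
loss = tf.reduce_sum(tf.square(w - 1.0))

learning_rate = 0.001

# Adam (the default mentioned above):
optimizer = tf.train.AdamOptimizer(learning_rate)
# RMSProp, the default optimizer of the Theano implementation:
# optimizer = tf.train.RMSPropOptimizer(learning_rate, decay=0.9)
# Adagrad, another common choice:
# optimizer = tf.train.AdagradOptimizer(learning_rate)

train_op = optimizer.minimize(loss)
```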

Weiping

Songweiping avatar Nov 14 '17 06:11 Songweiping

@Songweiping I ran the original Theano code and it reproduced the results in the paper. Actually, in the Theano code the optimizer is Adagrad. However, I found that Adagrad and Adadelta do not work in the TensorFlow implementation.

RMSProp and Adam with the cross-entropy loss and softmax activation function may work in your implementation. However, top1 and bpr only produce a result of 0.48 (not the 0.6 reported in the paper), and the loss seems to decrease faster in TF. (Maybe caused by overfitting? But I cannot find the difference between the two implementations....)
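
For reference, the BPR loss mentioned above, in the same batch-as-negatives layout as the TOP1 sketch earlier, amounts to roughly the following. Names and shapes are illustrative, not necessarily the repo's exact code.

```python
import tensorflow as tf

def bpr_loss(logits):
    """BPR loss sketch for a [batch_size, batch_size] score matrix whose
    diagonal holds the scores of the positive (target) items."""
    positive = tf.reshape(tf.diag_part(logits), [-1, 1])
    # For each positive, average -log(sigmoid(r_pos - r_neg)) over the other
    # items in the mini-batch, which act as sampled negatives.
    return tf.reduce_mean(-tf.log(tf.sigmoid(positive - logits) + 1e-24))
```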

ownership-xyz avatar Nov 14 '17 07:11 ownership-xyz

I use the same parameter settings in both implementations (softmax + cross-entropy + 0.5 dropout + 0.001 learning rate without decay). However, the reported losses are different.
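
For comparison, the "softmax + cross-entropy" setting described above amounts to something like the sketch below, using the same illustrative batch-as-negatives layout as the loss sketches earlier in this thread (not necessarily the repo's exact code).

```python
import tensorflow as tf

def softmax_cross_entropy_loss(logits):
    """Cross-entropy over a softmax of the [batch_size, batch_size] scores,
    where the target of row i is the diagonal entry (the true next item)."""
    probs = tf.nn.softmax(logits)
    return tf.reduce_mean(-tf.log(tf.diag_part(probs) + 1e-24))
```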

theano: [screenshot of Theano training loss]

tensorflow: [screenshot of TensorFlow training loss]

ownership-xyz avatar Nov 14 '17 07:11 ownership-xyz

It seems that TF converges faster than Theano. So how about:

  1. decrease the number of training steps, or
  2. more concretely, use validation data to prevent over-fitting (early stopping); see the sketch below.
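
A minimal sketch of point 2, assuming hypothetical `train_one_epoch` and `evaluate` callables supplied by the caller (they are not part of this repo's API), where `evaluate` returns a validation metric such as Recall@20:

```python
def train_with_early_stopping(train_one_epoch, evaluate, n_epochs=20, patience=3):
    """Stop training once the validation metric stops improving."""
    best_recall, bad_epochs = 0.0, 0
    for _ in range(n_epochs):
        train_one_epoch()
        recall = evaluate()  # e.g. Recall@20 on held-out validation sessions
        if recall > best_recall:
            best_recall, bad_epochs = recall, 0  # new best: reset the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # no improvement for `patience` epochs
                break
    return best_recall
```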

Weiping

Songweiping avatar Nov 14 '17 11:11 Songweiping

I found a similar issue too. I added dynamic_rnn to Weiping's code, and the recall then dropped to 0.43 for softmax + cross-entropy; the recall is also 0.43 for top1.

And it's not overfitting; I have checked the recall on the training data. @Songweiping @IcyLiGit
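
For context, GRU4Rec's session-parallel mini-batches step the GRU one item at a time and carry the hidden state across batches (resetting the rows whose session has ended), whereas tf.nn.dynamic_rnn unrolls over whole padded sequences, which changes the batching semantics and could account for the drop. A rough sketch of the single-step pattern, with illustrative shapes and names, assuming TF 1.x:

```python
import tensorflow as tf

batch_size, rnn_size, n_items = 50, 100, 1000

# One item per session per step; the caller feeds the previous hidden state
# back in and resets the rows whose session has just ended.
inputs = tf.placeholder(tf.int32, [batch_size], name="item_ids")
state_in = tf.placeholder(tf.float32, [batch_size, rnn_size], name="state_in")

embedding = tf.get_variable("embedding", [n_items, rnn_size])
cell = tf.contrib.rnn.GRUCell(rnn_size)

x = tf.nn.embedding_lookup(embedding, inputs)
output, state_out = cell(x, state_in)  # a single GRU step; no unrolling
```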

gds123 avatar Jun 12 '18 11:06 gds123