attention-networks-for-classification

Having 2 optimizers

Open JoaoLages opened this issue 6 years ago • 3 comments

Hi there! Thank you for making this implementation open-source! I have one question though: although you have only one backward step, you use 2 optimizers. Shouldn't you combine both models' parameters and use only one optimizer?

JoaoLages avatar Aug 22 '18 09:08 JoaoLages

In hindsight, I would have used a single optimiser using something like this.

optim.Adam(list(model1.parameters()) + list(model2.parameters()))

At that time, I was new to PyTorch and didn't know this. You can go ahead and use 1 optimiser for much cleaner code.
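
Roughly, that single-optimizer setup would look like the sketch below (the Linear modules are just placeholders standing in for the two attention networks; the loss is a dummy):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder modules standing in for the two halves of the model
# (e.g. the word-level and sentence-level attention networks)
model1 = nn.Linear(100, 50)
model2 = nn.Linear(50, 10)

# One optimizer over both parameter sets, as suggested above
optimizer = optim.Adam(list(model1.parameters()) + list(model2.parameters()), lr=1e-3)

# A single training step then looks like:
optimizer.zero_grad()
out = model2(model1(torch.randn(32, 100)))
loss = out.pow(2).mean()   # dummy loss, just to illustrate the step
loss.backward()
optimizer.step()
```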

Sandeep42 avatar Aug 22 '18 10:08 Sandeep42

Thanks for your reply. That is what I am doing. Nevertheless, it seems that with 2 optimizers the loss decreases much faster than with one optimizer. What might be the reason for this?

Moreover, I have changed the optimizer to Adam but haven't been able to get the BCE loss below ~0.255 for a multi-label classification problem. Any suggestions?

JoaoLages avatar Aug 22 '18 11:08 JoaoLages

Never mind, I had a typo; 2 optimizers vs 1 optimizer produce more or less the same result, it seems. Still having the loss problem, though.

JoaoLages avatar Aug 22 '18 11:08 JoaoLages
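
As a side note on the multi-label BCE loss mentioned above, a minimal PyTorch setup typically looks like the sketch below (the logit/target shapes are arbitrary placeholders, not taken from this repo):

```python
import torch
import torch.nn as nn

# Dummy logits and multi-hot targets; shapes are arbitrary placeholders
logits = torch.randn(8, 5, requires_grad=True)   # batch of 8, 5 labels
targets = torch.randint(0, 2, (8, 5)).float()    # multi-hot ground truth

# BCEWithLogitsLoss fuses the sigmoid with binary cross-entropy and is
# numerically more stable than applying sigmoid followed by BCELoss
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)
loss.backward()
```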