
Beam search in dgl transformer

Open YongtaoGe opened this issue 5 years ago • 4 comments

Really nice work! Have you found the reason why beam search fails in the DGL transformer?

YongtaoGe, Jul 25 '19 12:07

Thanks ;) No, I haven't looked into that yet; I've been working on other topics lately. But I'm not sure whether it's an issue with beam search or just the model being limited in what it has learnt. (Let's say it's not so fast to train on my own GPU.)
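One way to separate the two hypotheses is to compare beam search against plain greedy decoding on the same model: with a beam size of 1 the two should return the same sequence, so any mismatch points at the search code rather than the model. Below is a minimal, framework-free sketch of that check. It is not the decoder from this repository: `toy_step_fn`, `greedy_decode`, `beam_search` and the token table are all hypothetical stand-ins, and the score is a raw sum of log-probabilities with no length normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical toy "model": next-token probabilities depend only on the last token.
_TOY_TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"z": 0.9, "</s>": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}

StepFn = Callable[[List[str]], Dict[str, float]]


def toy_step_fn(tokens: List[str]) -> Dict[str, float]:
    """Return log-probabilities of each possible next token."""
    return {tok: math.log(p) for tok, p in _TOY_TABLE[tokens[-1]].items()}


def greedy_decode(step_fn: StepFn, bos: str, eos: str, max_len: int) -> List[str]:
    """Pick the single most likely token at every step."""
    tokens = [bos]
    for _ in range(max_len):
        dist = step_fn(tokens)
        token = max(dist, key=dist.get)
        tokens.append(token)
        if token == eos:
            break
    return tokens


def beam_search(step_fn: StepFn, bos: str, eos: str,
                beam_size: int, max_len: int) -> List[str]:
    """Keep the beam_size best partial sequences by summed log-probability."""
    beams: List[Tuple[List[str], float]] = [([bos], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [
            (tokens + [token], score + logp)
            for tokens, score in beams
            for token, logp in step_fn(tokens).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            # Hypotheses that emitted eos leave the beam and are kept aside.
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])[0]


greedy = greedy_decode(toy_step_fn, "<s>", "</s>", max_len=5)
beam1 = beam_search(toy_step_fn, "<s>", "</s>", beam_size=1, max_len=5)
beam2 = beam_search(toy_step_fn, "<s>", "</s>", beam_size=2, max_len=5)
print(greedy, beam1, beam2)
```

In this toy table the greedy path has probability 0.33 while the path through `b` has 0.36, so a beam of 2 is expected to pick a different sequence than greedy. Replacing `toy_step_fn` with a wrapper around the real model's next-token log-probabilities would give the same sanity check on the trained network.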

mandubian, Jul 25 '19 12:07

I used this code to train a basic transformer model on the whole dataset. Training seems correct, as the loss keeps decreasing. But I found that the model can't get the right answer to basic arithmetic add_sub_multiple questions like "Evaluate 1 + 1 + (4 - 7) - -2." In the original paper, this module gets very high accuracy. Have you also run into this problem?
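For reference, the expected answer to that question can be computed independently of the model, which makes it easy to check predictions on this kind of question in bulk. The sketch below is only an illustration: `question_to_expression` and `evaluate` are hypothetical helpers, and it assumes questions follow the "Evaluate <expression>." pattern shown above.

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else raises ValueError.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def question_to_expression(question: str) -> str:
    """Strip the assumed 'Evaluate ' prefix and the trailing period."""
    text = question.strip()
    if text.startswith("Evaluate "):
        text = text[len("Evaluate "):]
    return text.rstrip(".")


def evaluate(expression: str):
    """Evaluate an arithmetic expression by walking its syntax tree."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression element")

    return _eval(ast.parse(expression, mode="eval").body)


question = "Evaluate 1 + 1 + (4 - 7) - -2."
expected = evaluate(question_to_expression(question))
print(expected)  # 1 + 1 + (-3) + 2 = 1
```

Comparing `str(expected)` with the decoded model output for each question gives a per-question correctness check that does not depend on the training loss.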

(Attached image: loss log)

YongtaoGe, Jul 25 '19 13:07

Do you mean that with a normal transformer you also have issues with beam search, and not only with the DGL transformer? If so, that would help in finding the cause, because I have no idea yet...

mandubian, Jul 25 '19 14:07

Yep! Some modules work fine while others don't. I could not locate the bug.

YongtaoGe, Jul 25 '19 15:07