pytorch_math_dataset
Beam search in dgl transformer
Really nice work! Have you found the reason why beam search fails in the DGL transformer?
Thanks ;) No, I haven't looked into that yet; I've been working on other topics lately. I'm also not sure whether it's an issue with beam search or just the model being limited in what it has learnt. (Let's just say training is not very fast on my own GPU.)
I used this code to train a basic transformer model on the whole dataset. The training process seems correct, as the loss keeps decreasing. But I found that the model can't get the right answer to basic arithmetic add_sub_multiple questions like "Evaluate 1 + 1 + (4 - 7) - -2." In the original paper, this module gets very high accuracy. Have you run into this problem as well?
[Image: loss log]
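For reference, the example question above works out to 1 + 1 + (4 - 7) - (-2) = 2 - 3 + 2 = 1. A minimal sketch of how a decoded answer could be checked against that value is below. The helpers `expected_answer` and `exact_match` are hypothetical names, not part of this repository, and the sketch assumes questions of the form "Evaluate <expression>." that use only integers, `+`, `-`, `*`, and parentheses.

```python
import ast
import operator

# Only the operators needed for add/sub/mul questions are allowed.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _eval(node):
    """Recursively evaluate a restricted arithmetic syntax tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")


def expected_answer(question):
    """Hypothetical helper: compute the reference answer as a string.

    Assumes the question looks like 'Evaluate <expression>.'.
    """
    expr = question.strip()
    if expr.startswith("Evaluate "):
        expr = expr[len("Evaluate "):]
    expr = expr.rstrip(".")
    return str(_eval(ast.parse(expr, mode="eval").body))


def exact_match(prediction, question):
    """Hypothetical helper: compare a decoded string to the reference answer."""
    return prediction.strip() == expected_answer(question)


question = "Evaluate 1 + 1 + (4 - 7) - -2."
print(expected_answer(question))  # prints 1
```

Running the same check on both greedy and beam-search outputs for a handful of questions might help tell whether the wrong answers come from decoding or from the model itself.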
Do you mean that you have issues with beam search when using a normal transformer too, and not only with the DGL transformer? If so, that would help narrow down the cause, because I have no idea yet...
Yep! Some modules work fine while others do not. I have not been able to locate the bug.