
IEKT does not converge when run on the Bridge2006 dataset

Open MyGithub1234567890 opened this issue 1 year ago • 3 comments

MyGithub1234567890 avatar Dec 05 '23 07:12 MyGithub1234567890

f3a2747b61de6ce8701217ff30fa6ee

MyGithub1234567890 avatar Dec 05 '23 07:12 MyGithub1234567890

f3a2747b61de6ce8701217ff30fa6ee

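My understanding is that IEKT's training procedure incorporates a policy-gradient reinforcement learning algorithm (Section 4.3, Model Learning, in the paper), so the loss oscillates. However, you can see that the valid AUC keeps rising, which means the model is still learning, until it reaches the early stop we configured.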
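To illustrate why an oscillating loss need not block training, here is a minimal sketch of early stopping driven by validation AUC alone. It is not pykt-toolkit's actual training loop: the function name `early_stop_epoch`, the `patience` parameter, and the AUC values are all hypothetical and chosen only for illustration.

```python
def early_stop_epoch(valid_aucs, patience=10):
    """Scan per-epoch validation AUCs and return (stop_epoch, best_epoch, best_auc).

    Training stops once `patience` epochs have passed without a new best AUC.
    The training loss is never consulted, so a noisy loss does not affect the decision.
    """
    best_auc = float("-inf")
    best_epoch = -1
    for epoch, auc in enumerate(valid_aucs):
        if auc > best_auc:
            # New best validation AUC: remember it and reset the patience window.
            best_auc, best_epoch = auc, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return epoch, best_epoch, best_auc
    # Ran out of epochs before the patience window was exhausted.
    return len(valid_aucs) - 1, best_epoch, best_auc


# Hypothetical run: the AUC trends upward even though it dips between epochs.
example_aucs = [0.70, 0.72, 0.71, 0.74, 0.73, 0.73, 0.72]
stop_epoch, best_epoch, best_auc = early_stop_epoch(example_aucs, patience=3)
```

With this kind of criterion, the checkpoint from `best_epoch` is the one to keep, regardless of how the loss curve looked along the way.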

sonyawong avatar Dec 07 '23 16:12 sonyawong