minGPT
TPU/GPU training: KeyError 'pos_emb'
Hi,
I am currently testing the char notebook. Everything works fine when training on CPU, but when I run the same code on a GPU/TPU, the following error occurs:
Exception has occurred: KeyError 'pos_emb'
If I simply remove the problematic code line:
no_decay.add('pos_emb')
With that change, GPU/TPU training runs, but the loss gets stuck: it shows practically no improvement (or even gets worse), unlike CPU training with the same code base, where the loss visibly oscillates as training proceeds.
Can anyone explain how to solve this KeyError without corrupting the no_decay set? Thanks a lot! :)
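For reference, here is a minimal sketch of one thing I suspect, not something I have confirmed: if the model is wrapped on GPU/TPU (for example by `torch.nn.DataParallel`, which exposes the original model as `.module`), the parameter names could carry a prefix such as `module.`, so a hard-coded `'pos_emb'` would no longer be a valid key. The helper `resolve_names` and the name lists below are hypothetical and only illustrate looking the name up instead of hard-coding it.

```python
def resolve_names(wanted, available):
    """Map each short parameter name to the actual key found in `available`.

    A key matches if it equals the short name or ends with '.' + short name,
    so both 'pos_emb' and 'module.pos_emb' resolve for the short name 'pos_emb'.
    """
    resolved = set()
    for short in wanted:
        matches = [k for k in available if k == short or k.endswith("." + short)]
        if len(matches) != 1:
            # Fail loudly rather than silently dropping the parameter.
            raise KeyError(f"{short!r} matched {matches} in parameter names")
        resolved.add(matches[0])
    return resolved


# Hypothetical parameter names, as a wrapped and an unwrapped model might report them.
wrapped_names = ["module.pos_emb", "module.tok_emb.weight", "module.head.weight"]
plain_names = ["pos_emb", "tok_emb.weight", "head.weight"]

print(resolve_names({"pos_emb"}, wrapped_names))  # {'module.pos_emb'}
print(resolve_names({"pos_emb"}, plain_names))    # {'pos_emb'}
```

In real code the `available` list would come from something like `[name for name, _ in model.named_parameters()]`, and the resolved names would be added to `no_decay` in place of the literal `'pos_emb'`. Printing that list on CPU and on GPU/TPU should also show whether the names actually differ between the two runs.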