perceiver-pytorch

Network can't train when incorporating this

Open abeyang00 opened this issue 3 years ago • 5 comments

I have added the Perceiver to my current network, and it seems like the network can't be trained. AP stays at zero the whole time and never improves.

Does the code need to be changed in order to incorporate it into another network?

abeyang00 avatar Mar 25 '21 09:03 abeyang00

I saw the same problem. In fact, it doesn't work well in FP16; I'm getting NaNs really quickly (generally by epoch 2). Maybe try FP32? Sometimes it doesn't converge either. Here is my code: https://github.com/clementpoiret/Perceiver_MNIST
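If you want to stay in mixed precision, loss scaling plus gradient clipping sometimes keeps the NaNs away. A minimal sketch using PyTorch's AMP utilities; `model`, `optimizer`, `criterion`, and `loader` below are placeholders, not anything from this repo:

```python
import torch

# Placeholders -- substitute your Perceiver, optimizer, and real data loader.
model = torch.nn.Linear(784, 10).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()
loader = [(torch.randn(8, 784), torch.randint(0, 10, (8,)))]

scaler = torch.cuda.amp.GradScaler()

for images, labels in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():          # run the forward pass in FP16
        loss = criterion(model(images.cuda()), labels.cuda())
    scaler.scale(loss).backward()            # scale the loss against FP16 underflow
    scaler.unscale_(optimizer)               # unscale before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    scaler.step(optimizer)                   # skips the step if grads are inf/NaN
    scaler.update()
```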

clementpoiret avatar Apr 15 '21 08:04 clementpoiret

@clementpoiret I took a quick look at your repo: Are you trying to classify MNIST?

Having not used it myself yet, I think the user needs to specify the objective by adding a head to the Perceiver (e.g., a classifier head).

amqdn avatar Apr 15 '21 20:04 amqdn

@clementpoiret

Never mind. I see in the code now that to_logits includes a Linear layer to num_classes, and that you've also included that in your code. Huh.
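For reference, a minimal sketch of instantiating the model for MNIST classification; the keyword arguments follow the repo's README, so double-check them against your installed version:

```python
import torch
from perceiver_pytorch import Perceiver

# Perceiver configured for 10-class MNIST; to_logits inside the model
# maps the pooled latents to num_classes.
model = Perceiver(
    input_channels=1,    # MNIST images are single-channel
    input_axis=2,        # 2D data (height, width)
    num_freq_bands=6,    # Fourier feature bands for the positional encoding
    max_freq=10.,
    depth=4,
    num_latents=128,
    latent_dim=256,
    num_classes=10,      # size of the final Linear layer in to_logits
)

imgs = torch.randn(8, 28, 28, 1)   # channels-last, as the model expects
logits = model(imgs)               # shape (8, 10)
```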

amqdn avatar Apr 16 '21 01:04 amqdn

Yes, you're right; I tried this quickly. But it's pretty slow to converge, and sometimes it doesn't even learn at all.

clementpoiret avatar Apr 16 '21 09:04 clementpoiret

Maybe you should try a warmup learning rate scheduler? Transformers are particularly sensitive to the learning rate schedule.
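For example, a linear warmup followed by a constant rate is easy to set up with `LambdaLR`; a rough sketch, with the warmup length just a guess and the model a placeholder:

```python
import torch

# Placeholder model/optimizer -- substitute your own.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

warmup_steps = 1000  # assumed warmup length; tune for your dataset

# Ramp the LR linearly from ~0 to the base value over warmup_steps,
# then hold it constant.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

# In the training loop, call scheduler.step() after each optimizer.step().
```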

OctoberKat avatar Jul 06 '21 07:07 OctoberKat