keras-attention-augmented-convs
Does not work when training
Hi,
When I build a model that uses the attention-augmented conv as the first layer, followed by several convolution and max-pooling layers, the model compiles fine. However, training fails with an error for both the Adam and SGD optimizers. It looks like the code has issues: I could not get training to work in any setup. The main problem is that for an input of size (64, 128, 1), the attention-augmented code creates a 6-dimensional tensor with more than 1B parameters! I believe the code needs a small change.
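To give a rough sense of the size involved, here is a back-of-the-envelope sketch. It assumes the 6-dimensional tensor has shape (batch, heads, H, H, W, W), which is my reading of the relative-logits step and not something confirmed from the layer's source. The function name, `num_heads=4`, and `batch_size=4` are hypothetical values chosen only for illustration.

```python
def attention_logit_elements(height, width, num_heads, batch_size):
    """Count elements in a tensor of shape (batch, heads, H, H, W, W).

    Assumption: self-attention over every spatial position produces
    (H * W) x (H * W) logits per head and per sample.
    """
    positions = height * width
    return batch_size * num_heads * positions * positions


# One sample, one head, for a 64 x 128 input.
per_head = attention_logit_elements(64, 128, num_heads=1, batch_size=1)
print(per_head)  # 67108864

# Hypothetical setting: 4 heads and a batch of 4.
total = attention_logit_elements(64, 128, num_heads=4, batch_size=4)
print(total)  # 1073741824

# Memory for that tensor alone at float32 (4 bytes per element), in GiB.
print(total * 4 / 2**30)  # 4.0
```

Under these assumptions the count grows with the square of H * W, so the tensor passes 1B elements as soon as batch size times number of heads reaches 16.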
If you could post the stack trace that would be more helpful.
Could you post a code snippet where this issue occurs?