75 comments of Sean Moriarity

@hansihe Thank you for this very detailed write-up! It's really helpful for improving the framework. Also, would you be interested in adding your YOLO implementation upstream to Bumblebee? We...

I am still trying to think of a good way to handle these cases. I think for now you should wrap all inputs in a container and then you can...
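A hedged sketch of the container approach described above, assuming `Axon.container/1` and `Axon.nx/2` (both are real Axon functions, but the input names, shapes, and layer sizes here are illustrative and not from the original thread):

```elixir
# Two separate inputs, grouped into a single map container so they
# travel through the graph as one value.
a = Axon.input("a", shape: {nil, 8})
b = Axon.input("b", shape: {nil, 8})

model =
  Axon.container(%{a: a, b: b})
  # Axon.nx/2 applies an arbitrary Nx function to the container.
  |> Axon.nx(fn %{a: a, b: b} -> Nx.add(a, b) end)
  |> Axon.dense(4)
```

At inference time you would then pass a map keyed by input name, e.g. `%{"a" => ..., "b" => ...}`.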

Thanks for bringing this up. The `%Axon{}` data structure used to be the only data structure that represented a model/layer/etc., but now we have `%Axon.Node{}`, and as you pointed out...

There is a new API, `Axon.mask`, which does this; you can pass the result to `Axon.lstm` and other RNNs. Something like this should work:

```elixir
input = Axon.input("seq")
# pad token...
```
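As a fuller illustration (not the original snippet; the pad-token value, embedding dimensions, and LSTM size are all assumptions), the mask is built from the raw token input and handed to the RNN:

```elixir
input = Axon.input("seq", shape: {nil, 16})

# Axon.mask/2 builds a mask from the input; here we assume 0 is the
# pad token in this vocabulary.
mask = Axon.mask(input, 0)

# RNN layers return {output_sequence, state}; the mask tells the LSTM
# which timesteps are padding and should be ignored.
{out, _state} =
  input
  |> Axon.embedding(1024, 32)
  |> Axon.lstm(64, mask: mask)
```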

Btw, looking at this, it's not advisable to use dropout after an LSTM layer. See https://arxiv.org/pdf/1512.05287.pdf. This is still a bug, though.

This issue should be fixed by the new `Axon.ModelState` changes: dropout keys and other model state are no longer considered part of the training parameters, so they shouldn't accidentally...

This has been mostly resolved. There will still be recompilations in certain cases, but I added a note to the docs about those!

See https://github.com/elixir-nx/bumblebee. Maybe later it will make sense to move those attention implementations, but for now it's okay :)

This is possible with blocks now.
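A minimal sketch of what "blocks" refers to, assuming `Axon.block/1` (a real Axon function; the layer sizes and input name here are illustrative):

```elixir
# Axon.block/1 wraps a model-building function into a reusable block
# whose parameters are shared across every call site.
dense_block = Axon.block(&Axon.dense(&1, 32, activation: :relu))

model =
  Axon.input("x", shape: {nil, 32})
  |> dense_block.()
  # Calling the same block again reuses the same weights, rather than
  # creating a second dense layer with fresh parameters.
  |> dense_block.()
```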