Sean Moriarity
Currently missing gradients for these functions: - [x] Product - [ ] Scatter Window Max - [ ] Scatter Window Min - [x] Scatter Add - [x] LU Decomposition -...
WIP: The goal is to replace `*_unroll` entirely with this more general `scan` and to also use it to add both `Axon.scan` and `Axon.bidirectional` combinators
It is just confusing, and I don't like the implicit behavior. It is easy enough for Axon to correctly do the reshape logic even with `nil` dimensions
With the new container API, we can support structs. Right now the Axon.Updates API uses a tuple to represent optimizer state. It's not bad, but I think it's better...
We can implement table and mermaid displays in a separate Axon.Display module: - [x] `Axon.Display.as_table` - [ ] `Axon.Display.as_mermaid` And potentially many others :)
Resolves #169 Still a few API considerations/questions: 1. Should we offer an explicit `:parameters` option on layer creation? 2. How should we handle partial parameter sharing, e.g. if I just...
Currently custom layers will fail when they are imported if the implementation is not an MFA function (`&Module.function/arity`). We should consider raising in these cases, or support some other way...
We should consider unwrapping containers by default in Axon custom layers. So for example if you have a custom layer: ```elixir defn custom_layer(foo, bar, _opts) do {foo, bar} end ```...
There are cases where the input to a model is an integer type, e.g. an attention mask or token IDs. Axon currently does not respect input types, and aggressively casts...
There is currently no explicit way to use the same weights/layer for separate inputs. For example, in Keras I could do: ```python dense = tf.keras.layers.Dense(32) d_x1 = dense(x1) d_x2 =...