equinox
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
Hi, I think the use of running_mean and running_var at training time in BatchNorm causes training instability and increased learning_rate sensitivity. With a low momentum (say 0.5) the layer works fine...
Hey there! @patrick-kidger Thank you for providing this amazing library! Highly appreciated. When comparing the speed of `eqx.nn.Conv2d` and `torch.nn.Conv2d` I was surprised to find that the jitted version...
Hello Patrick, Thanks a lot for all the work you are putting in here. I have been working on a problem that I can solve with a small-scale neural network...
```python
key = jax.random.PRNGKey(0)
# (B, C, H*W)
inputs = jnp.ones((3, 16, 64))
linear_layer = jax.vmap(eqx.nn.Linear(16, 16, use_bias=False, key=key))
outputs = linear_layer(inputs)
print(outputs.shape)
```
The above-mentioned code...
Added RoPE embeddings from [the RoFormer paper](https://arxiv.org/pdf/2104.09864.pdf). I need to add this to my transformer to perform some tests first before I can mark this as ready. Also if it's...
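For reference, the position-dependent rotation described in the RoFormer paper can be sketched in plain JAX as follows. This is a standalone sketch, not the code in this PR; the function name and the half-split pairing convention are my own assumptions:

```python
import jax.numpy as jnp

def rotary_embedding(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Sketch only: pairs feature i with feature i + dim//2, and assumes
    dim is even. `base` follows the RoFormer paper's default.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # One frequency per rotated pair: theta_i = base^(-2i/dim).
    freqs = base ** (-jnp.arange(0, half) * 2.0 / dim)
    angles = jnp.arange(seq_len)[:, None] * freqs[None, :]  # (seq_len, half)
    cos, sin = jnp.cos(angles), jnp.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return jnp.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

At position 0 every angle is zero, so the embedding leaves the first token untouched; attention scores between rotated queries and keys then depend only on relative position.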
I am using the following code snippet (specifically the function **load_torch_weights**), but it uses some equinox methods (**eqx.experimental.set_state** and **eqx.experimental.StateIndex**) which no longer seem to be supported in the latest version of equinox....
- Support for autoregressive attention;
- Includes support for zero-length queries, e.g. when populating the caches for the prompt.
- Causal masking available by passing mask="causal";
- Support for multi-query...
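The causal masking listed above can be illustrated with plain scaled dot-product attention. This is a minimal sketch of the masking logic, not Equinox's actual implementation:

```python
import jax
import jax.numpy as jnp

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal mask.

    q, k, v each have shape (seq_len, d).
    """
    seq_len, d = q.shape
    scores = q @ k.T / jnp.sqrt(d)
    # Causal mask: position i may only attend to positions j <= i.
    mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))
    scores = jnp.where(mask, scores, -jnp.inf)
    weights = jax.nn.softmax(scores, axis=-1)
    return weights @ v
```

Because position 0 attends only to itself, its output equals `v[0]` regardless of the later tokens.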
Hello, I'm a new user of the equinox library, so perhaps my problem has an obvious solution, but I cannot figure out how to solve it properly. Basically I'm trying...
Hello Patrick, thank you again for the nice package. I wanted to ask whether there exists a way to deserialise an Equinox-trained model (in eqx format [json+bytes]) to be used...
Dear All- I have a very simple question. I have two neural networks of type `MLP` and I want to initialize an optimizer via `optax`. When I have one neural network...