alerem18
> That's not the intended use for `Flux.train!`. This function is meant to iterate over an entire epoch, not a single batch. Try writing your loop as
>
> ```julia...
will there be any updates?
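For reference, the quoted suggestion amounts to an explicit epoch loop in which `Flux.train!` is called once per epoch and itself iterates over every batch. A minimal sketch using the explicit-style API of recent Flux versions, assuming a `model`, a `loss(m, x, y)` function, and a `train_loader` of `(x, y)` batches; none of these names come from this issue, and the quoted code above is truncated, so this is not necessarily the exact loop that was proposed:

```julia
using Flux

# Optimiser state for the explicit-style API.
opt_state = Flux.setup(Adam(1e-3), model)

for epoch in 1:10
    # One `train!` call per epoch: it loops over all batches in `train_loader`
    # and calls `loss(model, x, y)` for each `(x, y)` it yields.
    Flux.train!(loss, model, train_loader, opt_state)
end
```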
whatever it is, it's related to the backward pass; the forward pass in Flux is already faster than PyTorch, or at least the same speed
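One way to sanity-check that split is to time the forward pass and the full gradient separately. A minimal sketch using BenchmarkTools with a made-up model and loss, not the model from this issue:

```julia
using Flux, BenchmarkTools

model = Chain(Dense(128 => 256, relu), Dense(256 => 10))
x = rand(Float32, 128, 64)                       # (features, batch)
y = Flux.onehotbatch(rand(1:10, 64), 1:10)

loss(m, x, y) = Flux.logitcrossentropy(m(x), y)

@btime $model($x);                               # forward pass only
@btime Flux.gradient($loss, $model, $x, $y);     # forward + backward pass
```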
> The layer's documentation for the forward pass says:
>
> ```
> (mha::MultiHeadAttention)(q_in, k_in, v_in, [bias]; [mask])
> ...
> mask: Input array broadcastable to size (kv_len, q_len, nheads, batch_size)
> ```
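To make the documented shapes concrete, a call with an explicit (here all-`true`, so effectively no-op) mask might look like the sketch below; the sizes are made up for illustration, not taken from this issue:

```julia
using Flux

dim, q_len, kv_len, nheads, batch_size = 64, 8, 10, 4, 2
mha = MultiHeadAttention(dim; nheads)

q = rand(Float32, dim, q_len, batch_size)    # (q_in_dim, q_len, batch_size)
k = rand(Float32, dim, kv_len, batch_size)   # (k_in_dim, kv_len, batch_size)
v = rand(Float32, dim, kv_len, batch_size)   # (v_in_dim, kv_len, batch_size)

# Bool mask broadcastable to (kv_len, q_len, nheads, batch_size);
# entries that are `false` are set to -Inf before the softmax.
mask = trues(kv_len, q_len, 1, batch_size)

y, α = mha(q, k, v; mask)
size(y)   # (dim, q_len, batch_size)
size(α)   # (kv_len, q_len, nheads, batch_size)
```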
> @alerem18 which of the two reshaping is correct in your case?

reshape(mask, (seq_len, 1, 1, batch_size))
> > @alerem18 which of the two reshaping is correct in your case?
>
> reshape(mask, (seq_len, 1, 1, batch_size))

However, masking is wrong; it should be in the shape...
masking with shape (seq_len, 1, 1, batch_size) is OK, but with shape (1, seq_len, 1, batch_size) it returns NaN
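That is consistent with how the mask axes are interpreted: the first axis indexes keys (kv_len) and the second indexes queries (q_len). Reshaping a per-token padding mask of size (seq_len, batch_size) to (seq_len, 1, 1, batch_size) hides padded keys from every query, whereas (1, seq_len, 1, batch_size) masks out entire query rows, and a row whose scores are all -Inf comes back from the softmax as NaN. A small sketch (names and sizes are illustrative, not from this issue):

```julia
using Flux

dim, seq_len, nheads, batch_size = 32, 6, 4, 2
mha = MultiHeadAttention(dim; nheads)
x = rand(Float32, dim, seq_len, batch_size)

pad = trues(seq_len, batch_size)    # true = real token, false = padding
pad[end, :] .= false                # pretend the last position is padding

# Padding goes in the first (kv_len) dimension: padded *keys* are hidden.
key_mask = reshape(pad, seq_len, 1, 1, batch_size)
y, _ = mha(x, x, x; mask = key_mask)      # finite values

# Putting seq_len in the second (q_len) dimension masks whole query rows,
# so those rows softmax over all -Inf and produce NaN.
bad_mask = reshape(pad, 1, seq_len, 1, batch_size)
y2, _ = mha(x, x, x; mask = bad_mask)
any(isnan, y2)                            # true
```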
> I'm surprised this works at all with the input format given. What does the PyTorch code look like and have you verified it's doing the same thing?

what should...
PyTorch is quite different; it has a shape of (batch_size, seq_len, features). Also, I get much worse results by just reshaping the data differently: `@cast x_train[i][j, k] := DATA_TRAIN[1][i, j,...
> > PyTorch is quite different; it has a shape of (batch_size, seq_len, features)
>
> Flux supports something very similar. This is why it's important to see the PyTorch...
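Regarding the layout difference discussed above: converting between the two conventions is just a permutation of axes. A sketch with made-up sizes, going from PyTorch's (batch_size, seq_len, features) to the feature-first layout used by Flux's `MultiHeadAttention`; the `@cast` macro in the earlier comment is presumably TensorCast.jl's, but plain `permutedims` does the same reordering:

```julia
# PyTorch-style array: (batch_size, seq_len, features)
x_torch_layout = rand(Float32, 32, 10, 64)

# Feature-first layout expected by Flux's MultiHeadAttention:
# (features, seq_len, batch_size), i.e. the axes in reverse order.
x_flux_layout = permutedims(x_torch_layout, (3, 2, 1))
size(x_flux_layout)   # (64, 10, 32)
```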