pfeatherstone
I have to say, the results were really poor. The better option was to convert my 1D data into 2D using torch.stft() and then use a normal 2D model...
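For reference, a minimal sketch of that kind of conversion, assuming a magnitude spectrogram and made-up STFT parameters:
```
import torch

# hypothetical example: turn a batch of 1D signals into 2D "images"
# so a standard 2D model can be applied
x = torch.randn(4, 16000)                  # [batch, samples]
n_fft, hop = 512, 128                      # assumed STFT parameters
window = torch.hann_window(n_fft)

spec = torch.stft(x, n_fft=n_fft, hop_length=hop,
                  window=window, return_complex=True)   # [batch, freq, time], complex
spec = spec.abs().unsqueeze(1)             # magnitude + channel dim -> [batch, 1, freq, time]
# spec can now be fed to an ordinary 2D CNN with a single input channel
```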
Tried this again optimistically thinking I would get better results. Nope...
I missed the `== 0`. You could just use:
```
bias = torch.triu(torch.ones(5, 5, dtype=torch.bool), diagonal=1).view(1, 1, 5, 5)
attn = attn.masked_fill(bias[:, :, :T, :T], float('-inf'))
```
With `triu(..., diagonal=1)` the mask is already True exactly where you want `-inf`, so no `== 0` is needed; making it boolean and 4-D keeps the `[:, :, :T, :T]` slice and `masked_fill` happy.
Very interested in this. I'm training two models at once and can only use batch sizes of less than 5 on my machine... So gradient accumulation would be great
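In case it helps, a minimal gradient-accumulation sketch; the model, data, and hyper-parameters below are placeholders:
```
import torch
from torch import nn

model = nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = nn.MSELoss()
accum_steps = 4                                   # effective batch = accum_steps * micro-batch size

data = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(16)]  # micro-batches of 4

optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = criterion(model(x), y) / accum_steps   # scale so summed grads average over the big batch
    loss.backward()                               # gradients accumulate in the .grad buffers
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```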
I did have a branch at some point in the past that cleaned up all of dlib's cmake stuff including CUDA. I can try to revive that at some point.
I could be naive here, but is there a reason why LayerNorm isn't using cuDNN? Will `cudnnNormalizationForwardInference`, `cudnnNormalizationForwardTraining` and `cudnnNormalizationBackward` work? It looks like those functions could be used...
Somewhere in the docs I read that it could be used for multiple types of normalization... I agree, it's hard to believe cuDNN doesn't have first-class support for it. Maybe...
In fact I have a 1D complex tensor disguised as a 2D tensor, and I want to extract the real and imaginary parts of both the even and odd samples...
Do a reshape first to go from `[B, 2]` to `[B//2, 2, 2]`, then slice appropriately, with some flattening if required, e.g. the sketch below.
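A small sketch of that slicing, assuming rows are consecutive samples and the columns are (real, imag):
```
import torch

B = 8
x = torch.randn(B, 2)          # rows = consecutive samples, columns = (real, imag)  [assumed layout]

y = x.reshape(B // 2, 2, 2)    # group (even, odd) sample pairs: [B, 2] -> [B//2, 2, 2]

even_real = y[:, 0, 0]         # real parts of even samples
even_imag = y[:, 0, 1]         # imaginary parts of even samples
odd_real  = y[:, 1, 0]         # real parts of odd samples
odd_imag  = y[:, 1, 1]         # imaginary parts of odd samples
```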
@nicolaspanel I'm stuck on this again. Can you help?