denizyuret issues

Results: 64 issues of denizyuret

more efficient cuda kernel calls

Compiling our kernels into a shared library and calling them with `ccall` may not be the most efficient method. Check out packages under JuliaGPU (CUDANative.jl etc.) for more kernel launching....

speed

KnetArray

project

keyword args in broadcasted funcs not supported

This does not work:

```
function relu(x::T; max_value=Inf, negative_slope=0, threshold=0) where T
    (x >= max_value ? oftype(x, max_value) : x >= threshold ? x : negative_slope == 0 ? zero(T)...
```

cannot take derivative of a scalar function

See https://github.com/denizyuret/Knet.jl/issues/410

bug

1.0.0 Todo list

- [x] broadcast of user defined functions not supported: #101
- [x] Solve outstanding bugs and issues.
- [ ] Review and merge pull requests. #54 #57
- [ ]...

compat

Coding practices

@CarloLucibello I am responding to your comments here to make the discussion easier:

> can we avoid exporting 2 letters names to avoid conflicts and improve code readability?

I need...

Most bessel functions do not preserve input type

```
julia> x
0.22919536f0

julia> besseli(2,x)
0.006595105469050567
```

Missing pooling options, inconsistencies with CuArrays

There are two mean pooling operators in cudnn: one that includes and one that excludes the padded values. The current meanpool operation only supports including the padded values; we should...

enhancement

conv does not support group argument or channelmajor format

I did not find this in other issues; if it is not tracked elsewhere, we should add it to our todo list. For a description of grouped convolutions see: https://towardsdatascience.com/a-comprehensive-introduction-to-different-types-of-convolutions-in-deep-learning-669281e58215

enhancement

cannot initialize CudaArray with Int32 size

This took me a while to figure out today, while debugging the new CUSPARSE package:

```
ERROR: LoadError: MethodError: `convert` has no method matching convert(::Type{CUDArt.CudaArray{T,N}}, ::Type{Int32}, ::Tuple{Int32})
This may have...
```

triggering gc based on gpu memory

Is there any progress on making gc sensitive to remaining gpu memory? The following example still fails with an out-of-memory error. It works if you uncomment the manual gc() line....

enhancement