James Bradbury

40 comments by James Bradbury

Hey, PyTorch and Chainer contributor here. I'm really excited about Knet and Julia deep learning in general, especially because Julia can solve many of the problems we keep running into...

Setting `cudnn.enabled` to `False` will use other GPU kernels; you have to remove the `.cuda()` calls if you want to run on the CPU. I think your best bet for the DyNet...
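
For reference, a minimal sketch of both options (the model and shapes below are placeholders, not from the original thread):

```python
import torch

# Option 1: keep running on the GPU, but without cuDNN kernels.
torch.backends.cudnn.enabled = False

# Option 2: run on the CPU by never moving anything to the GPU --
# i.e. drop the .cuda() calls (or call .cpu() on modules/tensors).
model = torch.nn.Linear(128, 64)   # placeholder model
x = torch.randn(32, 128)           # stays on the CPU: no .cuda()
y = model(x)
```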

Those PyTorch plumbing changes have mostly already happened, and 0.4 will have first-class scalars.
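
For context, "first-class scalars" here means zero-dimensional tensors that behave like plain numbers; a quick illustration, assuming PyTorch 0.4 or later:

```python
import torch

loss = torch.tensor(1.5)  # a zero-dimensional ("scalar") tensor
print(loss.dim())         # 0
print(loss.item())        # 1.5 -- extract the plain Python number
```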

This is related to support for explicit loops in the frontend. Supporting variable sizes (i.e. symbolic shapes in TVM) in general is probably a substantial change, but allowing explicit loops,...

An RNN kernel would look something like this:

```C
def elman_rnn(float(T,B,Ci) input, float(B,Co) h0,
              float(Ci,Co) i2h, float(Co,Co) h2h) -> (hidden) {
  for t in T {
    if t == 0 ...
```
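
To make the intent of that (truncated) kernel concrete, here is a rough NumPy sketch of the Elman recurrence it would express; the shapes follow the signature above, and the choice of tanh as the nonlinearity is an assumption:

```python
import numpy as np

def elman_rnn(inp, h0, i2h, h2h):
    """inp: (T, B, Ci), h0: (B, Co), i2h: (Ci, Co), h2h: (Co, Co)."""
    T = inp.shape[0]
    h = h0
    hidden = np.empty((T,) + h0.shape)
    for t in range(T):
        # h_t = tanh(x_t @ i2h + h_{t-1} @ h2h)
        h = np.tanh(inp[t] @ i2h + h @ h2h)
        hidden[t] = h
    return hidden
```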

Unfortunately, while MKL-DNN is OSS, it depends on the closed-source MKL (rather than using a generic BLAS interface). So it would be harder to integrate with than NNPACK, which (I...

I have a working, reasonably fast, but not very generic CUDA softmax in https://github.com/jekbradbury/Transformer.jl/blob/master/src/kernels.jl

I don't know if there's any particular reason Marian-NMT used dynamic shared memory for this rather than static. (Also, this kernel contains a reasonably fast `mapreducedim` implementation for reductions over...

Would the problem be solved if the naive softmax in https://github.com/JuliaGPU/CuArrays.jl/issues/45 were provided in NNlib for `AbstractArray`s rather than a CPU-specialized multithreaded implementation? That could then be overridden for `::GPUArray`...
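
For reference, the "naive softmax" being discussed is the standard max-shifted formulation; a minimal NumPy sketch of the algorithm (an NNlib version would instead be written generically over `AbstractArray`s in Julia):

```python
import numpy as np

def naive_softmax(x, axis=-1):
    # Subtract the max along the reduced axis for numerical stability,
    # then exponentiate and normalize.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)
```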

I don't remember exactly, but one of the three sizes definitely refers to the size of the dimension being reduced over and the two others might be the size of...