Corey Lowman
This will likely require its own device function in the future, as I suspect it will be implemented similarly to conv_forward, where it's just looping over the inputs. Can this...
similar to pytorch's distributions module https://pytorch.org/docs/stable/distributions.html The main use for this is reinforcement learning, I think, for sampling from output distributions. A first pass should include something like a base `Distribution` trait...
Maybe add inference to each of the scripts as well, to make it clear that there are both forward & forward_mut?
They should have the following behavior: 1. `Module::forward()` does nothing 2. `Module::forward_mut()` calls dropout (i.e. if there is a tape, it will modify the input, otherwise it will do nothing)
This can be done with rayon. Parallelizing forward is relatively straightforward, as you can just `result.par_iter_mut().zip(...).for_each()` since filters & bias are ref borrowed. Parallelizing backward is less straightforward, but easiest...
This came up in #67 and #107, and is also related to #101. It'd be nice for tensors in modules to have names that match how they are accessed. E.g....
Should have tuple inputs and outputs, with hidden state.
Similar to how we can create a cpu tensor from a Vec, it'd be nice to be able to pass in a separately allocated CudaSlice. The new method of this...
Let's discuss how operator fusion might work in dfdx. I suspect it will require a lot of work. On the cuda side of things it will at least require JIT compiling...
With the addition of the `AMP` dtype, we also need to add gradient scaling, which is commonly used with AMP training. I think the frontend interface could look something like: ```rust...