dfdx
Deep learning in Rust, with shape checked tensors and neural networks
If you accidentally mistype the dimensions of layers in your `model`, the compiler reports an error at `model.forward_mut()` that gives no hint of the actual error...
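To illustrate where such an error comes from, here is a minimal sketch of encoding layer dimensions in types with const generics. The `Linear` struct and the `two_layer` function are hypothetical names made up for this example and are not dfdx's API; the sketch only assumes that layer sizes live in the type, as the excerpt implies.

```rust
/// Hypothetical dense layer (NOT dfdx's API): `I` inputs, `O` outputs,
/// with both sizes carried in the type.
pub struct Linear<const I: usize, const O: usize> {
    pub weight: [[f32; I]; O],
    pub bias: [f32; O],
}

impl<const I: usize, const O: usize> Linear<I, O> {
    /// Computes `weight * x + bias`.
    pub fn forward(&self, x: &[f32; I]) -> [f32; O] {
        let mut out = self.bias;
        for (o, row) in out.iter_mut().zip(self.weight.iter()) {
            *o += row.iter().zip(x.iter()).map(|(w, v)| w * v).sum::<f32>();
        }
        out
    }
}

/// Chains two layers. The shared parameter `H` forces the output size of
/// `a` to equal the input size of `b`.
pub fn two_layer<const I: usize, const H: usize, const O: usize>(
    a: &Linear<I, H>,
    b: &Linear<H, O>,
    x: &[f32; I],
) -> [f32; O] {
    b.forward(&a.forward(x))
}
```

In this sketch, passing a `Linear<2, 3>` followed by a `Linear<4, 1>` is rejected at the call to `two_layer` with a mismatched-types error, not at the line where the wrong dimension was typed. That is the kind of indirection the excerpt describes.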
PyTorch supports an optional parameter for 2D convolutions called `padding_mode`, which defaults to `zeros`, but can also be `reflect`, `replicate` or `circular`. This changes what values it uses for the...
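As a rough illustration of what the four modes mean, the sketch below pads a 1D slice. The `PaddingMode` enum and `pad_1d` function are hypothetical names for this example, not part of dfdx or PyTorch, and the code assumes a non-empty input with `pad < data.len()`.

```rust
/// Hypothetical enum mirroring the PyTorch `padding_mode` names.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PaddingMode {
    Zeros,
    Reflect,
    Replicate,
    Circular,
}

/// Pads `data` with `pad` elements on each side.
/// Assumes `data` is non-empty and `pad < data.len()`.
pub fn pad_1d(data: &[f32], pad: usize, mode: PaddingMode) -> Vec<f32> {
    let n = data.len() as isize;
    let p = pad as isize;
    (-p..n + p)
        .map(|i| {
            // Positions inside the original slice are copied unchanged.
            if (0..n).contains(&i) {
                return data[i as usize];
            }
            match mode {
                // Fill with zeros.
                PaddingMode::Zeros => 0.0,
                // Mirror around the edge element without repeating it.
                PaddingMode::Reflect => {
                    let j = if i < 0 { -i } else { 2 * (n - 1) - i };
                    data[j as usize]
                }
                // Repeat the nearest edge element.
                PaddingMode::Replicate => data[i.clamp(0, n - 1) as usize],
                // Wrap around to the other end.
                PaddingMode::Circular => data[i.rem_euclid(n) as usize],
            }
        })
        .collect()
}
```

A 2D convolution would apply the same index mapping independently along the height and width axes.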
As brought up in a Reddit thread, an OpenCL device would be useful to support folks with AMD GPUs. Here are roughly the tasks that need to happen: 1. [ ]...
We should consider whether it is possible and desired to automatically combine kernels into CUDA graphs to reduce overhead of calling individual kernels. Here is the relevant documentation: - https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs...
For data augmentation, it's very useful to have operations to mirror and rotate a tensor. See PyTorch's `flip` and `rot90` methods.
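For reference, the two operations can be sketched on a plain row-major grid. `flip_horizontal` and `rot90` below are hypothetical helpers on `Vec<Vec<T>>`, not dfdx tensor methods; the rotation is counter-clockwise, which is the direction of PyTorch's `rot90` with `k=1` over dims `[0, 1]`. Rows are assumed to have equal length.

```rust
/// Mirrors each row left to right (comparable to flipping along dim 1).
pub fn flip_horizontal<T: Clone>(grid: &[Vec<T>]) -> Vec<Vec<T>> {
    grid.iter()
        .map(|row| row.iter().rev().cloned().collect())
        .collect()
}

/// Rotates the grid 90 degrees counter-clockwise.
/// Assumes all rows have the same length.
pub fn rot90<T: Clone>(grid: &[Vec<T>]) -> Vec<Vec<T>> {
    let cols = grid.first().map_or(0, |r| r.len());
    // The last column becomes the first row, the first column the last row.
    (0..cols)
        .rev()
        .map(|c| grid.iter().map(|row| row[c].clone()).collect())
        .collect()
}
```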
Hi! I've been trying to port nanoGPT to Rust with dfdx. The `transformer` module is awesome, but it seems an important trick is missing, which is the attention mask in...
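Assuming the mask in question is the causal mask used by GPT-style decoders, the general technique looks like the sketch below: scores for future positions are set to negative infinity before the softmax, so they receive zero weight. `apply_causal_mask` and `softmax_row` are hypothetical helpers on plain vectors, not dfdx functions.

```rust
/// Masks a square matrix of attention scores in place so that
/// position `i` can only attend to positions `j <= i`.
pub fn apply_causal_mask(scores: &mut [Vec<f32>]) {
    for (i, row) in scores.iter_mut().enumerate() {
        for (j, s) in row.iter_mut().enumerate() {
            if j > i {
                *s = f32::NEG_INFINITY;
            }
        }
    }
}

/// Numerically stable softmax over one row.
/// Entries equal to negative infinity map to exactly 0.
pub fn softmax_row(row: &[f32]) -> Vec<f32> {
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

With all-equal scores, the first row of the masked attention weights puts all its mass on position 0, the second row splits evenly over positions 0 and 1, and so on.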
Language model progress has been rapid recently, and with the llama weights released, a lot of that progress is being made on the C++ side: https://github.com/ggerganov/llama.cpp I see that fp16 is...
For the purpose of effectively reducing the size of real-time models, finding inefficiencies in the library, and testing modifications to the library, there should be a type of "benchmarking"...
Currently, only 1d arrays/vecs are supported.
It would be really useful to have a learning rate scheduler trait to implement various schedules. The trait could be as general as something like: ```rust trait OptimScheduler { fn...
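Since the excerpt cuts off before the proposed trait body, the following is only an independent sketch of what a schedule abstraction could look like, not the `OptimScheduler` trait from the issue. The names `LrSchedule`, `lr_at`, `StepDecay`, and `CosineAnnealing` are hypothetical; `step_size` and `total_steps` are assumed to be non-zero.

```rust
/// Hypothetical trait: maps a training step to a learning rate.
pub trait LrSchedule {
    fn lr_at(&self, step: usize) -> f64;
}

/// Multiplies the base rate by `gamma` once every `step_size` steps.
pub struct StepDecay {
    pub base_lr: f64,
    pub gamma: f64,
    pub step_size: usize, // assumed non-zero
}

impl LrSchedule for StepDecay {
    fn lr_at(&self, step: usize) -> f64 {
        self.base_lr * self.gamma.powi((step / self.step_size) as i32)
    }
}

/// Anneals from `base_lr` down to `min_lr` along half a cosine wave,
/// then stays at `min_lr` once `total_steps` is reached.
pub struct CosineAnnealing {
    pub base_lr: f64,
    pub min_lr: f64,
    pub total_steps: usize, // assumed non-zero
}

impl LrSchedule for CosineAnnealing {
    fn lr_at(&self, step: usize) -> f64 {
        let t = step.min(self.total_steps) as f64 / self.total_steps as f64;
        let cosine = (std::f64::consts::PI * t).cos();
        self.min_lr + 0.5 * (self.base_lr - self.min_lr) * (1.0 + cosine)
    }
}
```

A training loop would call `lr_at(step)` before each optimizer update and write the result into the optimizer's learning rate; how that value is set depends on the optimizer's configuration and is left out here.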