Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Following the refactoring of the autograd in #333, we got a huge perf and memory usage improvement by not creating a graph at all when in inference mode: 21% on example...
After a long time, I decided to finally learn Arraymancer, and I was really pleased to see that the library has grown much more than I expected! Kudos to all...
Due to the following template, variables cannot be dereferenced. https://github.com/mratsim/Arraymancer/blob/fc4ad528f6afcd377c16ff99f19151a0f5e46f89/src/autograd/ag_accessors.nim#L18-L28 This forces the use of a cast instead of `addr v[]` https://github.com/mratsim/Arraymancer/blob/fc4ad528f6afcd377c16ff99f19151a0f5e46f89/src/autograd/ag_data_structure.nim#L116-L120 The solution is probably easy: dispatch to system.`[]` when varargs =...
Autograd, nn_primitives and nn are in master. To bring them to the same standard as the core tensor library, they need tests: - unit tests for individual pieces (like derivative...
Reference: - https://rufflewind.com/2016-12-30/reverse-mode-automatic-differentiation ![2018-12-15_23-39-12](https://user-images.githubusercontent.com/22738317/50048047-a5649a80-00c2-11e9-8b70-375a3199906e.png) - https://rufflewind.com/2016-12-30/reverse-mode-automatic-differentiation and https://github.com/Rufflewind/revad/blob/de509269fe878bc9d564775abc25c4fa663d8a5e/src/chain.rs ```Rust /// Maintains a partial tape using the count-trailing-zeros (CTZ) eviction /// strategy. This results in space usage that is logarithmic in...
In `ex01_bench`, slicing is quite inefficient: doing a ReLU activation is actually cheaper. ![2018-12-15_17-15-14](https://user-images.githubusercontent.com/22738317/50045053-14bf9780-008d-11e9-92f6-9bbcd5aa7297.png) Focusing on the call tree, the slowness seems to be in the implementation itself: ![2018-12-15_17-16-11](https://user-images.githubusercontent.com/22738317/50045063-2bfe8500-008d-11e9-9995-9300f692e3f1.png) ## Assembly...
I hope I am not bothering you too much by asking questions in the issue tracker; if so, let me know. This time I am trying to port operations from...
The following is not parallel https://github.com/mratsim/Arraymancer/blob/a4b79c86c184caceb28e1ed506012babd03f8cd4/src/tensor/operators_blas_l1.nim#L31-L36 Pending [Laser](https://github.com/numforge/laser) and https://github.com/numforge/laser/pull/4
Seems like there is a new optimiser meant to supplant Adam, called AMSGrad, introduced at ICLR 2018. Paper: https://openreview.net/pdf?id=ryQu7f-RZ Also Padam from "Closing the Generalization Gap of Adaptive Gradient Methods...
See https://github.com/mratsim/Arraymancer/pull/304#issuecomment-425751045 When slicing a tensor returns a scalar, we lose track of the autograd graph: https://github.com/mratsim/Arraymancer/blob/0d31645fbce3dfdc5ffc8575402c10726747e40c/src/autograd/gates_shapeshifting_views.nim#L41-L46 This might require #87 or using an object variant in Variable to store either...