DiffNum
A lightweight and flexible C++ differentiable programming library. Just replace float and double with it, and it does autograd for you.
A lightweight, header-only C++ library for differentiable programming. Unlike the popular TensorFlow and Torch, DiffNum is implemented simply with forward-mode differentiation using the chain rule, instead of a computation graph, source-code transformation or other high-level autograd algorithms. Thus it takes little effort to implement and apply.
Features
Advantages
- Extremely easy to use and flexible. Just replace float and double with dfloat and ddouble, and specify the independent variables via setVar(). Then it does autograd for you. The gradients can be accessed at any stage of the computation (see the sketch after this list). Indexing is also flexible, because differentiation is applied to single variables rather than to vectors or tensors.
- Saves memory in long iterations. It does not record a computation graph (unlike the tape-based approach of Torch), so it is efficient when the computation involves a large number of iterations, especially self-accumulating iterations.
- Secondary and higher-order derivatives supported. Higher-order derivatives are obtained via the recursive template definition of DiffVar, which may be written like DiffVar<DiffVar<double, 0>, 0>. This means autograding the gradients themselves, i.e. computing secondary derivatives or Hessian matrices.
- CUDA supported. DiffNum can even be used in CUDA kernel functions. Just replace float and double with dfloat_cuda and ddouble_cuda. This data structure can be used seamlessly in CUDA functions as local variables, parameters or for other purposes. (Please note that the autograd loop over a single variable's gradient is still sequential in CUDA code, not parallel. In large-scale computations, the parallelism should be given to higher-level operations such as matrix operations.)
- (Extra and in progress) Independent from DiffNum, we offer the template classes Vec, Matrix and Tensor. They might be kind of useful.
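To illustrate the first point, here is a minimal sketch (using the dfloat, setVar() and stream-output API described in the Usage section below) showing that the value and gradient are available at every intermediate stage of a computation:

#include <DiffNum.h>
#include <iostream>
using namespace DiffNum;

int main() {
    // Two independent variables: x (index 0) and y (index 1).
    dfloat<2> x = 1.5f, y = 2.0f;
    x.setVar(0); y.setVar(1);

    dfloat<2> u = x * y;
    std::cout << u << std::endl;   // prints 3(2, 1.5): value of u, then du/dx and du/dy

    dfloat<2> v = u * u + x;       // keep computing; no tape or graph is recorded
    std::cout << v << std::endl;
    return 0;
}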
Disadvantages
- The time complexity is many times that of reverse-mode differentiation (back-propagation) when there is a large number of independent variables, since the cost of forward-mode autograd grows with the number of variables. Thus it can be extremely inefficient when there are many variables!
Usage
- Using the differentiable variable

We offer the differentiable variable template class DiffVar<d_type, size>, where d_type can be float, double, or even another differentiable variable type. The template parameter size is the number of independent variables, i.e. the length of the gradient vector. If size is zero, the number of independent variables is left open and can be changed dynamically.

Assigning values to DiffVar variables is just like using float or double variables: they can be assigned simply with operator=, and real values can be assigned to DiffVar variables directly.

For short, you can use dfloat<size> and ddouble<size>. They are the same as DiffVar<float, size> and DiffVar<double, size>. Here is an example:

#include <DiffNum.h>
#include <iostream>

using namespace DiffNum;

// Let's define a function f(x, y, z) = 2*x^2 - 3*y*z + 1 and apply DiffVar to autograd it.
dfloat<3> f(float _x, float _y, float _z) {
    // We have 3 independent variables to study: x, y, z.
    dfloat<3> x = _x, y = _y, z = _z;

    // The independent variables must be specified, otherwise they will be treated as constants.
    // Here, let x be the 1st, y the 2nd and z the 3rd; their indices are 0, 1 and 2 respectively.
    x.setVar(0); y.setVar(1); z.setVar(2);

    // Then use them like floats.
    return 2 * x * x - 3 * y * z + 1.0;
}

int main(void) {
    dfloat<3> u = f(3.7, 4.0, 4.3);

    // DiffNum can be written directly to an ostream.
    std::cout << u << std::endl;
    // Output: -23.22(14.8, -12.9, -12)
    // The first number is the value of u; the following vector is the gradient with respect to (x, y, z).

    return 0;
}
- Access the value and gradients

To access the value, use .getValue(). To access the gradient, use operator[]. With higher-order derivatives, use multiple [] to get the derivatives, for example a[1] or b[1][2].
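For instance, continuing the f(x, y, z) example above (a minimal sketch; the results are left to auto since their exact types depend on the DiffVar instantiation):

dfloat<3> u = f(3.7, 4.0, 4.3);

auto value = u.getValue();   // value of f at (3.7, 4.0, 4.3), i.e. -23.22
auto du_dx = u[0];           // derivative with respect to x (index 0), i.e. 14.8
auto du_dz = u[2];           // derivative with respect to z (index 2), i.e. -12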
- Higher-order derivatives

Use DiffVar recursively, for example DiffVar<DiffVar<double, 3>, 3>. Currently, the gradient size must be the same at every level of the recursion.
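A minimal sketch of the nested form (assuming, as in Example 3 below, that setVar() and operator[] work on the nested type the same way as on the flat one):

// Each gradient entry of the outer DiffVar is itself a DiffVar,
// so applying operator[] twice yields second derivatives.
DiffVar<DiffVar<double, 2>, 2> x = 2., y = 3.;
x.setVar(0); y.setVar(1);

auto g = x * x * y;        // g = x^2 * y
auto d2g_dxdy = g[0][1];   // d/dy(dg/dx) = 2*x = 4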
- Primary mathematical functions

For DiffVar, we provide mathematical functions that perform autograd. They live in the template class Math<T>, where T is the type of the DiffVar you are using.
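A minimal sketch (only functions that also appear in the Short Examples below, namely Sin, Max and Log, are used here; the full set of available functions is defined in Math<T>):

using dmath = Math<ddouble<2>>;

ddouble<2> a = 0.5, b = 1.2;
a.setVar(0); b.setVar(1);

// The chain rule is applied automatically through each call.
auto d = dmath::Log(dmath::Max(dmath::Sin(a), b));
std::cout << d << std::endl;   // value and gradient with respect to (a, b)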
- CUDA-supported DiffNum

To make DiffNum available in CUDA programs, which many scientific computing tasks require, we offer a CUDA version. Use DiffVar_cuda and Math_cuda in both host and device code. DiffVar_cuda can be passed as a parameter of __global__ functions.
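A rough sketch of the idea. The kernel below and its signature are hypothetical, and it assumes the CUDA types mirror the host-side API (template size parameter, setVar(), getValue(), operator[]); check the headers for the exact interface:

#include <DiffNum_cuda.h>
using namespace DiffNum;   // assumed namespace, as in the host-side headers

// Each thread differentiates its own scalar expression; the per-variable
// autograd loop stays sequential, parallelism is across array elements.
__global__ void squarePlus3x(const float* in, float* val, float* grad, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    dfloat_cuda<1> x = in[i];
    x.setVar(0);                 // one independent variable per element

    dfloat_cuda<1> y = x * x + 3.f * x;
    val[i] = y.getValue();
    grad[i] = y[0];              // dy/dx = 2*x + 3
}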
- Directly memcpy DiffVar or DiffVar_cuda arrays?

Except for DiffVar<d_type, 0>, which is dynamically sized, all other DiffVar arrays can be copied directly.
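For example (a sketch; a plain memcpy is safe here presumably because the fixed-size DiffVar holds no dynamically allocated storage):

#include <cstring>

DiffVar<float, 3> src[16], dst[16];
// ... fill src with values and gradients ...
std::memcpy(dst, src, sizeof(src));   // fine for fixed-size DiffVar
// Do NOT do this with DiffVar<d_type, 0>, whose gradient size is dynamic.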
Short Examples
Example 1. a, b are independent variables. c = a+b; d = log(max(sin(a/c), b))
using dmath = Math<ddouble<0>>;
// Example 1. a, b are variables. c = a+b; d = log(max(sin(a/c), b))
ddouble<0> a = 2., b = 3.;
// 2 total variables, a is the first, b is the second
a.setVar(2, 0); b.setVar(2, 1);
auto c = a + b;
auto d = dmath::Log(dmath::Max(dmath::Sin(a / c), b));
std::cout << d << std::endl;
Example 2. Vec v1, v2; v1[2] is the variable; q = v1 dot v2. We also offer dense Vec and Mat. Since DiffVar behaves so much like float and double, it can easily be adopted into any advanced numerical structure.
// Example 2. Vec v1 v2. v1[2] is the variable. q = v1 dot v2.
Vec<ddouble<0>, 3> v1, v2;
v1[0] = 8.7;
v1[1] = 4.3;
v1[2] = 7.;
v2[0] = -6.7;
v2[1] = 4.1;
v2[2] = 2.3;
// Set v1[2] as the only variable.
v1[2].setVar(1, 0);
auto q = Vec<ddouble<0>, 3>::dot(v1, v2);
std::cout << q << std::endl;
std::cout << std::endl;
Example 3. Evaluating secondary derivatives.
// Example 3. Evaluating secondary derivatives.
using ddmath = Math<dddouble<2>>;
dddouble<2> x = 2., y = 3.;
x.setVar(0); y.setVar(1);
std::cout << "x := 2, y := 3" << std::endl;
std::cout << "x^3 + 2*y^2 = ";
std::cout << ddmath::Pow(x, 3u) + 2. * ddmath::Pow(y, 2u) << std::endl;
std::cout << "x + x^3*y + x*y + 2*y = ";
std::cout << x + ddmath::Pow(x, 3u) * y + x * y + 2. * y << std::endl;
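To pull individual derivatives out of such a result, apply operator[] twice, as described in the Usage section (a sketch continuing the code above):

auto g = x + ddmath::Pow(x, 3u) * y + x * y + 2. * y;
std::cout << g[0] << std::endl;      // dg/dx together with its own gradient (the second derivatives)
std::cout << g[0][1] << std::endl;   // d2g/(dx dy) = 3*x^2 + 1 = 13 at x = 2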
Install & Build
This is a header-only library. Just clone this repository and include the headers in your code.
#include <DiffNum.h>
And for CUDA applications
#include <DiffNum_cuda.h>
By the way
Thanks to this project, I learned CUDA...