# Neural network in CUDA/C++
- Description
- Usage
## Description
This is an implementation of a neural net, completely from scratch, in CUDA/C++.
## Usage

The code is by no means efficient and is meant only as an introduction to CUDA.

Everything is implemented in both pure C++ (under `CPU/`) and CUDA/C++ (under `GPU/`). The syntax remains virtually identical, and there are only two points to bear in mind when switching between C++ and CUDA/C++:

- C++ and CUDA/C++ modules end with the suffixes `CPU` and `GPU` respectively.
- Don't forget to allocate and destroy CUDA arrays via `cudaMallocManaged` and `cudaFree` (see the sketch below).

Here is an overview of the various classes and functions:
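A minimal sketch of that allocation pattern (the buffer shape is illustrative):

```cpp
#include <cuda_runtime.h>

int main() {
    int bs = 64, n_in = 10;
    float *inp;

    // Unified memory is reachable from both host and device.
    cudaMallocManaged(&inp, bs * n_in * sizeof(float));

    // ... fill inp, run forward/backward passes ...

    // Every cudaMallocManaged needs a matching cudaFree.
    cudaFree(inp);
    return 0;
}
```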
- `linear.h`/`Linear_SUFFIX` (see the sketch below):
    - Initialization:
        - Required arguments:
            - `_bs` (`int`): Batch size.
            - `_n_in` (`int`): Number of input features.
            - `_n_out` (`int`): Number of output features.
        - Optional arguments:
            - `_lr` (`float`): Learning rate.
    - `forward`: Runs a linear forward pass. Required arguments:
        - `_inp` (`float*`): Pointer to the input data.
        - `_out` (`float*`): Pointer for storing the output data.
    - `update`: Updates the weights and biases.
    - `backward`: Performs a backward pass, storing the gradients in `_inp`.
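A rough usage sketch of `LinearGPU`; the constructor's argument order and the no-argument `update` call are assumptions inferred from the list above, not confirmed signatures:

```cpp
#include <cuda_runtime.h>
#include "linear.h"

int main() {
    int bs = 64, n_in = 10, n_out = 4;
    float *inp, *out;
    cudaMallocManaged(&inp, bs * n_in * sizeof(float));
    cudaMallocManaged(&out, bs * n_out * sizeof(float));

    // Assumed argument order: (_bs, _n_in, _n_out, _lr).
    LinearGPU lin(bs, n_in, n_out, 0.01f);

    lin.forward(inp, out); // out now holds the linear layer's output
    // ... a loss module's backward would propagate gradients here ...
    lin.update();          // gradient-descent step on the weights and biases

    cudaFree(inp);
    cudaFree(out);
    return 0;
}
```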
- `relu.h`/`ReLU_SUFFIX` (see the sketch below):
    - Initialization: Required argument:
        - `_sz_out` (`int`): The number of input/output elements.
    - `forward`, `backward`: Like `Linear_SUFFIX` but for ReLU.
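Correspondingly, a hypothetical `ReLUGPU` sketch, with the same caveats as above:

```cpp
#include <cuda_runtime.h>
#include "relu.h"

int main() {
    int sz = 64 * 4; // batch size times feature count
    float *inp, *out;
    cudaMallocManaged(&inp, sz * sizeof(float));
    cudaMallocManaged(&out, sz * sizeof(float));

    ReLUGPU relu(sz);       // _sz_out: number of input/output elements
    relu.forward(inp, out); // elementwise max(x, 0)

    cudaFree(inp);
    cudaFree(out);
    return 0;
}
```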
-
mse.h/MSE_SUFFIX:-
Initialization: Like ReLU.
-
forward: Dummy method for compatibility with the other modules and performing backpropagation; does not actually calculate the loss.Required arguments:
_inp(float*): Pointer to the predictions._out(float*): Pointer to the target values.
-
_forward: Calculates the MSE. This method is solely for calculating the loss and cannot be used during backpropagation.Required arguments: Like
MSE_SUFFIXbut_outmust have an extra element for storing the loss. -
backward: Performs a backward pass, storing the gradients in_inp.
-
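A hypothetical `MSEGPU` sketch; where exactly the loss lands within `_out`'s extra element is an assumption:

```cpp
#include <cuda_runtime.h>
#include "mse.h"

int main() {
    int sz = 64; // one prediction per sample
    float *pred, *targ;
    cudaMallocManaged(&pred, sz * sizeof(float));
    cudaMallocManaged(&targ, (sz + 1) * sizeof(float)); // extra slot for the loss

    MSEGPU mse(sz);

    mse.forward(pred, targ);  // bookkeeping only; no loss is computed
    mse._forward(pred, targ); // writes the MSE into targ's extra element (assumed)
    mse.backward();           // gradients are stored in pred

    cudaFree(pred);
    cudaFree(targ);
    return 0;
}
```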
- `sequential.h`/`Sequential_SUFFIX` (see the end-to-end sketch below):
    - Initialization: Required arguments:
        - `layers` (`std::vector<Module*>`): Layers to be chained together.
    - `forward`: Cascades the modules in `layers`. Required arguments:
        - `inp` (`float*`): Pointer to the input data.
        - `out` (`float*`): Dummy argument, present only for compatibility with the other forward methods; it does not get used. The output is accessible via the last layer's `out` attribute.
    - `update`: Updates every module in `layers`.
- `train_SUFFIX`: Trains a network with gradient descent. Required arguments:
    - `seq` (`Sequential_SUFFIX`): Sequential module to train.
    - `inp` (`float*`): Pointer to the input data.
    - `targ` (`float*`): Pointer to the target data.
    - `bs` (`int`): Batch size.
    - `n_in` (`int`): Number of input features.
    - `n_epochs` (`int`): Number of epochs.
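Putting it together, a hypothetical end-to-end sketch on the GPU side; the hidden width, the position of the loss slot, and how `trainGPU` receives `seq` are all assumptions rather than confirmed details:

```cpp
#include <vector>
#include <cuda_runtime.h>
#include "linear.h"
#include "relu.h"
#include "mse.h"
#include "sequential.h"
#include "train.h"

int main() {
    int bs = 64, n_in = 10, n_hidden = 16, n_epochs = 100;
    float *inp, *targ;
    cudaMallocManaged(&inp, bs * n_in * sizeof(float));
    cudaMallocManaged(&targ, (bs + 1) * sizeof(float)); // extra slot for the loss

    // ... fill inp and targ with training data ...

    LinearGPU lin1(bs, n_in, n_hidden);
    ReLUGPU relu(bs * n_hidden);
    LinearGPU lin2(bs, n_hidden, 1);
    MSEGPU mse(bs);
    SequentialGPU seq(std::vector<Module*>{&lin1, &relu, &lin2, &mse});

    trainGPU(seq, inp, targ, bs, n_in, n_epochs);

    cudaFree(inp);
    cudaFree(targ);
    return 0;
}
```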
For end-to-end training with speed benchmarks, please run `main.cpp` or `main.cu` for the CPU and GPU respectively.