MiniDNN
new compiler flag for storage order
Hi! This is a bit along the same lines as the last PR, but for storage order. I noticed there were a lot of Matrix and Vector typedefs scattered around, so I pulled them all together into Config.h, and then decided it would be nice to be able to choose the storage order, i.e. row- or column-major, at compile time.
So there's a new compiler flag, MDNN_ROWMAJOR: if it is set to 1, matrices will be row-major.
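To give an idea of what this boils down to, the Config.h typedefs end up looking roughly like the sketch below (simplified; the Scalar typedef and include-guard names here are illustrative, not the exact header contents):
// Sketch of the idea behind Config.h: collect the Matrix/Vector typedefs
// in one place and let MDNN_ROWMAJOR pick the storage order.
#ifndef MINIDNN_CONFIG_H_
#define MINIDNN_CONFIG_H_
#include <Eigen/Core>
namespace MiniDNN
{
typedef double Scalar;
#if defined(MDNN_ROWMAJOR) && MDNN_ROWMAJOR
// -DMDNN_ROWMAJOR=1: matrices are stored row by row
typedef Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> Matrix;
#else
// default: Eigen's usual column-major layout
typedef Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor> Matrix;
#endif
// A one-column vector has no meaningful storage order, so it is left alone.
typedef Eigen::Matrix<Scalar, Eigen::Dynamic, 1> Vector;
} // namespace MiniDNN
#endif // MINIDNN_CONFIG_H_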
Test: (I'll attach the mostly unchanged example.cpp for convenience)
> g++ -I ./include/ example.cpp -DMDNN_ROWMAJOR=1
> ./a.out
IsRowMajor?: 1
[Epoch 0, batch 0] Loss = 0.328066
[Epoch 1, batch 0] Loss = 0.327707
[Epoch 2, batch 0] Loss = 0.327475
[Epoch 3, batch 0] Loss = 0.327273
[Epoch 4, batch 0] Loss = 0.327095
[Epoch 5, batch 0] Loss = 0.32692
[Epoch 6, batch 0] Loss = 0.326753
[Epoch 7, batch 0] Loss = 0.326593
[Epoch 8, batch 0] Loss = 0.326437
[Epoch 9, batch 0] Loss = 0.326274
> g++ -I ./include/ example.cpp
> ./a.out
IsRowMajor?: 0
[Epoch 0, batch 0] Loss = 0.32792
[Epoch 1, batch 0] Loss = 0.326679
[Epoch 2, batch 0] Loss = 0.325873
[Epoch 3, batch 0] Loss = 0.325187
[Epoch 4, batch 0] Loss = 0.324576
[Epoch 5, batch 0] Loss = 0.324013
[Epoch 6, batch 0] Loss = 0.323497
[Epoch 7, batch 0] Loss = 0.323033
[Epoch 8, batch 0] Loss = 0.322599
[Epoch 9, batch 0] Loss = 0.322178
example.cpp - the only changes are:
- it uses the new MiniDNN::Matrix typedef
- it prints whether a matrix is row-major
#include <MiniDNN.h>
#include <cstdlib>   // std::srand
#include <iostream>  // std::cout
using namespace MiniDNN;
int main()
{
// Set random seed and generate some data
std::srand(123);
// Predictors -- each column is an observation
Matrix x = Matrix::Random(400, 100);
// Response variables -- each column is an observation
Matrix y = Matrix::Random(2, 100);
std::cout << "IsRowMajor?: " << y.IsRowMajor << std::endl;
// Construct a network object
Network net;
// Create three layers
// Layer 1 -- convolutional, input size 20x20x1, 3 output channels, filter size 5x5
Layer* layer1 = new Convolutional<ReLU>(20, 20, 1, 3, 5, 5);
// Layer 2 -- max pooling, input size 16x16x3, pooling window size 3x3
Layer* layer2 = new MaxPooling<ReLU>(16, 16, 3, 3, 3);
// Layer 3 -- fully connected, input size 5x5x3, output size 2
Layer* layer3 = new FullyConnected<Identity>(5 * 5 * 3, 2);
// Add layers to the network object
net.add_layer(layer1);
net.add_layer(layer2);
net.add_layer(layer3);
// Set output layer
net.set_output(new RegressionMSE());
// Create optimizer object
RMSProp opt;
opt.m_lrate = 0.001;
// (Optional) set callback function object
VerboseCallback callback;
net.set_callback(callback);
// Initialize parameters with N(0, 0.01^2) using random seed 123
net.init(0, 0.01, 123);
// Fit the model with a batch size of 100, running 10 epochs with random seed 123
net.fit(opt, x, y, 100, 10, 123);
// Obtain prediction -- each column is an observation
Matrix pred = net.predict(x);
// Layer objects will be freed by the network object,
// so do not manually delete them
return 0;
}
Sorry, I am struggling with this one. Could you help out? The convolutions do memory copies...
I thought this would be a good idea for quicker inference, or perhaps for training without transposing the dataset, but it seems like a lot of work...
Yeah, my feeling is that transposing the data set is a much cheaper operation than the training and prediction process. Some of the internal operators are indeed hard-coded around column-major storage, so it would take a lot of work to refactor everything inside...
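For example, if the raw data arrives with one observation per row, converting it up front to the column-per-observation layout that fit() and predict() expect is a one-liner with Eigen (just a sketch, with made-up names):
#include <Eigen/Core>
// The input holds one observation per row; MiniDNN wants one per column.
// Assigning the transpose expression to a new matrix performs the copy once.
Eigen::MatrixXd observations_to_columns(const Eigen::MatrixXd& row_wise)
{
    return row_wise.transpose();
}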
Thanks for the feedback. I think you are right.