
Dense matrix support

Open ddemidov opened this issue 11 years ago • 18 comments

Providing dense matrix support in VexCL would probably be useful for many people, but it is currently unclear what the best approach is.

Options are:

  1. Provide vex::matrix<> and make vex::vector<> a special case of a dense matrix whose first dimension equals one. This would not work very well, because some operations (like operator*()) would have a different default meaning from the one currently used.
  2. Make vex::matrix<> a completely unrelated type with its own grammar (set of operators). This would be similar to the distinction that Eigen draws between matrices and arrays. It looks more promising and should not break current functionality.

ddemidov avatar Apr 17 '13 06:04 ddemidov

Have you had a look at the Eigen3 template library?

http://eigen.tuxfamily.org/index.php?title=Main_Page

I believe they decided to go with option 1.

d-meiser avatar Apr 22 '13 13:04 d-meiser

Hello Dominic,

I think that Eigen actually chose option 3: their matrices are convertible to arrays at compile time. operator*() then has its linear-algebra meaning for matrices and its elementwise meaning for arrays.

I actually like this option more than the two above.

ddemidov avatar Apr 22 '13 14:04 ddemidov

Ah, yes. You're right. That way you get the best of both worlds.

d-meiser avatar Apr 22 '13 14:04 d-meiser

Eigen's arrays only have 2 dimensions, though. I'd copy boost::multi_array, which allows an arbitrary number of dimensions, and derive vector (dims=1) and matrix (dims=2, with an overloaded operator*) from it. That way, if you want 2D arrays with element-wise *, you just use the multi_array type with dims=2 directly, or down-cast a matrix.

neapel avatar Apr 22 '13 15:04 neapel

Hi Pascal,

What operations would you support for multidimensional arrays? Are there any that make sense for multidimensional matrices besides elementwise arithmetic, element access, and possibly slicing (although I am not sure general slicing is feasible with multi-GPU support)?

Right now I am thinking of deriving vex::matrix from vex::vector and then overloading some of the default operations, such as multiplication. This way elementwise operations would be supported directly and transparently.

ddemidov avatar Apr 22 '13 15:04 ddemidov

Ah, the number of dimensions is a compile-time property of boost::multi_array. That makes things easier.

ddemidov avatar Apr 22 '13 15:04 ddemidov

Tensor products could generate multi-dimensional arrays: the tensor product of a d1-dimensional array with a d2-dimensional array is a (d1+d2)-dimensional array. Contractions reduce the dimensionality of an array. Reshaping could also be supported.

d-meiser avatar Apr 22 '13 15:04 d-meiser

I think most of these would have to be implemented as standalone functions.

ddemidov avatar Apr 22 '13 15:04 ddemidov

I think vex::multi_array should be single-device only. This way we could easily introduce slicing operations, transpositions, etc.

What do you think of the following interface:

std::vector<size_t> dims = {100, 200, 300};

// Construction:
vex::multi_array<double> A(ctx, dims);
vex::multi_array<double> B(ctx, {100, 200});

// Slicing should return multi_array (proxy) of reduced dimensions. No data
// copies should be made; slicing should only affect how the data is indexed
// inside compute kernels:
vex::multi_array::slice s(_, _, 42);
B = A[s];

B = A[slice(_, _, 42)] + A[slice(_, _, 100)];

// or maybe
B = A.slice(_, _, 42);

// An analogue of Matlab's (start:stride:end) syntax:
slice s(range(0, 2, 100), _, 1);

// Linear algebra operations, like matrix-matrix or matrix-vector products
// should be supported. Element-wise operations could be achieved by converting
// multi_array to vex::vector:
B.vec() += sin( A[slice(_,_,1)].vec() );

Edit: This is just speculation for now, because I don't currently have time to implement it (maybe later this summer).

ddemidov avatar May 17 '13 04:05 ddemidov

Maybe using

vex::multi_array<T, dim-1> vex::multi_array<T,dim>::operator[](size_t)
vex::multi_array<T, dim> vex::multi_array<T,dim>::operator[](_)

like boost::multi_array does, would make the code more compact:

vex::multi_array<double,3> A(ctx, {200, 200, 200});
B = A/*i=0, d=3*/[_]/*i=1, d=3*/[_]/*i=2, d=3*/[42]/*d=2*/;
B = A[_][_][42] + A[100]/*d=2*/;
B = A[_][100][_];
B[10] = A[1][2];
double b = B[10][20];

neapel avatar May 17 '13 12:05 neapel

Yes, I like this better.

ddemidov avatar May 17 '13 13:05 ddemidov

I think the Repa library in Haskell also has lots of good ideas about doing multi-dimensional arrays in a type-safe manner:

http://hackage.haskell.org/package/repa

http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial

The overall idea is rather similar to boost::multi_array: arrays are represented by a flat chunk of memory that holds the data plus a type that represents the shape of the array. There are functions that just manipulate the shape, i.e. pure type manipulations. What they are really strong at is loop-fusion optimization.

I'd love to contribute to this but I'm pretty swamped at work at the moment. At a minimum I could write tests though.

d-meiser avatar May 17 '13 14:05 d-meiser

Here is an example of a dense matrix-matrix product implementation:

vex::vector<double> A(ctx, n1 * n2);
vex::vector<double> B(ctx, n2 * n3);
vex::vector<double> C(ctx, n1 * n3);

// C = A * B
C = vex::reduce<vex::SUM>(
        vex::extents[n1][n2][n3],
        vex::reshape(A, vex::extents[n1][n2][n3], vex::extents[0][1])
        *
        vex::reshape(B, vex::extents[n1][n2][n3], vex::extents[1][2]),
        1
        );

This is of course equivalent to the naive algorithm; no tiling or shared-memory tricks are involved.

ddemidov avatar Nov 21 '13 12:11 ddemidov

BTW Denis, clMAGMA is an OpenCL implementation of LAPACK/BLAS. If we do take Eigen, then we could port clMAGMA into that (along with ViennaCL) and hence into VexCL. I don't think it would be prudent to come up with yet another dense matrix interface and reinvent the wheel.

skn123 avatar May 04 '14 18:05 skn123

I still don't understand what you mean by "if we do take Eigen". VexCL is not built on top of Eigen, and VexCL cannot interface with Eigen (except for copying data back and forth through raw pointers).

ddemidov avatar May 04 '14 18:05 ddemidov

Another step toward better dense matrix support is 3e452b7. It implements a tensordot() operation, borrowed from Python's numpy.tensordot. The dense matrix-matrix product implemented above with the reshape/reduce combination becomes

vex::slicer<2> Adim(vex::extents[n1][n2]);
vex::slicer<2> Bdim(vex::extents[n2][n3]);
C = vex::tensordot(Adim[_](A), Bdim[_](B), vex::axes_pairs(1, 0));

which is about 30 times faster on my K40c (a performance test is given here).

ddemidov avatar Mar 02 '15 19:03 ddemidov

By the way, for your reference, there has lately been a lot of development in Eigen on multi-dimensional tensors: https://bitbucket.org/eigen/eigen/src/373864b2f85df38592d4d0a5be1d4ef3064d5554/unsupported/Eigen/CXX11/src/Tensor/?at=default

ilyapopov avatar May 11 '16 15:05 ilyapopov