skn123
BTW Denis, clMagma is an OpenCL implementation of LAPACK/BLAS. If we do take Eigen, then we can port clMagma into that (along with ViennaCL) and hence into VexCL. I don't...
Denis, the setup was done using a lot of C++11 code :) That's why I was having a hard time understanding it, but I have a suspicion that it can be...
Why not take the FFT route? Handcraft those kernels for 1D, 2D, and 3D and make it work on the device for these three dimensions. For all other dimensions, it can...
If you are referring to the basic BA algorithm in the paper, then there are loops that can easily be "OpenMP-fied" even in the current implementation.
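To make the OpenMP remark concrete, here is a minimal sketch of the per-point accumulation loop. It assumes "the paper" is the Lee, Wolberg and Shin multilevel B-spline paper and that its BA algorithm has the usual delta/omega accumulation form. None of this is VexCL code: `Point`, `Lattice`, `bspline_basis`, `ba_fit` and `evaluate` are hypothetical names made up for the illustration, and the atomic updates are just one way of handling points whose 4x4 neighbourhoods overlap.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// Uniform cubic B-spline basis functions B_0..B_3 evaluated at t in [0, 1).
inline std::array<double, 4> bspline_basis(double t) {
    const double t2 = t * t, t3 = t2 * t;
    return {{(1 - 3 * t + 3 * t2 - t3) / 6.0,
             (3 * t3 - 6 * t2 + 4) / 6.0,
             (-3 * t3 + 3 * t2 + 3 * t + 1) / 6.0,
             t3 / 6.0}};
}

// Control lattice of (m + 3) x (n + 3) values, stored row by row.
struct Lattice {
    std::size_t nx, ny;
    std::vector<double> phi;
};

// Sketch of the BA fit; assumes every point satisfies 0 <= x < m, 0 <= y < n.
Lattice ba_fit(const std::vector<Point>& P, std::size_t m, std::size_t n) {
    Lattice L{m + 3, n + 3, std::vector<double>((m + 3) * (n + 3), 0.0)};
    std::vector<double> delta(L.phi.size(), 0.0), omega(L.phi.size(), 0.0);
    double* d = delta.data();
    double* o = omega.data();
    const std::ptrdiff_t np = static_cast<std::ptrdiff_t>(P.size());

    // Each point is handled independently; only the scatter into
    // delta/omega is shared, so those two updates are made atomic.
    #pragma omp parallel for
    for (std::ptrdiff_t c = 0; c < np; ++c) {
        const Point& p = P[c];
        const double fx = std::floor(p.x), fy = std::floor(p.y);
        const std::size_t i = static_cast<std::size_t>(fx);
        const std::size_t j = static_cast<std::size_t>(fy);
        const auto bs = bspline_basis(p.x - fx);
        const auto bt = bspline_basis(p.y - fy);

        double sum_w2 = 0.0;  // sum of squared weights over the 4x4 patch
        for (std::size_t k = 0; k < 4; ++k)
            for (std::size_t l = 0; l < 4; ++l)
                sum_w2 += (bs[k] * bt[l]) * (bs[k] * bt[l]);

        for (std::size_t k = 0; k < 4; ++k)
            for (std::size_t l = 0; l < 4; ++l) {
                const double w = bs[k] * bt[l];
                const double w2 = w * w;
                const double phi_c = w * p.z / sum_w2;
                const std::size_t idx = (i + k) * L.ny + (j + l);
                #pragma omp atomic
                d[idx] += w2 * phi_c;
                #pragma omp atomic
                o[idx] += w2;
            }
    }

    // Final pass over the lattice: independent per control point.
    for (std::size_t q = 0; q < L.phi.size(); ++q)
        L.phi[q] = (omega[q] != 0.0) ? delta[q] / omega[q] : 0.0;
    return L;
}

// Evaluate the fitted surface at (x, y) inside the same domain.
double evaluate(const Lattice& L, double x, double y) {
    const double fx = std::floor(x), fy = std::floor(y);
    const std::size_t i = static_cast<std::size_t>(fx);
    const std::size_t j = static_cast<std::size_t>(fy);
    const auto bs = bspline_basis(x - fx);
    const auto bt = bspline_basis(y - fy);
    double f = 0.0;
    for (std::size_t k = 0; k < 4; ++k)
        for (std::size_t l = 0; l < 4; ++l)
            f += bs[k] * bt[l] * L.phi[(i + k) * L.ny + (j + l)];
    return f;
}
```

Without OpenMP enabled the pragmas are ignored and the loop simply runs serially. With a single data point the fitted surface passes through that point exactly, which makes for a quick sanity check of the sketch.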
If I look at the BA algorithm in the paper, there are three main loops. If I understand correctly, the innermost loop can be parallelized using Boost.Compute (or even Bolt) as...
I have another question regarding mba_benchmark.cpp:
```
vex::multivector C(ctx, n);
vex::vector Z(ctx, n);
vex::copy(x, C(0));
vex::copy(y, C(1));
prof.tic_cl("interpolate");
for(size_t i = 0; i < m; ++i)
    Z = surf(C(0), C(1));
...
```
Yes, indeed that is the case for the loop "for each point (x_c, y_c, z_c) in P do", and that is what I meant by "data parallelism". If I...
Perfect, then the only thing that would be of interest in this implementation would be the aspect of data parallelization. To understand this concept, take a look at Fig 3...
So in mba_benchmark, do I have to call p.clear(), v.clear(), and so on explicitly?
Thanks for the terms :) I always wondered why we cannot use the GPU for coarse-grained parallelism. However, the parallelism I am hinting at can (??) be achieved by the...