skn123 comments

Results: 41 comments of skn123

Dense matrix support

BTW Denis, clMagma is an OpenCL implementation of LAPACK/BLAS. If we do take Eigen, then we can port clMagma into that (along with ViennaCL) and hence into VexCL. I don't...

Run "mba" in OpenCL

Denis, the setup was done using a lot of C++11 code :) That's why I was having a hard time understanding it, but I have a suspicion that it can be...

Run "mba" in OpenCL

Why not take the FFT route? Handcraft those kernels for 1D, 2D and 3D and make it work on the device for these three dimensions. For all other dimensions, it can...

Run "mba" in OpenCL

If you are referring to the basic BA algorithm in the paper, then there are loops that can easily be "OpenMP-fied" even in the current implementation.

Run "mba" in OpenCL

If I look at the BA algorithm in the paper, there are three main loops. If I understand correctly, the innermost loop can be parallelized using Boost.Compute (or even Bolt) as...

I have another question regarding mba_benchmark.cpp: vex::multivector C(ctx, n); vex::vector Z(ctx, n); vex::copy(x, C(0)); vex::copy(y, C(1)); prof.tic_cl("interpolate"); for(size_t i = 0; i < m; ++i) Z = surf(C(0), C(1));...

Run "mba" in OpenCL

Yes, indeed that is the case of (for each point (x_c, y_c, z_c) in P do) and that is what I meant by "data parallelism". If I...
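To make the "for each point (x_c, y_c, z_c) in P" loop concrete, here is a minimal sketch of one BA pass with the per-point loop marked as data parallel. It assumes the paper in question is the multilevel B-spline one, that points lie in [0, m) x [0, n), and that the lattice has (m + 3) x (n + 3) control points. The names `Point`, `bspline_basis` and `ba_lattice` are hypothetical and are not taken from the mba code, and OpenMP is used only as a stand-in for whatever backend (OpenCL, Boost.Compute, Bolt) would actually be chosen.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type: location (x, y) and value z.
struct Point { double x, y, z; };

// Uniform cubic B-spline basis functions B_0..B_3 evaluated at t in [0, 1).
inline std::array<double, 4> bspline_basis(double t) {
    const double t2 = t * t, t3 = t2 * t;
    return {
        (1.0 - t) * (1.0 - t) * (1.0 - t) / 6.0,
        (3.0 * t3 - 6.0 * t2 + 4.0) / 6.0,
        (-3.0 * t3 + 3.0 * t2 + 3.0 * t + 1.0) / 6.0,
        t3 / 6.0
    };
}

// Hypothetical helper: one BA pass, returning the control lattice in
// row-major order with (m + 3) rows and (n + 3) columns.
std::vector<double> ba_lattice(const std::vector<Point>& pts, int m, int n) {
    const int nx = m + 3, ny = n + 3;
    const std::size_t size = static_cast<std::size_t>(nx) * ny;
    std::vector<double> delta(size, 0.0), omega(size, 0.0), phi(size, 0.0);

    // Data-parallel loop over the points. Neighbouring points touch the
    // same control points, so the scatter-adds are made atomic.
    #pragma omp parallel for
    for (long c = 0; c < static_cast<long>(pts.size()); ++c) {
        const Point& p = pts[c];
        assert(p.x >= 0 && p.x < m && p.y >= 0 && p.y < n);
        const int i = static_cast<int>(std::floor(p.x));
        const int j = static_cast<int>(std::floor(p.y));
        const auto bs = bspline_basis(p.x - i);
        const auto bt = bspline_basis(p.y - j);

        // Sum of squared weights over the 4 x 4 neighbourhood.
        double sum_w2 = 0.0;
        for (int k = 0; k < 4; ++k)
            for (int l = 0; l < 4; ++l)
                sum_w2 += (bs[k] * bt[l]) * (bs[k] * bt[l]);

        for (int k = 0; k < 4; ++k) {
            for (int l = 0; l < 4; ++l) {
                const double w = bs[k] * bt[l];
                const double phi_kl = w * p.z / sum_w2;
                const std::size_t idx = static_cast<std::size_t>(i + k) * ny + (j + l);
                #pragma omp atomic
                delta[idx] += w * w * phi_kl;
                #pragma omp atomic
                omega[idx] += w * w;
            }
        }
    }

    // Normalization over the lattice: each entry is independent.
    #pragma omp parallel for
    for (long idx = 0; idx < static_cast<long>(size); ++idx)
        phi[idx] = (omega[idx] != 0.0) ? delta[idx] / omega[idx] : 0.0;

    return phi;
}
```

Without OpenMP enabled the pragmas are ignored and the code runs serially. The atomic scatter-add is the simplest way to keep the per-point loop correct; on a GPU one would more likely sort or bin the points per lattice cell to avoid contention.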

Run "mba" in OpenCL

Perfect, then the only thing that would be of interest in this implementation would be the aspect of data parallelization. To understand this concept, take a look at Fig 3...

Run "mba" in OpenCL

So in the mba_benchmark do I explicitly have to state p.clear(), v.clear() and so on...?

Run "mba" in OpenCL

Thanks for the terms :) I always wondered why we cannot use the GPU for coarse-grained parallelism. However, the parallelism I am hinting at can (??) be achieved by the...