Delchini Marco
@rhornung67 thanks for your email. In the example you provided, the function is defined in the RAJA space:

```
RAJA_INLINE RAJA_DEVICE Real_type trap_int_func(Real_type x, Real_type y, Real_type xp, Real_type yp)...
```
In the above example, the macros `RAJA_DEVICE` and `RAJA_INLINE` precede the definition of the C++ function. If I were to run the same C++ function on a CPU machine, I...
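For reference, here is a minimal sketch of the pattern I am asking about, assuming RAJA's `RAJA_HOST_DEVICE` macro is the intended way to get both a host and a device version of the same function (the function name and body below are placeholders, not code from this issue):

```cpp
#include "RAJA/RAJA.hpp"
#include <cstdio>

// RAJA_HOST_DEVICE expands to __host__ __device__ in a CUDA build and to
// nothing in a CPU-only build, so the same definition works in both cases.
RAJA_INLINE RAJA_HOST_DEVICE
double my_func(double x, double y)
{
  return x * y + x;
}

int main()
{
  // Host call.
  std::printf("host: %f\n", my_func(1.0, 2.0));

#if defined(RAJA_ENABLE_CUDA)
  // Device call from inside a RAJA forall.
  RAJA::forall<RAJA::cuda_exec<256>>(RAJA::RangeSegment(0, 1),
    [=] RAJA_DEVICE (int) {
      printf("device: %f\n", my_func(1.0, 2.0));
    });
#endif
  return 0;
}
```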
Hi @artv3, thanks for your reply. This is what I am after. I just started with RAJA and may have overlooked a few things in the documentation. If I understand...
Ok, perfect, let me try this and see if I can get it to compile and run. Thanks for taking the time to reply and to provide that working example....
Hi, I was able to compile and call the function from both the CPU and GPU using the following syntax:

```
__device__ __host__ void foo() { printf("Hello world"); }
```

call foo();...
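For completeness, here is a standalone CUDA sketch of that pattern (the file name, kernel name, and launch configuration are made up for illustration):

```cpp
// foo_host_device.cu -- build with nvcc
#include <cstdio>

// The combined attributes make nvcc emit both a host and a device version
// of the same function.
__device__ __host__ void foo()
{
  printf("Hello world\n");
}

// Trivial kernel that invokes the device version.
__global__ void call_foo_kernel()
{
  foo();
}

int main()
{
  foo();                        // host-side call
  call_foo_kernel<<<1, 1>>>();  // device-side call
  cudaDeviceSynchronize();      // make sure the device printf is flushed
  return 0;
}
```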
Hi @rhornung67 and @artv3, thanks for the reply. The piece of code I provided does not include the RAJA Views. Below is the full code:

```
RAJA::View lgclbool_(lgclbool, 3);
RAJA::View...
```
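For anyone reading along, `RAJA::View` normally takes the element type and a `RAJA::Layout` as template arguments. A minimal sketch of how such views are usually declared (the types, names, and extents here are assumptions, not the actual code from this issue):

```cpp
#include "RAJA/RAJA.hpp"

void build_views(double* lgclbool, double* phi, int nx, int ny, int ndir)
{
  // 1-D view over a raw pointer: View<element type, Layout<rank>>.
  RAJA::View<double, RAJA::Layout<1>> lgclbool_(lgclbool, 3);

  // 3-D view: the Layout rank must match the number of extents passed in.
  RAJA::View<double, RAJA::Layout<3>> phi_(phi, nx, ny, ndir);

  // Indexing goes through operator() instead of hand-computed offsets.
  phi_(0, 0, 0) = lgclbool_(0);
}
```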
@artv3 I have not looked at the block mapping yet. I spent quite a bit of time learning nested blocks and making sure the code compiles and gives the correct...
@artv3, thanks for providing the code. I was looking at it and noticed that some of the options, like `cuda_launch_t` or `cuda_global_thread_x`, are not in the documentation. Are these...
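For context, this is roughly how I understand those policies fit together in the RAJA launch API (a sketch only; the exact namespaces, policy names, and signatures have moved around between RAJA versions, and the loop body and sizes here are placeholders):

```cpp
#include "RAJA/RAJA.hpp"

// Sequential launch on the host, CUDA launch on the device.
using launch_policy = RAJA::LaunchPolicy<RAJA::seq_launch_t,
                                         RAJA::cuda_launch_t<false>>;

// Map loop iterations to global CUDA threads in the x dimension.
using thread_x = RAJA::LoopPolicy<RAJA::seq_exec,
                                  RAJA::cuda_global_thread_x>;

void saxpy(double* y, const double* x, double a, int N)
{
  const int threads = 256;
  const int blocks  = (N + threads - 1) / threads;

  RAJA::launch<launch_policy>(
    RAJA::ExecPlace::DEVICE,
    RAJA::LaunchParams(RAJA::Teams(blocks), RAJA::Threads(threads)),
    [=] RAJA_HOST_DEVICE (RAJA::LaunchContext ctx) {
      RAJA::loop<thread_x>(ctx, RAJA::RangeSegment(0, N), [&] (int i) {
        y[i] += a * x[i];
      });
    });
}
```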
@artv3 in the post where you converted the code from C++ to RAJA, you left the loop over the direction unchanged.

```
for (int a = 0; a < ndir;...
```
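To make the question concrete, here is a rough sketch of that pattern (the variable names, view shape, and policy are assumptions based on the snippets above, not the actual code): the spatial loop is mapped to GPU threads, while the inner direction loop stays a plain C++ `for` loop, so each thread iterates over all `ndir` directions for its own index.

```cpp
#include "RAJA/RAJA.hpp"

void scale_directions(double* phi, const double* w, int ncells, int ndir)
{
  // 2-D view: cell index x direction index (pointers are assumed to be
  // device-accessible, e.g. allocated with cudaMallocManaged).
  RAJA::View<double, RAJA::Layout<2>> phi_(phi, ncells, ndir);

  RAJA::forall<RAJA::cuda_exec<256>>(RAJA::RangeSegment(0, ncells),
    [=] RAJA_DEVICE (int i) {
      // The direction loop is left as ordinary C++: each GPU thread owns
      // one cell i and walks sequentially over all ndir directions.
      for (int a = 0; a < ndir; ++a) {
        phi_(i, a) *= w[a];
      }
    });
}
```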
Hi @artv3, if I understand you correctly, I can keep the C++ for loop syntax as it is. Does that mean each GPU thread will perform a for loop over...