Xiao-Yong Jin

Results 40 comments of Xiao-Yong Jin

I should have made it clear. It is the `gcc -O2 -o gxi gxi.c` in `build.sh`. FreeBSD does not have `gcc` in the path and I've been using `gcc9` from...

It looks like the arguments are NOT even passed completely!
```
$ nimble -v
nimble v0.11.0 compiled at 2020-01-24 04:41:14
git hash: 4007b2a778429a978e12307bf13a038029b4c4d9
```
using the same setup as OP...

We probably want a stable API. If the internals change, we don't want to change users' nimble files. Perhaps supply a list of procedures similar to `paramCount` and `paramStr`, but...

This ensures that `BlasArg` for `caxpyxmazMR_` is private to each thread, even if `device::max_kernel_arg_size()` is 0. It gives backend implementations the freedom to choose different ways to pass kernel arguments without...

I've changed `int` to an `enum`, and since it is a custom type, we need a few extra member functions in `kernel_param` so that we can keep `target_device.h` independent of...

This is an enum class now. The names are properly scoped. You actually can't use `FALSE`/`TRUE`/`ALWAYS` directly as is, without the enum class name and double colon, unless we move...
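To illustrate the scoping point, here is a minimal sketch. The type name `KernelArgPolicy` and the helper `describe` are hypothetical and are not taken from the code under review; only the enumerator names `FALSE`/`TRUE`/`ALWAYS` come from the comment above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the enum under discussion; only the enumerator
// names FALSE/TRUE/ALWAYS are taken from the comment.
enum class KernelArgPolicy { FALSE, TRUE, ALWAYS };

// Hypothetical helper: every enumerator has to be qualified with the
// enum class name and a double colon.
inline std::string describe(KernelArgPolicy p) {
  switch (p) {
    case KernelArgPolicy::FALSE:  return "false";
    case KernelArgPolicy::TRUE:   return "true";
    case KernelArgPolicy::ALWAYS: return "always";
  }
  return "unknown";
}

// A scoped enum does not convert implicitly to int.
static_assert(!std::is_convertible<KernelArgPolicy, int>::value,
              "scoped enums have no implicit conversion to int");

// KernelArgPolicy p = ALWAYS;  // would not compile: ALWAYS is not visible unqualified
```

With a plain `enum`, the enumerators would leak into the enclosing scope and convert silently to `int`; the scoped form rules out both.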

Let me look at the generated code in more detail.

This is not ready for merge yet. Just listing it here for interested people.

It currently only works with Intel's. More information here: https://www.intel.com/content/www/us/en/develop/documentation/oneapi-gpu-optimization-guide/top/openmp-offloading-intro/openmp-compile-and-run.html

There are three reasons. 1. QUDA's `mapped_malloc` currently uses `omp_target_alloc_shared`, which is an Intel extension. 2. Different OpenMP implementations may interpret the specification differently, and I spent most...