cuda-kat
CUDA kernel author's tools
This is a general problem of `nvcc`, I would say:

```cpp
#include
#include

constexpr int duzzle = -7;

__global__ void kernel()
{
    kat::array arr;
    arr.fill(duzzle); // fails to compile
}
```
...
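The underlying issue seems to be nvcc's restriction on odr-using host-side `constexpr` variables in device code: `fill()` takes its argument by `const&` (just like `std::array::fill`), and binding that reference odr-uses `duzzle`. Below is a self-contained sketch of the same pattern without cuda-kat, plus one possible workaround; all names here are made up for illustration:

```cpp
constexpr int duzzle = -7;

struct toy_array {
    int data[3];
    // Takes its argument by const reference, like std::array::fill / kat::array::fill
    __device__ void fill(const int& value)
    {
        for (auto& e : data) { e = value; }
    }
};

__global__ void kernel()
{
    toy_array arr;
    // arr.fill(duzzle);     // error: binding a reference odr-uses the host constexpr variable
    arr.fill(int{duzzle});   // passing a prvalue copy only uses the value, which nvcc accepts
}
```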
While it's rarely a great idea, for the sake of completeness, we may want to add implementations of the abstract `<algorithm>` and `<numeric>` algorithms which could be run by...
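For a sense of what such an implementation might look like, here is a minimal sketch of a block-collaborative `for_each`; the function name and the strided work-splitting scheme are illustrative only, not existing cuda-kat API:

```cpp
#include <cstddef>

// Illustrative only: a for_each over [first, last) executed collaboratively by
// all threads of the calling block, each handling a strided subset of the range.
template <typename T, typename UnaryFunction>
__device__ void block_for_each(T* first, T* last, UnaryFunction f)
{
    auto length = static_cast<std::size_t>(last - first);
    for (std::size_t i = threadIdx.x; i < length; i += blockDim.x) {
        f(first[i]);
    }
}
```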
"Built-ins" in cuda-kat means those functions which translate into single PTX instructions (not necessarily single SASS instructions though!) We have `on_device/builtins.cuh`, and `on_device/non-builtins.cuh` which contains functions which are builtin-like, or...
We've already added support for some of the `<iterator>` methods for accessing ranges, like `std::begin()` and `std::end()`. But we haven't added any of their "reverse" variants, e.g. `std::rbegin()` and...
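The reverse variants can usually be layered on top of the existing accessors, roughly as follows. This is only a sketch, and it assumes `std::reverse_iterator` is usable from device code (e.g. with C++17 and `--expt-relaxed-constexpr`); otherwise a kat-side reverse iterator would be needed:

```cpp
#include <iterator>

namespace kat {

// Reverse-range accessors in the style of std::rbegin()/std::rend(),
// sketched for any container exposing begin()/end().
template <typename C>
__host__ __device__ auto rbegin(C& c)
{
    return std::reverse_iterator<decltype(c.end())>(c.end());
}

template <typename C>
__host__ __device__ auto rend(C& c)
{
    return std::reverse_iterator<decltype(c.begin())>(c.begin());
}

} // namespace kat
```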
Now that we have (half-)decent unit test coverage (see #24), we should introduce code coverage checks to see how much remains uncovered. This requires:

* Getting a coverage-related CMake module...
Beginning with CUDA 10 (or maybe 9?) we have three kinds of atomics:

* `atomicFoo()` - atomic w.r.t. other memory access from within the same GPU.
* `atomicFoo_system()` - atomic...
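Concretely, for addition the three scopes look like this; the kernel and parameter names are made up for the example, and the block- and system-scoped forms require compute capability 6.0 or higher:

```cpp
__global__ void count_events(int* device_counter, int* host_visible_counter)
{
    // Atomic with respect to all threads on the same GPU (the "classic" form)
    atomicAdd(device_counter, 1);

    // Atomic only with respect to threads in the same thread block
    atomicAdd_block(device_counter, 1);

    // Atomic with respect to the whole system - other GPUs and the CPU as well
    // (e.g. for counters in managed or pinned host memory)
    atomicAdd_system(host_visible_counter, 1);
}
```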
An index [is](https://www.merriam-webster.com/dictionary/index) either a "list of items" arranged in order, or "a number... used as an indicator or measure", or "a number ... associated with another to indicate... position...
Shuffles are warp collaboration primitives. They should be in namespace `kat::collaboration::warp` - and declared in the warp collaboration primitives header - even if only through an inclusion of another file.
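In that spirit, the wrappers might end up looking roughly like this; a sketch only, where the wrapper names under `kat::collaboration::warp` are assumptions, while the underlying `__shfl_*_sync()` intrinsics are the standard CUDA ones:

```cpp
namespace kat {
namespace collaboration {
namespace warp {

// Obtain the value of `value` held by the lane with index `source_lane`
template <typename T>
__device__ T shuffle(T value, int source_lane)
{
    return __shfl_sync(0xffffffffu /* full warp mask */, value, source_lane);
}

// Obtain the value held by the lane `delta` lanes below the calling lane
template <typename T>
__device__ T shuffle_up(T value, unsigned delta)
{
    return __shfl_up_sync(0xffffffffu /* full warp mask */, value, delta);
}

} // namespace warp
} // namespace collaboration
} // namespace kat
```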
We've adapted a tuple implementation; however, that tuple doesn't know that there's "another tuple" it needs to be compatible with... we _do_ know. So, let's try and make `kat::tuple` usable...
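One common way to get that kind of compatibility is to teach the standard library's tuple machinery about `kat::tuple`, e.g. by specializing `std::tuple_size` and `std::tuple_element`. The following is a sketch under the assumption that `kat::tuple` exposes its element types in the usual way; the header path is assumed and none of this is existing cuda-kat code:

```cpp
#include <cstddef>
#include <tuple>
#include <kat/tuple.hpp>   // assumed header path for kat::tuple

namespace std {

// Let standard tuple-aware utilities treat kat::tuple as a tuple-like type.
template <typename... Ts>
struct tuple_size<kat::tuple<Ts...>>
    : integral_constant<size_t, sizeof...(Ts)> { };

template <size_t I, typename... Ts>
struct tuple_element<I, kat::tuple<Ts...>> {
    using type = tuple_element_t<I, tuple<Ts...>>;
};

} // namespace std
```

For structured bindings and generic code calling `get`, an ADL-findable `get<I>()` for `kat::tuple` (or a member `get()`) would also be needed, which the adapted implementation presumably already provides in some form.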
The programming guide [says](https://docs.nvidia.com/cuda/archive/8.0/cuda-c-programming-guide/index.html):

> **E.3.14.3. Rvalue references**
>
> By default, the CUDA compiler will implicitly consider `std::move` and `std::forward` function templates to have `__host__ __device__` execution space qualifiers,...
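In practice this means something like the following should compile in device code without any kat-side wrapper; a minimal illustration, where `swap_on_device` is just a made-up example function:

```cpp
#include <utility>

template <typename T>
__device__ void swap_on_device(T& a, T& b)
{
    // std::move is implicitly usable here, per the quoted passage -
    // no replacement is needed just to get move semantics in device code.
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}
```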