cuda-kat

CUDA kernel author's tools

Results 65 cuda-kat issues

We have many functions returning lane ids or numbers-of-lanes. Mostly those use `unsigned`. But for better readability/clarity, I'm thinking of introducing something like: `using lane_id_t = unsigned;` within...

question
Task

In a 2D or 3D block, the CUDA "thread index" - according to official documentation - is a 2D or 3D entity, while the "thread ID" is its linearization (where...

Task
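
The distinction drawn in the issue above can be made concrete with a small host-side sketch. It follows the linearization given in the CUDA programming guide (x varies fastest, then y, then z). The `triple` struct and the function name `linear_thread_id` are hypothetical stand-ins chosen for illustration so that the code compiles as plain C++; they are not cuda-kat's actual types or API.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's uint3 / dim3, so this compiles without nvcc.
struct triple { unsigned x, y, z; };

// Linearize a (up to) 3D thread index within a block of the given dimensions.
// x varies fastest, then y, then z. For a 2D block, use z == 0 and dims.z == 1.
constexpr unsigned linear_thread_id(triple thread_index, triple block_dims)
{
    return thread_index.x
         + thread_index.y * block_dims.x
         + thread_index.z * block_dims.x * block_dims.y;
}

// Usage example: thread (1, 2, 3) in a 4 x 5 x 6 block.
inline void linear_thread_id_example()
{
    // 1 + 2*4 + 3*(4*5) = 69
    assert(linear_thread_id(triple{1, 2, 3}, triple{4, 5, 6}) == 69u);
}
```

In device code the same arithmetic would be applied to `threadIdx` and `blockDim`.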

The test fixtures have improved from one test suite to another, becoming a bit more flexible and requiring less boilerplate. We should use the later ones - currently on...

Task

We have grid-scope action in two forms - at grid stride and at block stride. The block stride action means each block acts on consecutive data. At block-scope - we...

Task
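
To illustrate the two access patterns named in the issue above, here is a host-side sketch that lists which positions a single thread would visit under each. It is only one common reading of "grid stride" and "block stride"; the function names and the `elements_per_block` parameter are assumptions made for illustration, not cuda-kat's interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Grid stride: a thread starts at its global index and advances by the
// total number of threads in the grid.
inline std::vector<std::size_t> grid_stride_positions(
    std::size_t length, unsigned block_index, unsigned thread_index,
    unsigned block_size, unsigned num_blocks)
{
    std::vector<std::size_t> positions;
    const std::size_t stride = std::size_t(block_size) * num_blocks;
    for (std::size_t pos = std::size_t(block_index) * block_size + thread_index;
         pos < length; pos += stride) {
        positions.push_back(pos);
    }
    return positions;
}

// Block stride: each block owns a consecutive chunk of the data (assumed
// here to be elements_per_block long); within the chunk, a thread advances
// by the block size.
inline std::vector<std::size_t> block_stride_positions(
    std::size_t length, unsigned block_index, unsigned thread_index,
    unsigned block_size, std::size_t elements_per_block)
{
    std::vector<std::size_t> positions;
    const std::size_t begin = block_index * elements_per_block;
    const std::size_t end = std::min(begin + elements_per_block, length);
    for (std::size_t pos = begin + thread_index; pos < end; pos += block_size) {
        positions.push_back(pos);
    }
    return positions;
}

// Usage example: 16 elements, 2 blocks of 2 threads; block 1, thread 0.
inline void stride_example()
{
    assert((grid_stride_positions(16, 1, 0, 2, 2)
            == std::vector<std::size_t>{2, 6, 10, 14}));
    assert((block_stride_positions(16, 1, 0, 2, 8)
            == std::vector<std::size_t>{8, 10, 12, 14}));
}
```

In both patterns, adjacent threads of a block touch adjacent elements on each iteration; the difference is whether a block's elements are spread across the whole input or confined to one consecutive chunk.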

We have many templated functions which make a (potentially) large number of reads or writes to memory, and therefore benefit from coalescing their memory operations. However, most, if not all...

Task

CUDA 9.0 was released in September of 2017 - 2.5 years ago. It changed the interfaces of some functions and related PTX instructions. Mostly, `.sync` versions of these were now...

question

The following PTX instructions don't have wrapper functions (nor `builtins::` templated functions where relevant). Add them!

* [ ] `lop3` - Logical operation on 3 operands using an immediate 3-parameter...

Task
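
For readers unfamiliar with `lop3` from the issue above, the following host-side sketch emulates what the PTX documentation describes: each result bit is looked up in an 8-bit immediate table, indexed by the corresponding bits of the three operands. The function name `lop3_emulated` and the constant names are hypothetical; this is not a cuda-kat wrapper, and a real wrapper would emit the PTX instruction rather than loop over bits.

```cpp
#include <cassert>
#include <cstdint>

// Per the PTX documentation, the immediate table for a desired boolean
// function F is obtained by evaluating F on these three constants.
constexpr std::uint8_t lut_a = 0xF0;
constexpr std::uint8_t lut_b = 0xCC;
constexpr std::uint8_t lut_c = 0xAA;

// Bitwise emulation of lop3.b32: for every bit position, the bits of a, b
// and c form a 3-bit index (a is the most significant) into imm_lut.
inline std::uint32_t lop3_emulated(
    std::uint32_t a, std::uint32_t b, std::uint32_t c, std::uint8_t imm_lut)
{
    std::uint32_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        const unsigned index = (((a >> bit) & 1u) << 2)
                             | (((b >> bit) & 1u) << 1)
                             |  ((c >> bit) & 1u);
        const std::uint32_t out_bit = (std::uint32_t(imm_lut) >> index) & 1u;
        result |= out_bit << bit;
    }
    return result;
}

// Usage example: a three-way XOR expressed as a single lookup table.
inline void lop3_example()
{
    const std::uint8_t xor3 = lut_a ^ lut_b ^ lut_c;  // table for a ^ b ^ c
    const std::uint32_t a = 0x12345678u, b = 0x0F0F0F0Fu, c = 0xDEADBEEFu;
    assert(lop3_emulated(a, b, c, xor3) == (a ^ b ^ c));
}
```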

At the moment, our effective definition of a "builtin" function is one that produces a single PTX instruction (when inlined); and this definition is not even entirely consistent in our...

Task

While NVIDIA's own C headers for builtin wrappers use the fundamental types `int`, `unsigned`, `unsigned long long` etc. - the builtins are actually based on exact parameter sizes, not the...

question

`unaligned.cuh` is missing the `align_down()` function. Either add it or drop `unaligned.cuh`.

fixed on development