libcudacxx issues

Results 117 libcudacxx issues


Soundness bugfix for barrier<thread_scope_block> on sm_70

For sm_70, barrier arrive has an optimization to "coalesce" all arrives with the same update to the same barrier into a single update performed by a "leader" thread. This optimization...

gonzalobg
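The coalescing idea in the summary above can be modeled on the host as follows. This is only a sequential sketch of the grouping logic, not the library's sm_70 implementation; `Arrive`, `coalesced_arrive`, and the `pending` map are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical record of one thread's arrive: which barrier and by how much.
struct Arrive {
    int barrier_id;
    std::ptrdiff_t update;
};

// Groups arrives that target the same barrier with the same update, then
// applies one combined update per group, as a "leader" would.
// Returns the number of updates actually performed.
inline std::size_t coalesced_arrive(const std::vector<Arrive>& arrives,
                                    std::map<int, std::ptrdiff_t>& pending) {
    // (barrier, update) -> number of threads arriving with that pair
    std::map<std::pair<int, std::ptrdiff_t>, std::ptrdiff_t> groups;
    for (const Arrive& a : arrives) {
        ++groups[std::make_pair(a.barrier_id, a.update)];
    }
    for (const auto& group : groups) {
        const int barrier_id = group.first.first;
        const std::ptrdiff_t update = group.first.second;
        const std::ptrdiff_t count = group.second;
        pending[barrier_id] -= update * count;  // single update for the whole group
    }
    return groups.size();
}
```

In this model, 32 threads that each arrive with an update of 1 on the same barrier result in one decrement of 32 rather than 32 separate decrements.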

Adding `mdspan` reference implementation

* Pulls the mdspan reference implementation from the "stable" branch of the kokkos repo, https://github.com/kokkos/mdspan, up to PR 172.
* Uglified internal identifiers and made some naming convention updates.

youyu3

Need docs for `atomic_ref`

We have docs for [`cuda::atomic`](https://nvidia.github.io/libcudacxx/extended_api/synchronization_primitives/atomic.html), but none for `cuda::atomic_ref`. We should fix that. I think this can largely be a copy/paste of the `cuda::atomic` docs as a starting point.

jrhemstad

good first issue

only: docs

Build warnings in `<cuda/std/barrier>`

Build warnings appear when `CMAKE_CUDA_ARCHITECTURES=75`:

```bash
[ 98%] Linking CXX executable DYNAMIC_MAP_TEST
[ 98%] Built target DYNAMIC_MAP_TEST
/home/yunsongw/miniconda3/include/rapids/libcudacxx/cuda/std/barrier: In function ‘void cuda::__4::init(cuda::__4::barrier*, ptrdiff_t, cuda::std::__4::__empty_completion)’:
/home/yunsongw/miniconda3/include/rapids/libcudacxx/cuda/std/barrier:158:155: warning: unused parameter ‘__completion’ [-Wunused-parameter]...
```

sleeepyjack
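A common way to silence this kind of `-Wunused-parameter` warning is to leave the unused parameter unnamed. The sketch below illustrates the pattern with hypothetical stand-in types (`toy_barrier`, `empty_completion`); it is not the actual code in `<cuda/std/barrier>` and not necessarily the fix the maintainers chose.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the library types named in the warning.
struct empty_completion {};
struct toy_barrier {
    std::ptrdiff_t expected = 0;
};

// A named but unused parameter triggers -Wunused-parameter:
//   void init(toy_barrier* b, std::ptrdiff_t expected, empty_completion completion);
//
// Leaving the parameter unnamed keeps the signature and removes the warning.
inline void init(toy_barrier* b, std::ptrdiff_t expected, empty_completion) {
    b->expected = expected;
}
```

Callers are unaffected: `init(&b, 8, empty_completion{})` still compiles and behaves the same.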

Consistent behavior of __CUDA_MINIMUM_ARCH__ macro

Currently `__CUDA_MINIMUM_ARCH__` expands to the same value as `__CUDA_ARCH__` on nvcc: https://github.com/NVIDIA/libcudacxx/blob/d553734e66a999727e7b9e6bb19ce7b38024a19f/include/nv/detail/__target_macros#L103 On nvc++ however, the same macro expands to the minimum target architecture provided by the compilation flags: https://github.com/NVIDIA/libcudacxx/blob/d553734e66a999727e7b9e6bb19ce7b38024a19f/include/nv/detail/__target_macros#L76...

sleeepyjack

WIP: Docker refactor, parameterizes OS and compiler versions.

Ideally we instead generate one layer that handles all dialects. CMake demands a CUDA Toolkit be present; currently, internal builds of the runtime do not satisfy that requirement so we...

wmaxey

Bugfix/host streaming tests

Delete host stream insertion tests under chrono. Mark a couple tests that include unsupported headers as unsupported on nvrtc.

wmaxey

Added support for most of <mutex>

Adds mutex, timed_mutex, once_flag, call_once, unique_lock, scoped_lock, and the various free functions that go with them. Excludes only condition variable support and the recursive versions of mutex.

ogiroux

enhancement

P1: should have
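For reference, the facilities listed in this entry look like this in the host-side C++17 standard library. This is a sketch using `std::` only; whether the `cuda::std::` versions in the PR match these signatures exactly is an assumption, and `run_workers`, `Account`, and `transfer` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Runs `num_threads` workers that each increment a shared counter under a mutex.
// Returns {final counter value, number of times the one-time initializer ran}.
inline std::pair<int, int> run_workers(int num_threads, int increments_per_thread) {
    std::mutex m;          // guards `total`
    std::once_flag once;   // ensures the initializer runs exactly once
    int total = 0;
    int init_calls = 0;

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            std::call_once(once, [&] { ++init_calls; });
            for (int i = 0; i < increments_per_thread; ++i) {
                std::unique_lock<std::mutex> lock(m);  // released at end of scope
                ++total;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return {total, init_calls};
}

struct Account {
    int balance = 0;
    std::mutex m;
};

// scoped_lock acquires both mutexes using a deadlock-avoidance algorithm.
inline void transfer(Account& from, Account& to, int amount) {
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}
```

`unique_lock` is used where a single mutex is held for a scope, and `scoped_lock` where several must be taken together.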

Implement `std::mutex`

brycelelbach

enhancement

P1: should have

Make `cuda::aligned_size_t` available in a more appropriate header than `<cuda/barrier>`

The [`cuda::aligned_size_t`](https://github.com/NVIDIA/libcudacxx/blob/4f42427dfe5fd88672c29c279637e0ccf5b47478/include/cuda/std/barrier#L28-L36) type is currently defined in `<cuda/barrier>`. This requires me to include `<cuda/barrier>` any time I wish to use `cuda::aligned_size_t`. This is especially problematic as merely including `cuda/barrier` prevents...

jrhemstad
