stdarch

add nvptx_target_feature and use unaligned barrier

Open jedbrown opened this issue 7 months ago • 4 comments

  • Deprecate _syncthreads (the CUDA name) in favor of new _barrier_sync (NVPTX name barrier.sync).
  • The barrier.sync instruction is equivalent to barrier.sync.aligned prior to sm_70, and will lead to errors or deadlock if passes (such as MIR JumpThreading) lose the aligned property.
  • MIR does not currently have a way to apply something like LLVM's convergent attribute, and convergent would not preserve alignment anyway, since inlining can break it. We therefore cannot prevent loss of alignment, so we require target feature sm_70. In short, we cannot prevent miscompilation of aligned barriers without hard-to-specify preconditions.
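
To make the alignment hazard concrete, here is a minimal host-side model in Rust. It is only a sketch: it does not run on a GPU or call any real intrinsic, and `Site`, `barrier_site_original`, `barrier_site_threaded`, and `warp_is_aligned` are hypothetical names invented for illustration. It assumes a jump-threading-like transform that duplicates a barrier placed after a branch into each arm of that branch, and models the aligned requirement as "every thread in the warp reaches the same barrier instruction".

```rust
// Host-side model only; nothing here runs on a GPU or uses real intrinsics.
// `Site` is a hypothetical stand-in for the location of a barrier instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Site {
    Shared,     // single barrier after the branch rejoins
    ThenBranch, // copy duplicated into the `then` arm
    ElseBranch, // copy duplicated into the `else` arm
}

/// Original shape: both arms rejoin, then reach one barrier instruction.
fn barrier_site_original(flag: bool) -> Site {
    let _work = if flag { 1 } else { 2 };
    Site::Shared
}

/// Shape after a jump-threading-like transform duplicates the barrier
/// into each arm of the branch.
fn barrier_site_threaded(flag: bool) -> Site {
    if flag {
        Site::ThenBranch
    } else {
        Site::ElseBranch
    }
}

/// Modeled aligned requirement: every thread in the warp executes the
/// same barrier instruction.
fn warp_is_aligned(sites: &[Site]) -> bool {
    sites.windows(2).all(|w| w[0] == w[1])
}

fn main() {
    // A divergent warp: even lanes take one arm, odd lanes take the other.
    let flags: Vec<bool> = (0..32).map(|lane| lane % 2 == 0).collect();
    let original: Vec<Site> = flags.iter().map(|&f| barrier_site_original(f)).collect();
    let threaded: Vec<Site> = flags.iter().map(|&f| barrier_site_threaded(f)).collect();
    println!("original aligned: {}", warp_is_aligned(&original));
    println!("threaded aligned: {}", warp_is_aligned(&threaded));
}
```

Both shapes behave identically for any single thread, which is why such a transform looks valid at the MIR level. Only the warp-wide view shows that the duplicated barrier is no longer aligned.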

This requires https://github.com/rust-lang/rust/pull/138689 (for nvptx_target_feature) and fixes https://github.com/rust-lang/rust/issues/137086.

jedbrown, May 22 '25 22:05