stdarch
Rust's standard library vendor-specific APIs and run-time feature detection
Add intrinsics for the amdgpu architecture. I’m not sure how to add/run CI (`ci/run.sh` fails for me e.g. for nvptx because `core` cannot be found), but I checked that it...
Tracking issue: https://github.com/rust-lang/rust/issues/136306 (blocked on https://github.com/rust-lang/rust/issues/149654)

1. The feature gate of stabilized items is `stdarch_neon_fp16`.
2. `stdarch_neon_f16` types are still unstable on arm. They're gated by `stdarch_arm_neon_intrinsics` now.
3. `stdarch_neon_f16`...
Mark the neon intrinsics as `inline(always)` now that we can apply the attribute at the call site and perform the checks to ensure that inlining would be correct. See tracking...
Based on the previous discussion in https://rust-lang.zulipchat.com/#narrow/channel/422870-t-compiler.2Fgpgpu-backend/topic/Return.20type.20of.20NVPTX.20index.20and.20dimension.20intrinsics r? @workingjubilee
Some aarch64 vendor intrinsics seem to be implemented via `simd_reduce_max`/`simd_reduce_min` on float types (Cc @folkertdev). Is that truly the right thing to do? We currently codegen these to `llvm.vector.reduce.fmax.*`, which...
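One reason the choice of reduction can matter for float types is NaN handling. The sketch below is plain scalar Rust, not the stdarch or LLVM implementation, and the two helper names are hypothetical. It contrasts a max reduction that skips NaN inputs (the documented behaviour of `f32::max`) with one that returns NaN as soon as any input is NaN. Because the snippet above is truncated, it is an assumption that this is the distinction being asked about.

```rust
/// Hypothetical helper: max reduction that ignores NaN inputs.
/// `f32::max` returns the other operand when one operand is NaN.
fn reduce_max_ignore_nan(xs: &[f32]) -> f32 {
    xs.iter().copied().fold(f32::NEG_INFINITY, f32::max)
}

/// Hypothetical helper: max reduction that propagates NaN.
/// The result is NaN if any element is NaN.
fn reduce_max_propagate_nan(xs: &[f32]) -> f32 {
    xs.iter().copied().fold(f32::NEG_INFINITY, |acc, x| {
        if acc.is_nan() || x.is_nan() {
            f32::NAN
        } else {
            acc.max(x)
        }
    })
}
```

For the input `[1.0, NaN, 3.0]` the first helper yields `3.0` and the second yields NaN, so an intrinsic documented with one behaviour cannot be lowered to a reduction with the other without changing results.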
A total of 897 functions, except for these 44 that explicitly use `f16` in the signature:

- `_mm{,256,512}_{set,set1,setr}_ph`
- `_mm_set_sh`
- `_mm{,256,512}_{load,loadu,store,storeu}_ph`
- `_mm_{load,mask_load,maskz_load,store,mask_store}_sh`
- `_mm{,256,512}_reduce_{add,mul,min,max}_ph`
- `_mm{,256,512}_cvtsh_h`
- `_mm{,256}_bcstnesh_ps`...
## Summary

1. Changed from `IntrinsicType::target` (String) to `IntrinsicType::metadata` (HashMap) for better support for differing architectures
2. Added `Constraint::Set(Vec)` to support distinct constant argument values (which may be of...
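A minimal sketch of what the two changes could look like. Everything here is a hypothetical stand-in, not the actual definitions from the PR: the `HashMap<String, String>` key and value types, the `i64` element type for `Set`, the `Range` variant, the `allows` and `meta` helpers, and the example keys are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical constraint on a constant intrinsic argument.
#[derive(Debug, Clone, PartialEq)]
enum Constraint {
    /// Assumed variant: a half-open range of permitted values.
    Range(std::ops::Range<i64>),
    /// A set of distinct permitted values (element type assumed to be i64).
    Set(Vec<i64>),
}

impl Constraint {
    /// Returns true if `value` satisfies the constraint.
    fn allows(&self, value: i64) -> bool {
        match self {
            Constraint::Range(r) => r.contains(&value),
            Constraint::Set(vs) => vs.contains(&value),
        }
    }
}

/// Hypothetical type description: a free-form metadata map replaces
/// a single `target: String` field.
struct IntrinsicType {
    metadata: HashMap<String, String>,
}

impl IntrinsicType {
    /// Looks up one metadata entry by key.
    fn meta(&self, key: &str) -> Option<&str> {
        self.metadata.get(key).map(String::as_str)
    }
}

/// Builds an example value; the keys and values are invented for illustration.
fn example_type() -> IntrinsicType {
    let mut metadata = HashMap::new();
    metadata.insert("target".to_string(), "aarch64".to_string());
    metadata.insert("extension".to_string(), "neon".to_string());
    IntrinsicType { metadata }
}
```

Under these assumptions, the map can still hold the old target string under a `"target"` key while leaving room for architecture-specific entries, and `Constraint::Set(vec![1, 2, 4, 8]).allows(3)` is `false` even though 3 lies between the smallest and largest permitted values.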
Currently, we use 2 procedural macros, `assert_instr` and `simd_test`, for testing. It is convenient, but the problem is that it significantly slows down compilation. Seeing that these macros are pretty...