Missing default LLVM `cpu-features` in some targets

Open kassane opened this issue 7 months ago • 10 comments

Version: v1.41.0-git-20d22b1

During building tests, it's common to find warnings without errors:

'-half-precision' is not a recognized feature for this target (ignoring feature)

or (need LLVM 20 - new cpu: lime1)

'lime1' is not a recognized processor for this target (ignoring processor)

# LLVM 19 - missing features
ldc2 --mtriple=wasm32-unknown-emscripten -mcpu=bleeding-edge -vv                                                                            
Targeting 'wasm32-unknown-emscripten' (CPU 'bleeding-edge' with features '')
# LLVM 19 - missing features
ldc2 --mtriple=wasm32-unknown-emscripten -mcpu=generic -vv                                                                            
Targeting 'wasm32-unknown-emscripten' (CPU 'generic' with features '')
# OK
ldc2 --mtriple=wasm32-unknown-emscripten -mcpu=mvp -vv                                                                            
Targeting 'wasm32-unknown-emscripten' (CPU 'mvp' with features '')

I know that features can be added manually with -mattr=+feature_name1,-feature_name2. However, not all wasm32 CPU models are featureless by default, so an empty feature list is misleading for them.
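As a rough illustration of the manual route, the sketch below joins feature toggles into a single -mattr argument. The helper name build_mattr is hypothetical, and treating an unsigned name as "enable" is an assumption of this sketch, not LDC behaviour.

```shell
# Join feature toggles such as "+multivalue" or "-simd128" into one -mattr flag.
# build_mattr is a hypothetical helper; it only builds the string and does not
# check the names against what LLVM actually knows.
build_mattr() (
    joined=""
    for feature in "$@"; do
        case "$feature" in
            +*|-*) ;;                   # already carries an explicit sign
            *) feature="+$feature" ;;   # assumption: a bare name means "enable"
        esac
        joined="${joined:+$joined,}$feature"
    done
    printf '%s\n' "-mattr=$joined"
)

# The four features that rustc and zig report for the generic wasm32 CPU.
build_mattr multivalue mutable-globals reference-types sign-ext
# prints: -mattr=+multivalue,+mutable-globals,+reference-types,+sign-ext
```

The result could then be appended to an ldc2 command line next to -mtriple and -mcpu.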


Extra info to compare

Rust - wasm32 target-info
rustc --print cfg --target wasm32-unknown-emscripten -C target-cpu=generic
# [...] skip
target_arch="wasm32"
target_endian="little"
# [...] skip
target_feature="multivalue"
target_feature="mutable-globals"
target_feature="reference-types"
target_feature="sign-ext"
# [...] skip
rustc -Vv   
rustc 1.86.0 (05f9846f8 2025-03-31)
binary: rustc
commit-hash: 05f9846f893b09a1be1fc8560e33fc3c815cfecb
commit-date: 2025-03-31
host: x86_64-unknown-linux-gnu
release: 1.86.0
LLVM version: 19.1.7

Zig - wasm32 target-info
zig cc --version
clang version 19.1.7 (https://github.com/ziglang/zig-bootstrap 1c3c59435891bc9caf8cd1d3783773369d191c5f)
Target: x86_64-unknown-linux-musl
Thread model: posix
InstalledDir: /home/kassane/zig/0.14.0/files
zig build-exe --show-builtin -target wasm32-emscripten-none -mcpu=generic
pub const cpu: std.Target.Cpu = .{
    .arch = .wasm32,
    .model = &std.Target.wasm.cpu.generic,
    .features = std.Target.wasm.featureSet(&.{
        .multivalue,
        .mutable_globals,
        .reference_types,
        .sign_ext,
    }),
};
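For comparison purposes, the rustc output above can be turned into the -mattr form that ldc2 accepts. This is only a sketch: rustc_cfg_to_mattr is a hypothetical helper, and it assumes rustc's feature names match LLVM's, which holds for the four wasm32 names shown here but not necessarily for other targets.

```shell
# Read `rustc --print cfg` output on stdin and print the enabled target
# features as one -mattr flag. rustc_cfg_to_mattr is a hypothetical name.
rustc_cfg_to_mattr() (
    features=$(sed -n 's/^target_feature="\(.*\)"$/+\1/p' | paste -sd, -)
    printf '%s\n' "-mattr=$features"
)

# Real use would be something like:
#   rustc --print cfg --target wasm32-unknown-emscripten -C target-cpu=generic | rustc_cfg_to_mattr
# Here the relevant lines from the output above are fed in directly.
rustc_cfg_to_mattr <<'EOF'
target_arch="wasm32"
target_endian="little"
target_feature="multivalue"
target_feature="mutable-globals"
target_feature="reference-types"
target_feature="sign-ext"
EOF
# prints: -mattr=+multivalue,+mutable-globals,+reference-types,+sign-ext
```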

Reference

  • https://github.com/kassane/sokol-d/issues/66

kassane avatar Apr 29 '25 15:04 kassane

@kinke,

Based on the code linked below, CPU features are not auto-detected. Is that right? https://github.com/ldc-developers/ldc/blob/20d22b1e36242423e073b246a621c46217a92277/driver/targetmachine.cpp#L525-L540

I'm no LLVM expert! However, it helps to understand why zig (in its bootstrap) runs llvm-tblgen with --dump-json on the targets and parses the output into its own language, so that the compiler can interoperate with LLVM targets: https://github.com/ziglang/zig/blob/a843be44a0cd88d152116a11ba75d059bbb01073/tools/update_cpu_features.zig#L1543-L1558 rust-bootstrap: https://github.com/rust-lang/rust/blob/f97b3c6044a67e0b5d0d0891ca3a6c5d982b2285/src/bootstrap/src/core/build_steps/llvm.rs#L476-L490

  • https://github.com/llvm/llvm-project/issues/28596

kassane avatar May 02 '25 17:05 kassane

IIRC, the printed features are the explicit ones (-mattr), plus one of the few set here implicitly, like that +cx16. AFAIU, a -mcpu selects the min CPU model and so a set of implicit -mattr features for that CPU. I don't see any point in querying and re-adding all of these features explicitly, let alone using llvm-tblgen for such stuff.
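Since the -vv line reports only the explicit features, an empty list there does not mean the CPU has none. A small sketch for pulling that field out of the "Targeting ..." line, using the line format quoted in the issue; explicit_features is a hypothetical helper name.

```shell
# Extract the text between the quotes after "with features" from ldc2 -vv
# output read on stdin. explicit_features is a hypothetical helper; the
# pattern follows the "Targeting ..." lines quoted in the issue.
explicit_features() {
    sed -n "s/^Targeting '.*' (CPU '.*' with features '\(.*\)')\$/\1/p"
}

line="Targeting 'wasm32-unknown-emscripten' (CPU 'generic' with features '')"
features=$(printf '%s\n' "$line" | explicit_features)
printf 'explicit features: [%s]\n' "$features"
# prints: explicit features: []
```

An empty result here only says that no -mattr was passed; whatever -mcpu implies is still applied by LLVM.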

I don't understand what the actual problem here is. This?

During building tests, it's common to find warnings without errors: '-half-precision' is not a recognized feature for this target (ignoring feature)

If that is a problem/inconvenience, please post a minimal testcase for reproduction. It's not clear to me what's emitting those warnings, hardly LDC I guess.

kinke avatar May 02 '25 17:05 kinke

If you're interested in seeing the state of all features, then this should help (as pointer towards an implementation): https://github.com/ldc-developers/ldc/blob/20d22b1e36242423e073b246a621c46217a92277/driver/cpreprocessor.cpp#L128-L143

This might be nice for wasm32's ~13 features, but for x86_64 it's a different world, with almost 200 features.

kinke avatar May 02 '25 20:05 kinke

During building tests, it's common to find warnings without errors: '-half-precision' is not a recognized feature for this target (ignoring feature)

If that is a problem/inconvenience, please post a minimal testcase for reproduction. It's not clear to me what's emitting those warnings, hardly LDC I guess.

The warning is issued by emcc[^1] (the wasm target feature half-precision was renamed to fp16). It appears because zig cc/clang passes every feature explicitly, either disabled (-) or enabled (+), if available.

A possible fix is to downgrade the emscripten version.
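To make the rename concrete, here is a sketch that rewrites a list of feature toggles, one per line, so that half-precision becomes fp16. The helper name is hypothetical, and the sketch assumes the list is consumed by an LLVM that only knows the new name; an older LLVM would need the original spelling.

```shell
# Rename the wasm feature half-precision to fp16 in a list of toggles
# (one "+name" or "-name" per line on stdin), keeping the sign.
# rename_wasm_features is a hypothetical helper for illustration only.
rename_wasm_features() {
    sed 's/^\([+-]\)half-precision$/\1fp16/'
}

printf '%s\n' -atomics -half-precision +multivalue | rename_wasm_features
# prints:
# -atomics
# -fp16
# +multivalue
```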

build commands
#!/usr/bin/env bash

# zig version 0.14.0 (LLVM 19.1.7)
# ldc2 version 1.41.0-beta (LLVM 19.0.7)
# emcc version 4.0.7 (LLVM 21.0.0-git)

# build object files to libsokol.a [1/11 parts]
# same to:
# sokol_app.c, sokol_gfx.c, sokol_time.c, sokol_audio.c, sokol_fetch.c,
# sokol_shape.c, sokol_gl.c, sokol_memtrack.c, sokol_log.c, sokol_debugtext.c
zig clang \
    $HOME/sokold/src/sokol/c/sokol_glue.c \
    --no-default-config -fno-caret-diagnostics \
    -target wasm32-emscripten-none -fno-PIC -flto=full \
    -MD -MV -MF $HOME/sokold/.zig-cache/tmp/df763147ef79b275-sokol_glue.o.d \
    -fhosted -fomit-frame-pointer -fno-stack-protector -fbuiltin -fno-function-sections \
    -fno-data-sections -fasynchronous-unwind-tables -nostdinc \
    -fno-spell-checking \
    -Xclang -target-cpu -Xclang generic \
    -Xclang -target-feature -Xclang -atomics \
    -Xclang -target-feature -Xclang -bulk-memory \
    -Xclang -target-feature -Xclang -exception-handling \
    -Xclang -target-feature -Xclang -extended-const \
    -Xclang -target-feature -Xclang -half-precision \
    -Xclang -target-feature -Xclang -multimemory \
    -Xclang -target-feature -Xclang +multivalue \
    -Xclang -target-feature -Xclang +mutable-globals \
    -Xclang -target-feature -Xclang -nontrapping-fptoint \
    -Xclang -target-feature -Xclang +reference-types \
    -Xclang -target-feature -Xclang -relaxed-simd \
    -Xclang -target-feature -Xclang +sign-ext \
    -Xclang -target-feature -Xclang -simd128 \
    -Xclang -target-feature -Xclang -tail-call \
    -Os \
    -Werror=date-time \
    -isystem $HOME/.cache/zig/p/N-V-__8AAHpwDwAUM-uM9dsnIV3FWct7km9c8J1cz83bRWjp/upstream/emscripten/cache/sysroot/include \
    -DIMPL \
    -DNDEBUG \
    -DSOKOL_GLES3 -c \
    -o $HOME/sokold/.zig-cache/tmp/df763147ef79b275-sokol_glue.o \
    --serialize-diagnostics $HOME/sokold/.zig-cache/tmp/df763147ef79b275-sokol_glue.o.diag

# build object file
ldmd2 \
    -c -w -preview=all -verrors=context -vgc -vtls -Oz -boundscheck=off \
    --enable-asserts=false --strip-debug -vcolumns \
    -of=$HOME/sokold/.zig-cache/o/dc39857083db5ecff0af3b50ca75bda2/clear.o \
    -cache=$HOME/sokold/.zig-cache/o/1c4365d3cdeaa91c2937abd91ec23273 \
    -disable-verify \
    -Hkeep-all-bodies -i=sokol -i=shaders -i=handmade \
    -I./src $HOME/sokold/examples/clear.d \
    -L-allow-undefined \
    $HOME/sokold/.zig-cache/o/8ecd70a7708706038eb372f91515d66d/assert.d \
    -vdmd \
    -Xcc=-v \
    -P-I$HOME/.cache/zig/p/N-V-__8AAHpwDwAUM-uM9dsnIV3FWct7km9c8J1cz83bRWjp/upstream/emscripten/cache/sysroot/include \
    -Xcc=-DIMPL \
    -Xcc=-DNDEBUG \
    -Xcc=-DSOKOL_GLES3 \
    --flto=full \
    -mtriple=wasm32-unknown-emscripten \
    -mcpu=generic 

# get LDC2 object + libsokol
zig build-obj \
    $HOME/sokold/.zig-cache/o/dc39857083db5ecff0af3b50ca75bda2/clear.o \
    $HOME/sokold/.zig-cache/o/dcff641b8293d77057bc750c82c5615c/libsokol.a \
    -OReleaseSmall -target wasm32-emscripten-none -mcpu generic \
    -I $HOME/sokold/.zig-cache/o/221a4f5b338ffa65e90ea182a1cf987a -lc \
    --verbose-cc \
    --cache-dir $HOME/sokold/.zig-cache \
    --global-cache-dir $HOME/.cache/zig \
    --name clear -static \
    --listen=- 

# build wasm (receive LDC2 object + libsokol)
$HOME/.cache/zig/p/N-V-__8AAHpwDwAUM-uM9dsnIV3FWct7km9c8J1cz83bRWjp/upstream/emscripten/emcc \
    -v -Oz -flto --closure 1 -sASSERTIONS=0 \
    -sUSE_WEBGL2=1 \
    -sNO_FILESYSTEM=1 \
    -sMALLOC='emmalloc' \
    --shell-file=$HOME/sokold/src/sokol/web/shell.html \
    -sSTACK_SIZE=512KB \
    $HOME/sokold/.zig-cache/o/0186ed9ba4e85c938f777701bc47fce0/clear.o \
    $HOME/sokold/.zig-cache/o/dcff641b8293d77057bc750c82c5615c/libsokol.a \
    -o $HOME/sokold/.zig-cache/o/a70585ac4a9d4bf40a06ea7a3e8fbfa9/clear.html 

[^1]: emscripten 4.0.x: LLVM 20 or newer


If you're interested in seeing the state of all features, then this should help (as pointer towards an implementation) [...]

Thanks for mentioning that. It is similar to what zig cc/zig translate-c does: https://github.com/ziglang/zig/blob/3b3c9d2081118c43311d1af726f575b93a6defdd/src/Compilation.zig#L5578-L5591 In addition, I found that the same did not apply to the assembler in upstream llvm/clang, which was recently fixed in LLVM 21:

  • https://github.com/llvm/llvm-project/issues/97517
  • https://github.com/llvm/llvm-project/pull/100714 (fixed)
  • https://github.com/ziglang/zig/issues/10411 (related, wait llvm-upgrade)

However, the missing CPU features are not limited to the wasm target; the same shows up when cross-compiling pure D code:

ldc2 -mtriple=aarch64-unknown-linux-gnu -mcpu=cortex-a72 -vv

output

Targeting 'aarch64-unknown-linux-gnu' (CPU 'cortex-a72' with features '')

  • https://github.com/rust-lang/rust/issues/125033

kassane avatar May 03 '25 15:05 kassane

During building tests, it's common to find warnings without errors:

'-half-precision' is not a recognized feature for this target (ignoring feature)

or (need LLVM 20 - new cpu: lime1)

'lime1' is not a recognized processor for this target (ignoring processor)

[!NOTE] These warnings only happen during release builds with LTO(full) enabled.

kassane avatar May 03 '25 18:05 kassane

These warnings only happen during release builds with LTO(full) enabled.

Okay that finally makes some sense now. I bet they vanish when using an emscripten with matching LLVM version.

kinke avatar May 03 '25 18:05 kinke

My workaround to make ldc2 receive the CPU features for any architecture (for sokol-d only) is https://github.com/kassane/sokol-d/pull/67/commits/8a332fbe69776bf4c63130827eb90c8d1fdef7f5 I don't know how to specify this in dub... 😕

output

for x86_64/native target
ldmd2 -w -preview=all -verrors=context -vgc -vtls -Oz -boundscheck=off --enable-asserts=false --strip-debug -vcolumns -of=/home/kassane/sokold/.zig-cache/o/39ad9355099b21693ab28b95a5f95306/clear -cache=/home/kassane/sokold/.zig-cache/o/1c4365d3cdeaa91c2937abd91ec23273 -disable-verify -Hkeep-all-bodies -i=sokol -i=shaders -i=handmade -I./src /home/kassane/sokold/examples/clear.d -L--no-as-needed -vdmd -Xcc=-v -L-lasound -L-lGL -L-lX11 -L-lXi -L-lXcursor -Xcc=-DIMPL -Xcc=-DNDEBUG -Xcc=-DSOKOL_GLCORE -Xcc=-DSOKOL_DISABLE_WAYLAND
# CPU CONFIG
-mtriple=x86_64-linux-gnu -mcpu=znver3 -mattr=+64bit,+adx,+aes,+allow-light-256-bit,+avx,+avx2,+bmi,+bmi2,+branchfusion,+clflushopt,+clwb,+clzero,+cmov,+crc32,+cx16,+cx8,+f16c,+fast-15bytenop,+fast-bextr,+fast-imm16,+fast-lzcnt,+fast-movbe,+fast-scalar-fsqrt,+fast-scalar-shift-masks,+fast-variable-perlane-shuffle,+fast-vector-fsqrt,+fma,+fsgsbase,+fsrm,+fxsr,+idivq-to-divl,+invpcid,+lzcnt,+macrofusion,+mmx,+movbe,+mwaitx,+nopl,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sbb-dep-breaking,+sha,+shstk,+slow-shld,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+vzeroupper,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves
for wasm32 target
ldmd2 -c -w -preview=all -verrors=context -vgc -vtls -Oz -boundscheck=off --enable-asserts=false --strip-debug -vcolumns -of=/home/kassane/sokold/.zig-cache/o/c35b709b291c0271e0b81643c09b98fe/clear.o -cache=/home/kassane/sokold/.zig-cache/o/1c4365d3cdeaa91c2937abd91ec23273 -disable-verify -Hkeep-all-bodies -i=sokol -i=shaders -i=handmade -I./src /home/kassane/sokold/examples/clear.d -L-allow-undefined /home/kassane/sokold/.zig-cache/o/8ecd70a7708706038eb372f91515d66d/assert.d -vdmd -Xcc=-v -P-I/home/kassane/.cache/zig/p/N-V-__8AAHpwDwAUM-uM9dsnIV3FWct7km9c8J1cz83bRWjp/upstream/emscripten/cache/sysroot/include -Xcc=-DIMPL -Xcc=-DNDEBUG -Xcc=-DSOKOL_GLES3 --flto=full
# CPU CONFIG
-mtriple=wasm32-unknown-emscripten -mcpu=generic -mattr=+bulk-memory,+extended-const,+multivalue,+mutable-globals,+nontrapping-fptoint,+sign-ext

Only macos-m1 needs a feature rename (https://github.com/kassane/sokol-d/pull/67/commits/0adeabbd15889a75268a47c5aefb5abbe3ae9349) to fix:

'+contextidrel2' is not a recognized feature for this target (ignoring feature)
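The linked commits are the actual workaround; the sketch below is not taken from them. It only illustrates one way to collect per-feature flags of the form `-Xclang -target-feature -Xclang <toggle>`, as seen in the zig clang invocation earlier in the thread, into a single -mattr string. The function name is hypothetical.

```shell
# Collect every "-target-feature -Xclang <toggle>" triple from a clang-style
# argument list and print the toggles as one -mattr flag.
# clang_features_to_mattr is a hypothetical helper, not part of any toolchain.
clang_features_to_mattr() (
    joined=""
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-target-feature" ] && [ "$#" -ge 3 ] && [ "$2" = "-Xclang" ]; then
            joined="${joined:+$joined,}$3"
            shift 3
        else
            shift   # any other argument is ignored
        fi
    done
    printf '%s\n' "-mattr=$joined"
)

clang_features_to_mattr \
    -Xclang -target-cpu -Xclang generic \
    -Xclang -target-feature -Xclang -atomics \
    -Xclang -target-feature -Xclang +multivalue \
    -Xclang -target-feature -Xclang +sign-ext
# prints: -mattr=-atomics,+multivalue,+sign-ext
```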

kassane avatar May 03 '25 19:05 kassane

There shouldn't be any need to list the features explicitly (those that are implied by -mcpu), that's what I've been trying to say the whole time. Does it make any difference for your toolchains mix, except for the purely cosmetic -vv output, where we just output the explicit features? E.g., does it work around the LTO warnings when mixing LLVM versions (of the compiler(s) producing the bitcode files, and the lld linker (plugin))? Mixing LLVM versions for LTO isn't really supported; sometimes works when lucky, sometimes breaks (and mostly NOT because of CPU features, but other breaking changes in the LLVM IR).
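A quick way to spot such a mix before attempting LTO is to compare the LLVM major versions the tools report. The sketch below takes the version strings as arguments; how they are obtained from each tool is left out, and both helper names are hypothetical.

```shell
# Print the major component of a version such as "19.1.7" or "21.0.0-git".
llvm_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed only if all given versions share the same major number.
# Both helpers are hypothetical names used for this illustration.
same_llvm_major() (
    first=$(llvm_major "$1")
    for version in "$@"; do
        [ "$(llvm_major "$version")" = "$first" ] || exit 1
    done
)

# Versions reported earlier in the thread: zig's clang 19.1.7, emcc's LLVM 21.0.0-git.
if same_llvm_major 19.1.7 21.0.0-git; then
    echo "LLVM majors match"
else
    echo "LLVM majors differ"
fi
# prints: LLVM majors differ
```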

kinke avatar May 03 '25 20:05 kinke

Okay so what clang e.g. does for x86_64 and -march=nehalem is adding these 2 IR function attributes:

"target-cpu"="nehalem"
"target-features"="+cmov,+crc32,+cx16,+cx8,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87"

With LDC's -mcpu=nehalem, we only emit "target-cpu"="nehalem" and don't expand the implicit features explicitly.

I'm not aware of this causing any issues so far, e.g., when building LDC itself with mixed D and C++ (full) LTO. Edit: Oh well, we don't use -mcpu there. Edit2: But still default to "target-cpu"="x86-64" "target-features"="+cx16", whereas clang to "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic".
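The difference between the two defaults can be listed mechanically. A minimal sketch, with a hypothetical helper name, that prints the toggles present in one comma-separated "target-features" string but absent from another:

```shell
# Print, one per line, the toggles found in the first comma-separated list
# but not in the second. missing_features is a hypothetical helper.
missing_features() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r feature; do
        [ -n "$feature" ] || continue
        case ",$2," in
            *",$feature,"*) ;;                  # present in both lists
            *) printf '%s\n' "$feature" ;;      # only in the first list
        esac
    done
}

# clang's default for x86-64 versus LDC's default, as quoted above.
missing_features "+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "+cx16"
```

With these inputs every clang toggle is printed, since LDC's explicit list contains only +cx16. This compares the explicit strings only; it says nothing about what the "target-cpu" attribute already implies.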

kinke avatar May 03 '25 20:05 kinke

I'm not aware of this causing any issues so far, e.g., when building LDC itself with mixed D and C++ (full) LTO. Edit: Oh well, we don't use -mcpu there. Edit2: But still default to "target-cpu"="x86-64" "target-features"="+cx16", whereas clang to "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic".

I can't say for sure. But this may not be relevant for bindings/wrappers written in D, as opposed to code rewritten in D.

Years ago, I had headaches cross-compiling OpenSSL encrypt/decrypt code in C with clang for aarch64: without the specific CPU features, performance suffered (generic vs. cortex-a72).

I'm thinking of replicating a similar test in ldc2, using a crypto API for mini-benchmarking. Preferably, no D-bindings!

kassane avatar May 03 '25 22:05 kassane