Expose CPU model (and features?) as compile time flags
In #14393, which adds support for AVR microcontrollers, the target CPU model must be specified (it impacts a lot of things, from codegen to the linker), and I exposed it as a compile-time flag, so we can decide to implement something differently depending on the CPU model, for example to abstract which pins are available, among other things.
Now, it might be interesting to always expose the CPU model when it has been specified with `--mcpu`. It might be useful when targeting ARM?
As a complement, or as an alternative, it might be even more useful to expose CPU features for x86 or ARM, for example `avx2`, `cx16`, `neon` or `lse`.
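To illustrate what this would enable (the flag names below are hypothetical, nothing here is implemented yet), code could select an implementation at compile time:

```crystal
# Hypothetical flag names; the compiler does not define these today.
{% if flag?(:avx2) %}
  # vectorized path for x86 CPUs with AVX2 (32-byte registers)
  SIMD_WIDTH = 32
{% elsif flag?(:neon) %}
  # vectorized path for ARM NEON (16-byte registers)
  SIMD_WIDTH = 16
{% else %}
  # portable scalar fallback
  SIMD_WIDTH = 1
{% end %}
```

This is the same pattern shards already use with the platform flags, just extended to CPU capabilities.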
Of course, that would increase the surface of flags even more, but this is already happening in a way that is hard to control anyway.
Reference: https://github.com/crystal-lang/crystal/pull/14393#discussion_r1574697868
A measure to somewhat curb uncontrolled growth would be to expose the flags with a prefix that clearly designates them as CPU flags. For example, the flag for `--mcpu=atmega328p` could be `flag?(:cpu_atmega328p)`.
I'm not sure if this is really necessary, but it might be required if there's any chance of collision between flag names derived from different sources.
For reference: currently supported flags based on the target platform.
The feature flags are effectively the compile-time analog of #2824. I suggest naming them like `flag?(:x86_has_sse2)`.
So far I'm not aware of any LLVM C APIs that expose this information, e.g. the fact that SSE2 is available in the x86 baseline unless you pass a conflicting `-mattr`. It looks like Crystal itself needs to keep an up-to-date, processed copy of `llvm/lib/Target/X86/X86.td` from time to time. We could avoid parsing the TableGen files directly by doing something like `llvm-tblgen -I ../../../include X86.td --dump-json | jq 'map_values(select(.["!superclasses"]? | index("ProcessorModel")))'`. Also for reference, here is the code that defines the corresponding C macros in Clang.
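The same filtering could live in Crystal instead of jq. A minimal sketch, assuming the record shape implied by the command above (each definition carries a `!superclasses` array); a real `X86.td` dump contains many more keys, the sample data here is invented:

```crystal
require "json"

# Filter an `llvm-tblgen --dump-json` dump down to processor models,
# mirroring the jq invocation: keep records whose "!superclasses"
# array contains "ProcessorModel".
dump = JSON.parse <<-JSON
  {
    "!tablegen_json_version": 1,
    "haswell": {"!superclasses": ["ProcessorModel"], "Features": ["FeatureAVX2"]},
    "FeatureAVX2": {"!superclasses": ["SubtargetFeature"]}
  }
  JSON

processors = dump.as_h.select do |_, record|
  superclasses = record.as_h?.try &.["!superclasses"]?
  !!superclasses.try(&.as_a.includes?("ProcessorModel"))
end

puts processors.keys.join(", ") # prints "haswell"
```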
There's `LLVMGetHostCPUFeatures()`, but it only works when the CPU is set to `native`, of course.
This is a proof of concept to extract the feature flags from LLVM's TableGen files (someone on the LLVM Discord server suggested that this is indeed the way to go): https://github.com/HertzDevil/crystal/commit/52dac16e6afb91e34acf9a5a0be4c85b4567d4c4
The default model is `generic`. The equivalent model for `native` can be obtained using `LLVMGetHostCPUName`. The PoC alone is insufficient, since that function may return an alias, e.g. `apple-m2` instead of `apple-a15`, yet all the feature flags are defined in terms of the latter.
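For reference, these LLVM-C calls can be bound directly from Crystal. A sketch (the lib name, link flags and binding names are assumptions; the returned strings must be released with `LLVMDisposeMessage`):

```crystal
# Sketch of binding the host-CPU queries from llvm-c/TargetMachine.h.
# The @[Link] argument depends on how LLVM is installed on the system.
@[Link("LLVM")]
lib LibLLVMHost
  fun host_cpu_name = LLVMGetHostCPUName : LibC::Char*
  fun host_cpu_features = LLVMGetHostCPUFeatures : LibC::Char*
  fun dispose_message = LLVMDisposeMessage(message : LibC::Char*)
end

name = LibLLVMHost.host_cpu_name
puts String.new(name) # e.g. "haswell", or an alias such as "apple-m2"
LibLLVMHost.dispose_message(name)
```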
Note that LLVM also adds some default flags on its own, for example this is where SSE2 is enabled, even though the generic model doesn't declare it.
It looks like LLVMGetHostCPUFeatures isn't implemented on Apple Silicon yet.
Also, apparently those flag names aren't always forward-compatible across LLVM versions: https://github.com/llvm/llvm-project/pull/96246
I'm not sure if we want to use the CPU and feature names as flags directly. Apart from `x86_has_sse2`, I think the CPU should be exposed as `x86_cpu_haswell`.
We should also preserve all casing and symbols, i.e. `flag?("aarch64_cpu_apple-a15")` and `flag?("aarch64_has_v8.3a")`, since it is easier to cross-reference our flags with LLVM's definitions this way.
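Under that naming scheme, deriving the flag names is mechanical. A hypothetical helper (the method name and signature are invented for illustration, this is not compiler code):

```crystal
# Hypothetical helper; nothing like this exists in the compiler yet.
# Flag names keep LLVM's casing and symbols verbatim, e.g. "apple-a15", "v8.3a".
def cpu_flag_names(arch : String, cpu : String, features : Array(String)) : Array(String)
  ["#{arch}_cpu_#{cpu}"] + features.map { |f| "#{arch}_has_#{f}" }
end

cpu_flag_names("aarch64", "apple-a15", ["v8.3a", "lse"])
# => ["aarch64_cpu_apple-a15", "aarch64_has_v8.3a", "aarch64_has_lse"]
```

Since the names contain `-` and `.`, they would only be reachable through the string form `flag?("...")`, not the symbol form.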
Also, I guess plain flags are the most backward-compatible option, meaning a 1.0.0 compiler will not error if it sees an unknown flag, but are there any other options in the macro language?