Warn when --iree-llvmcpu-target-cpu defaults to "generic".
Progress on https://github.com/iree-org/iree/issues/18561.
This PR supersedes https://github.com/iree-org/iree/pull/18587. It introduces a new command line flag, --iree-llvmcpu-logging-unspecified-target-cpu, with three values:
- Empty string (the default) preserves the current behavior of silently falling back to
genericwhen the CPU and CPU features are unspecified. "warn"causes helpful messages to be printed on stderr."fatal"also makes that fatal.
Along the way, this PR refactors this code into a separate file, and folds a few improvements from https://github.com/iree-org/iree/pull/18587.
Incremental migration plan:
- Land this PR. No observable difference.
- Locally change the default to
"warn"or even"fatal"and fix all the in-tree instances. - Change the default to
"warn". - Wait > 3 months.
- Debate changing the default to
"fatal", and if so, decide how"fatal"should be implemented. The implementation in this PR with justexit(EXIT_FAILURE)may only be appropriate for a non-default debug flag.
Sample logging with --iree-llvmcpu-logging-unspecified-target-cpu=warn:
When the target triple is x86:
error: Please pass --iree-llvmcpu-target-cpu (or alternatively, --iree-llvmcpu-target-cpu-features, but this message is tailored to the current target architecture, x86, where --iree-llvmcpu-target-cpu is used more often).
Examples:
--iree-llvmcpu-target-cpu=host
Target the host CPU. The generated code will have optimal performance on the host CPU but will crash on other CPUs not supporting the same CPU features.
--iree-llvmcpu-target-cpu=generic
Target a generic CPU for the target architecture. The generated code will have poor performance unless CPU features are also specified by --iree-llvmcpu-target-cpu-features.
--iree-llvmcpu-target-cpu=somecpu
Target the specified `somecpu` (see accepted values below). The generated code will have optimal performance on the specified CPU, but will crash on older CPUs and will have suboptimal performance on newer CPUs.
Accepted CPU values:
nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, arrowlake, arrowlake-s, lunarlake, gracemont, pantherlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, clearwaterforest, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, znver5, x86-64, x86-64-v2, x86-64-v3, x86-64-v4
When the target triple is NOT x86:
error: Please pass --iree-llvmcpu-target-cpu-features and/or -iree-llvmcpu-target-cpu. Outside of x86, --iree-llvmcpu-target-cpu-features is the preferred way of enabling CPU features.
Examples:
--iree-llvmcpu-target-cpu-features=+somefeature1,+somefeature2,...
Enable the comma-separated list of CPU features (each prefixed with a + sign).
--iree-llvmcpu-target-cpu=host
Target the host CPU. The generated code will have optimal performance on the host CPU but will crash on other CPUs not supporting the same CPU features.
--iree-llvmcpu-target-cpu=generic
Target a generic CPU for the target architecture. The generated code will have poor performance unless CPU features are also specified by --iree-llvmcpu-target-cpu-features.
--iree-llvmcpu-target-cpu=somecpu
WARNING: outside of x86, this flag may not actually populate CPU features. It may only serve to select the instruction scheduling model.
WDYT @benvanik @stellaraccident @ScottTodd ?
nit about GitHub: Please put tags like these in comments, not commit messages (or PR descriptions that then become commit messages). If someone pushes a commit containing such a tag to their fork, it will notify those users. I don't really need to know whenever someone rebases or otherwise rewrites history in a fork :P
@ScottTodd , @benvanik , @stellaraccident , I just pushed what I have. It addresses all of @ScottTodd 's earlier review comments, and produces the expected results in iree-compile command-line invocations with target-backends=llvm-cpu. The main problem that I am struggling with is that it also generates the warning message about the generic GPU fallback... for other target-backends. That is because at the moment, the error message is being printed from LLVMCPUTargetCLOptions::getTargetOptions, which is called at target backend registration, and all target backends are registered regardless of which one is actually used.
I don't know how to fix this. @benvanik suggested buildConfigurationPassPipeline, but that too is being called for the LLVMCPU backend in a iree-compile invocation with vulkan-spirv.
After chatting some more with @ScottTodd , pushed a 2nd commit that makes it work AFAICS: https://github.com/iree-org/iree/pull/18682/commits/18748f7bd02e6029e36a56074f9fa6f7a23c4527
WDYT?
After chatting some more with @ScottTodd , pushed a 2nd commit that makes it work AFAICS: 18748f7
WDYT?
That seems ok to me.
@ScottTodd , this feels ready for review now.
I have followed your suggestion to drop TARGET_CPU and TARGET_CPU_FEATURES. While doing do, I found the place which had originally motivated them: this is where we pass a --requirements flag to the test runner so that it can check at runtime that the host CPU supports the expected features. Now that there isn't a separate TARGET_CPU_FEATURES field anymore, we resort to parsing the COMPILER_FLAGS, which is not too bad and is a tiny detail, so it is the right call to deal with it rather than have those TARGET_CPU and TARGET_CPU_FEATURES flags everywhere.
The TARGET_CPU_FEATURES_VARIANTS field used to accept a special value "default" and default to that. I renamed that to "generic" because that's what it does (so let's not have two different names for that same configuration, "default" and "generic") and I added support for "host" so we can easily add tests that test host codegen and/or decide to default to testing that in addition to "generic" in a follow-up.
As part of this CL, I had to add "--iree-llvmcpu-target-cpu=generic" to a number of iree_check_single_backend_test_suite. If those were iree_check_test_suite, this wouldn't be needed, as those take TARGET_CPU_FEATURES_VARIANTS which default to "generic". This discrepancy between iree_check_single_backend_test_suite and iree_check_test_suite seems unfortunate and not reflected in the names. We should address that in a follow-up. My preferred resolution actually would be to drop iree_check_single_backend_test_suite other than as a local implementation helper within iree_check_test.cmake. Everyone should be able to just use iree_check_test_suite, and it's not even more verbose.
I have followed your suggestion to drop
TARGET_CPUandTARGET_CPU_FEATURES. While doing do, I found the place which had originally motivated them: this is where we pass a--requirementsflag to the test runner so that it can check at runtime that the host CPU supports the expected features. Now that there isn't a separateTARGET_CPU_FEATURESfield anymore, we resort to parsing theCOMPILER_FLAGS, which is not too bad and is a tiny detail, so it is the right call to deal with it rather than have thoseTARGET_CPUandTARGET_CPU_FEATURESflags everywhere.
Ah. That's on my list to remove too: https://github.com/iree-org/iree-test-suites/blob/3a0ea13cbdc954365d653a48cc99cc63a9ff09b3/linalg_ops/matmul/generate_e2e_matmul_tests.py#L529-L531
# TODO(scotttodd): drop this and whatever logic in the test tool used it
# multiple backends should be able to use the same input IR, so the
# input IR shouldn't need things like CPU features in it
Also for code review, please try to avoid force pushing. I don't see a way to view the diff between when I last reviewed and the latest code >_>
(GitHub has a few ways to view diffs between commits, but force pushing with a rebase defeats most of them)
- Changes since last review shows no diff
- "Compare" on the force push shows all changes, including commits unrelated to the PR
@ScottTodd , for my education - when my PR has a conflict with main, so I have to rebase it, that in itself seems to require a force-push, right -- so do I correctly understand your suggestion in such cases to refrain from squashing the commits in this PR, so that the force-push of the rebased first commit doesn't prevent you from looking at the other commits on top of it?
@ScottTodd , for my education - when my PR has a conflict with
main, so I have to rebase it, that in itself seems to require a force-push, right -- so do I correctly understand your suggestion in such cases to refrain from squashing the commits in this PR, so that the force-push of the rebased first commit doesn't prevent you from looking at the other commits on top of it?
git merge upstream/main then resolve conflicts should avoid the need to rebase or force push.
TIL, thanks. Because we ultimately rebase-and-squash when merging PRs, I had followed that workflow also locally. It hadn't occurred to me that I could create merge commits locally and on my PR branch, as that would still ultimately get rebased-and-squashed on merging.
@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.
Build failures on MSVC: https://github.com/iree-org/iree/actions/runs/11401187470/job/31723569010#step:7:7793
@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.
See the pinned issue: https://github.com/iree-org/iree/issues/18432
@ScottTodd
Build failures on MSVC: https://github.com/iree-org/iree/actions/runs/11401187470/job/31723569010#step:7:7793
https://github.com/iree-org/iree/pull/18830
@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.
See the pinned issue: #18432
@bjacob I added a bullet to the rolling release notes there:
- Added warnings for when
--iree-llvmcpu-target-cpu=or--iree-llvmcpu-target-cpu-featuresare missing, since the default of "generic" is slow. These flags will be required in the future: 0c6a151
When we next cut a stable release, the release notes on that issue will be included. Feel free to edit the issue if you have other phrasing you want us to use.