iree
iree copied to clipboard
Compiling for llvm-cpu without targeting a specific CPU is a bad experience
I've seen multiple people falling down this hole: they run iree-compile on their model, targeting CPU. Then they get performance that is 10x-100x off of any reasonable expectation. Then they either go away silently or report back about poor experiences (not always reporting flags and such).
There are good reasons why a compiler like IREE shouldn't make assumptions about what the CPU target is, but on the other hand, it will almost always produce a a grossly subpar experience to not specify a target CPU, since the generic target (at least on X86) lacks so many features as to be basically useless for any high performance numerics.
I've even fallen down this hole recently and had to go remember the incantation to select a specific CPU. In the case I was working on (an f16 CPU LLM), performance was 100x different between not specifying a target CPU and specifying "host". We need to guide people better than this.
Proposal
As mentioned, there are good reasons for a compiler to not make too many assumptions without being told what to do. But I think we can/should actively warn, possibly with a link to the documentation site when the compiler is invoked without the user specifying a target CPU. Whatever the warning is should be very explicit that the user should pass --iree-llvmcpu-target-cpu=host
to target the precise CPU they are running on. We should possibly also accept "generic" or something like that for if the user really wants to target the default and not get the warning. I basically want to guard against the case where the user has not specified anything and the compiler just silently generates 100x too slow of code. In almost all cases, it will be better for the user to say something and we should guide them on a proper choice.