Offer attributes for controlling loop optimizations
For example, Clang has #pragma clang loop, which allows the programmer to guide loop unrolling, vectorization, and other optimizations. This is useful in high-performance code because heuristics are often fallible, and nudging the optimizer in the right direction can sometimes squeeze out some more performance (for a particular optimizer version, of course).
In Rust, the natural replacement for a pragma would probably be an attribute.
Previous discussion: https://internals.rust-lang.org/t/loop-unrolling-on-request/3091 (pointed out in #rust-internals by lqd)
In that discussion, the possibility of a (likely procedural) macro for unrolling specifically was brought up. That would be a rather blunt tool, as it wouldn't integrate with the optimizer. It would also not cover use cases for limiting the optimizer (i.e., preventing unrolling that would normally occur).
It also doesn't address knobs for optimizations other than loop unrolling.
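To make the "blunt tool" point concrete, here is a minimal sketch of what macro-based unrolling could look like. The `unroll4!` macro and the `sum_unrolled` function are hypothetical names made up for illustration; they are not taken from the linked discussion. The macro only pastes the loop body four times per iteration, so the optimizer never learns that unrolling was requested, and there is no way to use it to say "do not unroll this loop".

```rust
// Hypothetical macro, for illustration only: repeats the body four times per
// iteration of the main loop, then finishes the leftover iterations one by one.
macro_rules! unroll4 {
    ($i:ident in $start:expr, $end:expr, $body:block) => {{
        let mut $i: usize = $start;
        let end: usize = $end;
        // Main loop: four copies of the body per iteration.
        while $i + 4 <= end {
            $body;
            $i += 1;
            $body;
            $i += 1;
            $body;
            $i += 1;
            $body;
            $i += 1;
        }
        // Remainder loop: at most three leftover iterations.
        while $i < end {
            $body;
            $i += 1;
        }
    }};
}

// Sums a slice using the macro above; the result matches a plain loop.
fn sum_unrolled(xs: &[u64]) -> u64 {
    let mut sum = 0u64;
    unroll4!(i in 0, xs.len(), {
        sum += xs[i];
    });
    sum
}
```

Note that the unroll factor is baked into the source here, regardless of target or optimization level, which is exactly the kind of decision a hint to the optimizer would leave open.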
My proposal was meant to integrate with the optimizer. In that proposal, #[unroll(never)] is for limiting unrolling, which is equivalent to:
#pragma clang loop unroll(disable)
#[unroll]
That's equal to:
#pragma clang loop unroll(enable)
#[unroll(8)]
That's equal to:
#pragma clang loop unroll_count(8)
#[unroll(try_full)]
That's equal to:
#pragma clang loop unroll(full)
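The mapping above can be summarized in a small sketch. The `UnrollHint` enum and its `clang_pragma` method are hypothetical names used only to model the four proposed forms and the Clang pragma each one corresponds to; nothing here is an existing compiler API. The `dot` function shows where the attribute would sit, written as a comment because it is only a proposal.

```rust
// Hypothetical model of the arguments of the proposed `#[unroll]` attribute.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnrollHint {
    Never,      // #[unroll(never)]
    Enable,     // #[unroll]
    Count(u32), // #[unroll(8)]
    TryFull,    // #[unroll(try_full)]
}

impl UnrollHint {
    // Returns the Clang pragma that the proposal describes as equivalent.
    fn clang_pragma(self) -> String {
        match self {
            UnrollHint::Never => "#pragma clang loop unroll(disable)".to_string(),
            UnrollHint::Enable => "#pragma clang loop unroll(enable)".to_string(),
            UnrollHint::Count(n) => format!("#pragma clang loop unroll_count({})", n),
            UnrollHint::TryFull => "#pragma clang loop unroll(full)".to_string(),
        }
    }
}

// Where the attribute would be placed under this proposal.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = 0.0;
    // #[unroll(8)]  // proposed syntax only, kept as a comment so this compiles
    for i in 0..a.len().min(b.len()) {
        acc += a[i] * b[i];
    }
    acc
}
```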
Regarding the knobs for other optimizations, my proposal doesn't prevent them. If you want, you can add other attributes later:
#[vectorize_width(2)]
Similar to:
#pragma clang loop vectorize_width(2)
And:
#[interleave_count(2)]
Similar to:
#pragma clang loop interleave_count(2)
The llvm.loop metadata interface itself actually seems like a pretty well-named set of traits. In particular, I like the idea of the names having a sort of namespacing using ., so that the loop-related traits all start with llvm.loop. Exposing these directly would introduce an undesirable connection between the frontend and the backend (making alternate backends less viable), but why not simply base the frontend attributes on the LLVM names for now, using namespacing in a similar fashion?
I.e., something like:
#[optimization_hint.loop.<LLVM metadata trait>]
So, for example:
#[optimization_hint.loop.interleave(4)]
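As a rough sketch of how such a naming scheme could stay decoupled from the backend, the frontend name can be treated as plain data that an LLVM backend rewrites into a metadata name. The function `to_llvm_metadata_name` is hypothetical and assumes the frontend path mirrors the LLVM name exactly after the `optimization_hint.` prefix (for example `loop.interleave.count` for LLVM's `llvm.loop.interleave.count`); a different backend would be free to map or ignore the same names.

```rust
// Hypothetical lowering step for an LLVM-based backend: swap the frontend
// prefix for `llvm.`, keeping the rest of the dotted path unchanged.
// Returns None for names outside the `optimization_hint.loop.` namespace.
fn to_llvm_metadata_name(hint: &str) -> Option<String> {
    let rest = hint.strip_prefix("optimization_hint.loop.")?;
    if rest.is_empty() {
        return None;
    }
    Some(format!("llvm.loop.{}", rest))
}
```

Under this assumption, `optimization_hint.loop.unroll.count` would lower to `llvm.loop.unroll.count`, while an unrelated name such as `repr.align` is rejected.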
I believe this is potentially something for @rust-lang/wg-codegen to weigh in on?
(small nit) In Rust we'd probably use #[optimization_hint(loop(interleave(4)))] instead.
Or #[optimization_hint::loop::interleave(4)], but it'd be a first - currently we don't use the path syntax for builtin attributes, so we ended up with #[repr(align(4))], instead of #[repr::align(4)].
Personally, I think #[foo::bar::baz::<etc>(arg)] would be much better than #[foo(bar(baz<etc>(arg)))<many parens>)))]. After all, we may have quite a few functional features, but this isn't Lisp! Two layers of paren-nesting, as in repr(align(4)), seems fine, but three is pushing it, in my opinion.
I would really love to see this implemented
So, is there any roadmap already for supporting loop unrolling? At least a compiler hint, like inline?
This needs somebody to author an RFC in order to move forward.