
Offer attributes for controlling loop optimizations

Open hanna-kruppe opened this issue 8 years ago • 8 comments

For example, Clang has #pragma clang loop, which allows the programmer to guide loop unrolling, vectorization, and other optimizations. This is useful in high-performance code because heuristics are often fallible, and nudging the optimizer in the right direction can sometimes squeeze out some more performance (for a particular optimizer version, of course).

In Rust, the natural replacement for a pragma would probably be an attribute.

hanna-kruppe avatar Nov 15 '17 22:11 hanna-kruppe

Previous discussion: https://internals.rust-lang.org/t/loop-unrolling-on-request/3091 (pointed out in #rust-internals by lqd)

hanna-kruppe avatar Nov 15 '17 22:11 hanna-kruppe

In that discussion, the possibility of a (likely procedural) macro specifically for unrolling was brought up. That would be a rather blunt tool, as it wouldn't integrate with the optimizer. It also wouldn't cover use cases for limiting the optimizer (i.e., preventing unrolling that would normally occur).

It also doesn't address knobs related to optimizations other than loop unrolling.
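To make the "blunt tool" point concrete, here is a minimal sketch of source-level unrolling with a declarative macro. The `unroll4!` macro and `sum_unrolled` function are hypothetical names invented for this illustration; they do not come from the linked discussion. The macro simply pastes the body four times, so the optimizer receives no hint it can weigh or ignore, and nothing comparable can express "do not unroll".

```rust
/// Hypothetical helper macro: repeats `$body` four times at the source level.
macro_rules! unroll4 {
    ($body:block) => {{
        $body;
        $body;
        $body;
        $body;
    }};
}

/// Sums a slice, processing four elements per iteration of the main loop.
fn sum_unrolled(data: &[u32]) -> u32 {
    let mut total = 0u32;
    let mut i = 0;
    // Main loop: the macro expands to four copies of the body.
    while i + 4 <= data.len() {
        unroll4!({
            total = total.wrapping_add(data[i]);
            i += 1;
        });
    }
    // Remainder loop for the last 0 to 3 elements.
    while i < data.len() {
        total = total.wrapping_add(data[i]);
        i += 1;
    }
    total
}
```

Calling `sum_unrolled(&[1, 2, 3, 4, 5, 6, 7])` runs the main loop once and the remainder loop three times. The unroll factor is baked into the source, which is exactly the limitation described above.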

hanna-kruppe avatar Nov 15 '17 22:11 hanna-kruppe

My proposal was meant to integrate with the optimizer. In that proposal, #[unroll(never)] limits unrolling and is equivalent to: #pragma clang loop unroll(disable)

#[unroll] is equivalent to: #pragma clang loop unroll(enable)

#[unroll(8)] is equivalent to: #pragma clang loop unroll_count(8)

#[unroll(try_full)] is equivalent to: #pragma clang loop unroll(full)

Regarding the knobs for other optimizations, my proposal doesn't prevent them. Other attributes can be added later if desired:

#[vectorize_width(2)] Similar to: #pragma clang loop vectorize_width(2)

And: #[interleave_count(2)] Similar to: #pragma clang loop interleave_count(2)
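As a rough sketch of how such attributes might sit on a loop, the snippet below marks a dot-product loop with two of the proposed hints. The attributes are hypothetical and rustc does not accept them today, so they appear only as comments to keep the code compilable; the `dot` function is likewise just an illustration.

```rust
/// Dot product over the common prefix of two slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    // Hypothetical hints from the proposal (NOT valid Rust today):
    // #[unroll(8)]
    // #[vectorize_width(2)]
    for (x, y) in a.iter().zip(b.iter()) {
        acc += x * y;
    }
    acc
}
```

The loop behaves the same with or without the hints. As with the Clang pragmas, they would only influence how the optimizer compiles it.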

leonardo-m avatar Nov 16 '17 10:11 leonardo-m

The llvm.loop metadata interface itself actually seems like a pretty well-named set of traits. In particular, I like the idea of the names having a sort of namespacing using ., so that the loop-related traits all start with llvm.loop. Exposing these directly would introduce an undesirable connection between the frontend and the backend (making alternate backends less viable), but why not simply base the frontend attributes on the LLVM names for now, using namespacing in a similar fashion?

I.e., something like:

#[optimization_hint.loop.<LLVM metadata trait>]

So, for example:

#[optimization_hint.loop.interleave(4)]

BatmanAoD avatar May 28 '18 19:05 BatmanAoD

I believe this is potentially something for @rust-lang/wg-codegen to weigh in on?

BatmanAoD avatar May 29 '18 16:05 BatmanAoD

(small nit) In Rust we'd probably use #[optimization_hint(loop(interleave(4)))] instead. Or #[optimization_hint::loop::interleave(4)], but it'd be a first - currently we don't use the path syntax for builtin attributes, so we ended up with #[repr(align(4))], instead of #[repr::align(4)].
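For comparison, this is what the existing nested form looks like in compilable code. Only `#[repr(align(4))]` is real syntax here; the two `optimization_hint` spellings are the hypothetical ones from this thread and are shown as comments. The `Aligned` type and `alignment` function are names made up for the example.

```rust
use std::mem::align_of;

// Real, stable syntax: `align(4)` is nested inside `repr(...)`.
#[allow(dead_code)]
#[repr(align(4))]
struct Aligned(u8);

// Hypothetical spellings discussed in this thread (not accepted by rustc):
//   #[optimization_hint(loop(interleave(4)))]
//   #[optimization_hint::loop::interleave(4)]

/// Returns the alignment that the `repr` attribute requested.
fn alignment() -> usize {
    align_of::<Aligned>()
}
```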

eddyb avatar May 29 '18 17:05 eddyb

Personally, I think #[foo::bar::baz::<etc>(arg)] would be much better than #[foo(bar(baz<etc>(arg)))<many parens>)))]. After all, we may have quite a few functional features, but this isn't Lisp! Two layers of paren-nesting, as in repr(align(4)), seems fine, but three is pushing it, in my opinion.

BatmanAoD avatar May 29 '18 20:05 BatmanAoD

I would really love to see this implemented

paulabrudanandrei avatar Jun 24 '24 09:06 paulabrudanandrei

So, is there already a roadmap for supporting loop unrolling? At least as a compiler hint, like inline?

Aki0x137 avatar Jan 24 '25 11:01 Aki0x137

This needs somebody to author an RFC in order to move forward.

tgross35 avatar Jan 27 '25 20:01 tgross35