Use vectorcall for clang as well

Open logankaser opened this issue 3 years ago • 3 comments

Clang supports vectorcall (https://clang.llvm.org/docs/AttributeReference.html#vectorcall), so it could be used when compiling with clang as well, instead of only with MSVC.
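For illustration, a minimal sketch of what selecting the calling convention per compiler could look like. The macro name `EXAMPLE_SIMD_CALL`, the `float4` struct, and `example_dot` are hypothetical and not taken from the library; the sketch also assumes the attribute should only be applied on x86 targets, and falls back to the default convention everywhere else.

```cpp
// Hypothetical macro: picks vectorcall where the compiler and target support it.
#if defined(__clang__) && (defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86))
    // Clang (including clang-cl) on x86: use the attribute spelling from the clang docs.
    #define EXAMPLE_SIMD_CALL __attribute__((vectorcall))
#elif defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    // MSVC on x86/x64: use the keyword spelling.
    #define EXAMPLE_SIMD_CALL __vectorcall
#else
    // Any other compiler or target: keep the default calling convention.
    #define EXAMPLE_SIMD_CALL
#endif

// Hypothetical stand-in for a SIMD vector type, kept scalar so the sketch builds anywhere.
struct float4
{
    float x, y, z, w;
};

// The macro sits between the return type and the function name,
// which is accepted for both spellings above.
inline float EXAMPLE_SIMD_CALL example_dot(float4 a, float4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```

Call sites do not change: `example_dot(float4{1.0f, 2.0f, 3.0f, 4.0f}, float4{5.0f, 6.0f, 7.0f, 8.0f})` computes the same dot product under either convention. Only the way arguments are passed differs, which is why the impact has to be measured rather than assumed.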

logankaser, Oct 04 '22 19:10

Good idea, I hadn't noticed this. There seem to be subtle differences between clang's default calling convention and vectorcall. By default, clang will pass up to 8 vectors by value in registers, but it doesn't appear to return aggregates by value in registers, and I'm not sure whether it passes aggregates by value; I'd have to double check. On the other hand, vectorcall supports only up to 6 vector arguments by value, but it handles aggregates better.

We have to measure the impact in a real application to make sure it yields a net win.

nfrechette, Oct 06 '22 01:10

It says on that page:

Homogeneous vector aggregates of up to four elements are passed in sequential SSE registers if enough are available

But of course, for this kind of thing, testing and benchmarking is definitely the way to go.

logankaser, Oct 12 '22 19:10