Better Vectorization for GCC

Open citibeth opened this issue 9 years ago • 5 comments

Patrik Jonsson wrote:

One more thing that can be added to the improvements list is better handling of vectorization. The last big update to blitz was when I added support for vectorization by making it more obvious to the compiler when arrays were contiguous. However, this relies on the compiler to do the actual vectorization. The Intel compiler was quite good at this, but as far as I remember gcc does not vectorize these loops at all. Since the majority of users probably use gcc, this is a substantial disadvantage. If someone wanted to look into ways to add explicitly vectorized operations, I think that would greatly improve blitz's performance under gcc. It does require diving deep into the guts of the expression template mechanism, though.

citibeth avatar Jan 19 '16 17:01 citibeth

How about even generating intrinsics calls? That way it will always work. That would be something I would like to play with.

maddanio avatar Jan 20 '16 00:01 maddanio

Oh, that's what you meant, so, yeah, would like to look into it, if I find time...

maddanio avatar Jan 20 '16 00:01 maddanio

The trouble with intrinsics is that they're tied to the architecture, so it seems messy to get something that works in general. Although, looking at https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html, they may be generic enough that all you need to specify is the width?
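
For illustration, here is a minimal sketch of what the GCC vector extensions from that page look like in use. It is not Blitz++ code: the `vec4d` type and the `fused_add_mul` helper are hypothetical names, and the 32-byte width is just an assumed choice. The only thing specified is the vector width; the compiler picks the instructions (or splits the operation up on targets with narrower registers).

```cpp
#include <cstddef>
#include <cstring>

// GCC/Clang vector extension: a 32-byte vector, i.e. 4 doubles.
// (vec4d is a hypothetical name chosen for this sketch.)
typedef double vec4d __attribute__((vector_size(32)));

// Hypothetical helper computing dst[i] = a[i] + b[i] * c[i].
void fused_add_mul(double* dst, const double* a, const double* b,
                   const double* c, std::size_t n) {
    const std::size_t width = sizeof(vec4d) / sizeof(double);
    std::size_t i = 0;
    // Main loop: one full vector per iteration.
    for (; i + width <= n; i += width) {
        vec4d va, vb, vc;
        // memcpy serves as an alignment-agnostic load.
        std::memcpy(&va, a + i, sizeof va);
        std::memcpy(&vb, b + i, sizeof vb);
        std::memcpy(&vc, c + i, sizeof vc);
        vec4d r = va + vb * vc;  // element-wise arithmetic on whole vectors
        std::memcpy(dst + i, &r, sizeof r);
    }
    // Scalar tail for the remaining elements.
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i] * c[i];
    }
}
```

The arithmetic itself contains nothing architecture-specific, which is the appeal compared with raw intrinsics; the cost is that the code only builds with compilers that support this extension.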

lutorm avatar Jan 20 '16 00:01 lutorm

I think it should be possible to write specializations for the most important operations and architectures.
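
A rough sketch of how such specializations could be organized, assuming a hypothetical `add_kernel` template that is not part of Blitz++. The generic version is a plain loop, and a specialization for `double` on SSE2 uses intrinsics from `<emmintrin.h>`; the tag types and the `add_arrays` wrapper are also made up for this example.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical architecture tags for this sketch.
struct scalar_tag {};
struct sse2_tag {};

// Generic fallback: a plain loop for any element type and architecture.
template <typename T, typename Arch>
struct add_kernel {
    static void run(T* dst, const T* a, const T* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) dst[i] = a[i] + b[i];
    }
};

#if defined(__SSE2__)
// Specialization for double on SSE2: two doubles per instruction.
template <>
struct add_kernel<double, sse2_tag> {
    static void run(double* dst, const double* a, const double* b,
                    std::size_t n) {
        std::size_t i = 0;
        for (; i + 2 <= n; i += 2) {
            __m128d va = _mm_loadu_pd(a + i);  // unaligned loads
            __m128d vb = _mm_loadu_pd(b + i);
            _mm_storeu_pd(dst + i, _mm_add_pd(va, vb));
        }
        for (; i < n; ++i) dst[i] = a[i] + b[i];  // scalar tail
    }
};
typedef sse2_tag native_tag;
#else
typedef scalar_tag native_tag;
#endif

// Hypothetical entry point: picks the kernel for the build target.
template <typename T>
void add_arrays(T* dst, const T* a, const T* b, std::size_t n) {
    add_kernel<T, native_tag>::run(dst, a, b, n);
}
```

Types without a specialization (here `float`) fall through to the generic loop, so coverage can grow one operation and one architecture at a time.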

maddanio avatar Jan 20 '16 10:01 maddanio

A likely relevant discussion of an attempt to replace every "#pragma ivdep" (intended for the Intel C++ compiler) with the corresponding "#pragma GCC ivdep" in Blitz++: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60267
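
For reference, a minimal sketch of the GCC form of the pragma on a standalone loop. The `axpy` function is a hypothetical example, not taken from Blitz++ or from the bug report. The pragma asserts that the loop has no loop-carried dependencies, so the caller is responsible for making sure `x` and `y` do not overlap.

```cpp
#include <cstddef>

// Hypothetical kernel: y[i] += alpha * x[i].
// "#pragma GCC ivdep" tells GCC to assume there are no loop-carried
// dependencies (here: that x and y do not alias), which removes one
// obstacle to auto-vectorization. It does not force vectorization.
void axpy(double* y, const double* x, double alpha, std::size_t n) {
#pragma GCC ivdep
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
```

Whether the loop actually gets vectorized still depends on the optimization flags (for example `-O3` or `-ftree-vectorize`); `-fopt-info-vec` makes GCC report which loops it vectorized.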

slayoo avatar Jan 23 '16 20:01 slayoo