Chris Elrod

Results 832 comments of Chris Elrod

> so I think `Expronicon` is only useful if you will need to manipulate `Expr`, but since you are using `generated function` then prob it is not the case then....

The primary reason it doesn't add `@simd` is that `@simd` must be applied to the innermost loop. It's easy to just add `@inbounds @fastmath`, but `@simd [ivdep]` would require parsing...
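To illustrate the placement issue, here is a minimal sketch in plain Julia (not the package's generated code). The function name `colsums!` is hypothetical. `@inbounds @fastmath` can simply be prepended to the outermost loop, whereas `@simd` has to annotate the innermost `for` directly, which is why adding it automatically means locating that loop inside the expression first.

```julia
# Hypothetical example: column sums of a matrix.
# `@inbounds @fastmath` wraps the whole loop nest from the outside;
# `@simd` has to sit directly on the innermost loop.
function colsums!(out::Vector{Float64}, A::Matrix{Float64})
    @inbounds @fastmath for j in axes(A, 2)
        s = 0.0
        @simd for i in axes(A, 1)   # innermost loop: the only valid spot for @simd
            s += A[i, j]
        end
        out[j] = s
    end
    return out
end

result = colsums!(zeros(3), ones(4, 3))
```

Because `@fastmath` and `@simd` both allow the reduction to be reassociated, results on general floating-point input can differ in the last bits from a strictly sequential sum.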

`ivdep`'s non-aliasing guarantees are unfortunately extremely strong. The reason we get the wrong answer above is because `BitArray` elements alias themselves: the individual bits of the `BitArray` are part of...
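As a small illustration of why neighbouring `BitArray` elements cannot be treated as independent memory locations, the sketch below inspects the packed storage. It relies on `chunks`, an internal field of `BitArray` that is not part of the public API, so treat it as an assumption about the current implementation rather than something to depend on.

```julia
# `chunks` is an internal field of BitArray, used here only for illustration.
B = falses(128)
nchunks = length(B.chunks)   # 128 bits are packed into 64-bit words

B[1] = true
B[2] = true
first_chunk = B.chunks[1]    # both writes modified the same UInt64 word
second_chunk = B.chunks[2]   # untouched
```

Writing `B[1]` and `B[2]` are read-modify-write operations on the same word, so iterations that store to different indices can still depend on one another.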

But a motivation for `ivdep` is that it really does help a lot of the time, e.g. my `Octavian` PR defines a `PointerMatrix` type that needs `@simd ivdep` to SIMD:...

I [added benchmarks](https://chriselrod.github.io/LoopVectorization.jl/latest/examples/filtering/) over a size range of 2:256 for `out` using double precision. The unrolled `avx` function was just like `avx2dunrolled!` above, meaning I did not add a `tile=`...

I made a couple changes and updated the benchmarks. I'm now using [a hack](https://github.com/chriselrod/VectorizationBase.jl/blob/master/src/vectorizable.jl#L444) to mark the arrays as not aliasing one another. This works by defining an LLVM function...

Thankfully, I found a way to remove the hack this morning. I am rerunning all the benchmarks. (When they're done, I'll issue new releases of VectorizationBase followed by SIMDPirates, and...

[Latest filtering benchmarks](https://chriselrod.github.io/LoopVectorization.jl/latest/examples/filtering/): LoopVectorization is now as fast with a dynamically sized kernel matrix as the others are with statically sized. It's much faster when it also has a statically...

LoopVectorization now achieves roughly 45-50 double precision GFLOPS on my home computer with dynamically sized kernels over the size range of about N = 40 x 40 through 220 x...

A lot of universities are abruptly changing to online-only classes, which must be quite a transition.

> This is unbelievably impressive!! I have to start playing with this pronto...which means...