Chris Elrod

Results 832 comments of Chris Elrod

> That's actually what `@gflops` does: it first counts all ops, then benchmarks using `BenchmarkTools`. But you're right, in your case it would make much more sense to count ops...

> Can i use this package with while loop?

No. If it isn't possible to transform the while loop into a for loop, then the loop is probably too complicated...
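As a rough illustration of the kind of rewrite meant here, the sketch below uses plain Julia (no `@avx`) and made-up function names. It assumes a `while` loop that only advances a counter by a fixed step, so the trip count is known before the loop starts and the loop can be written as a `for` loop over a range.

```julia
# Hypothetical example: sum the elements of x with an explicit counter.
function sum_while(x)
    s = zero(eltype(x))
    i = firstindex(x)
    while i <= lastindex(x)
        s += x[i]
        i += 1          # fixed step, so the iteration space is known up front
    end
    return s
end

# The same computation written as a `for` loop over the index range.
function sum_for(x)
    s = zero(eltype(x))
    for i in eachindex(x)
        s += x[i]
    end
    return s
end
```

A `while` loop whose exit condition depends on values computed inside the body (for example, iterating until convergence) has no such fixed range, which is the "too complicated" case.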

It ran into problems because of the expression `v = a * (1 - t * t)`. Neither `a` nor `t` depends on any of the loops. This should have...
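To make "depends on none of the loops" concrete, here is a minimal sketch in plain Julia with hypothetical function names. Because `a` and `t` never change inside the loop, `v` has the same value on every iteration and can be computed once before the loop. Whether hoisting it by hand avoids the problem described above is an assumption, not something the comment states.

```julia
# Hypothetical example: `v` is recomputed on every iteration
# even though `a` and `t` do not depend on the loop index.
function scale_inloop(x, a, t)
    s = zero(eltype(x))
    for i in eachindex(x)
        v = a * (1 - t * t)
        s += v * x[i]
    end
    return s
end

# Same result with the loop-invariant expression hoisted out of the loop.
function scale_hoisted(x, a, t)
    v = a * (1 - t * t)   # computed once
    s = zero(eltype(x))
    for i in eachindex(x)
        s += v * x[i]
    end
    return s
end
```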

I can reproduce the crash. But for the inner loop version, `@avx` is >15 times faster for me than the non-avx version.

I found out what is causing the crash. I'll try and fix that soon.

> I've heard that this can happen for `fma`

That was exactly the problem. This should...

> does not appear to solve the problem on my slow CPU.

Okay, checking:

```julia
julia> sx = LoopVectorization.SVec(ntuple(Val(4)) do i
           Core.VecElement(rand())
       end)
SVec{4,Float64}

julia> @code_llvm debuginfo=:none tanh(sx)
```

I...

Great! `tanh_fast` should be the same across Julia versions on the same computer. The definition is simply:

```julia
@inline function tanh_fast(x)
    exp2x = exp(x + x)
    (exp2x - one(x)) /...
```
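The quoted definition is cut off, and the following is not a reconstruction of it. It is only a sketch of the standard identity tanh(x) = (e^(2x) - 1) / (e^(2x) + 1), which also starts from `exp(x + x)`. The name `tanh_via_exp` is hypothetical.

```julia
# Hypothetical helper illustrating the textbook identity;
# not the package's `tanh_fast`.
function tanh_via_exp(x)
    e2x = exp(x + x)                        # e^(2x)
    return (e2x - one(x)) / (e2x + one(x))
end
```

One caveat of this plain form: for large positive `x`, `exp(x + x)` overflows to `Inf` and the quotient becomes `NaN`, so it is only suitable where the inputs are known to stay in a moderate range.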

Crash [fixed and tested](https://github.com/chriselrod/LoopVectorization.jl/blob/a2e40aa7b8ac0aac04bd777f4a942a070113a004/test/miscellaneous.jl#L695).

The `splitintonoloop` test doesn't work:

```julia
function splitintonoloop(U = randn(2,2), E1 = randn(2))
    t = 1
    a = 1.0
    _s = 0.0
    n, k = size(U)
    @avx for j =...
```

Simpler example:

```julia
function splitintonoloop(E1, n)
    t = 0.5
    a = 1.0
    _s = 0.0
    k = length(E1);
    @avx for j = 1:k
        for i = 1:n
            v = a...
```