LoopVectorization.jl icon indicating copy to clipboard operation
LoopVectorization.jl copied to clipboard

Some bugs

Open baggepinnen opened this issue 5 years ago • 2 comments

I was playing around a bit and noticed some bugs. The first two snippets are attempts at MWE of the final snippet which is the desired calculation

function simulate(x0::Float64; T = 10.0, dt = 0.0001, vt = 1.0)
    times = 0.0:dt:T
    positions = zeros(length(times))
    v = 0.0
    a = 0.0
    x = x0
    @avx for ii in eachindex(times)
        x = x + v * dt
        positions[ii] = x/x0
    end
    times, positions
end

simulate(10.0) # UndefVarError, but runs without @avx
function simulate(x0::Float64; T = 10.0, dt = 0.0001, vt = 1.0)
    times = 0.0:dt:T
    positions = zeros(length(times))
    v = 0.0
    a = 0.0
    x = x0
    @avx for ii in eachindex(times)
        t = times[ii]
        x = x + v * dt
        positions[ii] = x/x0
    end
    times, positions
end

simulate(10.0) # StackOverflowError

The final and desired code is the following

@inline friction(v::Float64, vt::Float64) =  v > vt ? -3v : -3vt*sign(v)
function simulate(x0::Float64; T = 10.0, dt = 0.0001, vt = 1.0)
    times = 0.0:dt:T
    positions = zeros(length(times))
    v = 0.0
    a = 0.0
    x = x0
    @avx for ii in eachindex(times)
        t = times[ii]
        a = friction(v, vt) - 100.0*x
        a = - 100.0*x
        v = v + a * dt
        x = x + v * dt
        positions[ii] = x/x0
    end
    times, positions
end

simulate(10.0)

Original code from discourse thread

baggepinnen avatar Jan 11 '20 10:01 baggepinnen

I'm away from home for the weekend, so I can't promise that I'll be able to fix this until I get back. You have t = times[ii], but I don't see where you're using t?

TODO items:

  • [ ] Add support for if/else.
  • [ ] Add support for indexing ranges (define overloads for stridedpointer).

Both of these are currently missing. I'll add them, because they ought to be supported.

However, it isn't really possible to vectorize your examples without taking advantage of some sort of identities. That is, each loop iteration depends on the results of the previous iteration, and you can't re-associate like you can for dot products. For that reason, it cannot calculate multiple loop iterations at the same time using SIMD instructions and still get the correct answer. They have to be computed one after the other here, unless there is a simple way to calculate momentum and position that is independent of the previous position/momentum. Or perhaps only a function of position/momentum W+ loop iterations ago, where W is the SIMD vector width / parallel window size.

If you wanted to do multiple simulations at the same time, because the separate simulations don't depend on each other, you could calculate them simultaneously using SIMD so long as your times vectors are the same length across simulations.

One final point is that your friction function should be written generically enough to support SVec{W,Float64} inputs, eg:

@inline friction(v, vt) =  ifelse(v > vt, -3v, -3vt*sign(v))

friction itself would also currently be treated as an opaque function. I'll add a function that lets you "register" it.

chriselrod avatar Jan 11 '20 23:01 chriselrod

Thanks! My bad, was a bit to eager to see if I could speed it up. Perhaps an error message when loop iterations cannot be reordered and it's easy to detect?

It indeed seems that t is unused.

baggepinnen avatar Jan 11 '20 23:01 baggepinnen