LoopVectorization.jl
Preventing `StackOverflowError` automatically with a `@safe_turbo`?
Hey all,
I am eager to switch from @inbounds @simd to @turbo in the evaluation loops of SymbolicRegression.jl, which is the backend for PySR. You can see my initial pull request here: https://github.com/MilesCranmer/SymbolicRegression.jl/pull/132 which shows the loops I am attempting to speed up.
The way my package works is that the user can pass any binary or unary operator (e.g., +, -, *, /, cos, exp, or any function they define). These operators will be arranged into expressions by a genetic algorithm until a combination is found that matches a relationship in a dataset. The evaluation of each expression needs to be high-performance, and is performed by a loop over an array with the @inbounds @simd macros. This loop is behind a function barrier so that the performance is the same as if I had hard-coded each operator the user passed.
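A minimal sketch of the function-barrier pattern described above, using only Base Julia. The name `apply_binary!` is hypothetical and not taken from SymbolicRegression.jl; the point is only that passing `op` as an argument with its own type parameter lets Julia compile a specialized loop for each operator.

```julia
# Hypothetical kernel: because `op::F` is a type parameter, Julia compiles a
# separate specialization of this method for every operator passed in.
function apply_binary!(out::AbstractVector{T}, op::F,
                       x::AbstractVector{T}, y::AbstractVector{T}) where {T,F}
    # eachindex with several arrays throws if their indices do not match,
    # which makes the @inbounds below safe.
    @inbounds @simd for i in eachindex(out, x, y)
        out[i] = op(x[i], y[i])
    end
    return out
end

x = [1.0, 2.0, 3.0]
y = [1.0, 1.0, 1.0]
sums = apply_binary!(similar(x), +, x, y)      # specialized for `+`
prods = apply_binary!(similar(x), *, x, y)     # specialized for `*`
```

Whether `@simd` actually vectorizes a given `op` depends on the operator and the compiler; the barrier only guarantees that the call is resolved statically.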
I tried to switch to @turbo; however, I ended up seeing StackOverflowError when testing SpecialFunctions.gamma as an operator. From my look through these issues: https://github.com/JuliaSIMD/LoopVectorization.jl/issues/233, https://github.com/JuliaSIMD/LoopVectorization.jl/issues/232, it seems like certain functions will raise this error if they do not have SIMD implemented.
Now, this is an issue for my package, since the user is allowed (and encouraged) to specify any Julia function they wish as the operator - even complex, branching functions (including any of SpecialFunctions) - and those operators will be used inside these loops. Now, I understand that each operator would need to be implemented as a SIMD operation for @turbo to give a speedup, but I would still like to get the speed for other more standard operators, like +, -, *, /, etc.
I am therefore wondering if it is possible to change @turbo to be robust against StackOverflowErrors, and fall back to non-SIMD operations? Or, maybe a @safe_turbo macro could be implemented that does this? (This assumes I do not know what code the user would use inside each loop at runtime.)
Thanks! Miles
A likely culprit is the assumption that constructors/conversions return an object of the given type, see https://github.com/JuliaLang/julia/issues/42372
Adding a safe key word arg to @turbo or a @safe_turbo option wouldn't be difficult.
If you want to try:
- Add it to the kwargs here: https://github.com/JuliaSIMD/LoopVectorization.jl/blob/88dfe72e29a7308d551f894b8cf8cc2ceaf21404/src/constructors.jl#L96-L235 so that `process_args` will recognize it and set it as `true` if detected.
- Forward the option to `setup_call`.
- Have `setup_call` forward it to `check_args_call`: https://github.com/JuliaSIMD/LoopVectorization.jl/blob/88dfe72e29a7308d551f894b8cf8cc2ceaf21404/src/condense_loopset.jl#L989 Or write a separate function, and do `&&` in the `if` statement there if `safe=true`.
- The new function/check should iterate over instructions, and chain `&&` on `ArrayInterface.can_avx` for these functions. Basically, iterate over `op in operations(ls)`, `iscompute(op) || continue`, and then something like:
c = callexpr(op.instruction)
pushfirst!(c.args, ArrayInterface.can_avx)
Chain/intersect all of these in the if, and it'll run the fallback @inbounds @fastmath loop instead of the @turbo loop if any of these can_avx are false. For reference, can_avx:
julia> using ArrayInterface, LoopVectorization, SpecialFunctions
julia> ArrayInterface.can_avx(+)
true
julia> ArrayInterface.can_avx(exp)
true
julia> ArrayInterface.can_avx(gamma)
false
julia> ArrayInterface.can_avx(beta)
false
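The "chain `&&` on `can_avx`" idea can be sketched without any packages. Everything here is an illustrative assumption: `can_vectorize` is a hypothetical stand-in for `ArrayInterface.can_avx`, `chain_checks` and `clamp_positive` are made-up names, and the real change would operate on LoopVectorization's internal operations rather than on a list of symbols.

```julia
# Hypothetical stand-in for ArrayInterface.can_avx: false unless opted in.
can_vectorize(::Any) = false
can_vectorize(::typeof(+)) = true
can_vectorize(::typeof(*)) = true
can_vectorize(::typeof(exp)) = true

# Build the expression `true && can_vectorize(f1) && can_vectorize(f2) && ...`
# from the names of the functions used in a loop body.
function chain_checks(fnames)
    checks = [Expr(:call, :can_vectorize, f) for f in fnames]
    return foldl((acc, c) -> Expr(:&&, acc, c), checks; init = true)
end

# A branching scalar function the stand-in trait knows nothing about.
clamp_positive(x) = x > 0 ? x : zero(x)

guard_ok = chain_checks([:+, :exp])               # all functions opted in
guard_bad = chain_checks([:+, :clamp_positive])   # one function did not

take_fast_path = eval(guard_ok)    # would select the vectorized loop
take_fallback = !eval(guard_bad)   # would select the plain fallback loop
```

In a macro, the generated guard would be spliced into the `if` condition that chooses between the two loops, so the check costs nothing once the function types are known.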
Some special functions are also written in Julia, so you may be able to get SIMD compatible versions.
You can look at gcd if you want an idea of how to handle while loops, for example: https://github.com/JuliaSIMD/VectorizationBase.jl/blob/46bce4794e71bf06364729e61bffbadc7f48a666/src/special/misc.jl#L160-L184
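As a rough illustration of the general idea behind vectorizing a `while` loop (this is not the linked VectorizationBase code), the sketch below runs Euclid's algorithm on several "lanes" at once, with a plain tuple standing in for a SIMD vector. The name `lanewise_gcd` is hypothetical. The loop continues while any lane is still active, and finished lanes are held fixed with `ifelse` instead of branching per lane.

```julia
# Hypothetical lane-wise gcd: each tuple element plays the role of one SIMD lane.
function lanewise_gcd(a::NTuple{N,Int}, b::NTuple{N,Int}) where {N}
    a = map(abs, a)
    b = map(abs, b)
    while any(!iszero, b)                 # keep going until every lane is done
        active = map(!iszero, b)          # mask of lanes that still need work
        # Inactive lanes divide by 1 instead of 0, so their remainder is 0.
        r = map((x, y, m) -> rem(x, ifelse(m, y, one(y))), a, b, active)
        # Active lanes take the Euclid step (a, b) -> (b, a mod b);
        # inactive lanes keep their finished result in `a`.
        a = map((x, y, m) -> ifelse(m, y, x), a, b, active)
        b = r
    end
    return a
end

result = lanewise_gcd((12, 18, 7, 0), (8, 24, 0, 5))
```

A real SIMD version would use vector and mask types instead of tuples, but the control flow (one shared loop condition plus masked updates) is the same shape.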
Perhaps we should do this automatically instead, with Base.promote_op(f, somesimdtypes...) !== Union{}
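A small sketch of what such a `promote_op` check might look like, using only Base. `ToyVec` is a hypothetical stand-in for a real SIMD vector type, and `double`, `scalar_only`, and `maybe_callable` are made-up names. The check relies on type inference, which the Julia documentation describes as fragile, so treat this as an experiment rather than a guaranteed test; in particular, it is an open question whether it would catch the infinite-recursion case from the original report.

```julia
# Hypothetical stand-in for a SIMD vector type.
struct ToyVec
    data::NTuple{4,Float64}
end

# A function that has a method for the vector type...
double(x::Float64) = 2x
double(v::ToyVec) = ToyVec(map(x -> 2x, v.data))

# ...and one that only accepts scalars.
scalar_only(x::Float64) = x + 1.0

# Inference reports Union{} when no applicable method exists for the
# given argument types, i.e. when the call could only throw.
maybe_callable(f, types...) = Base.promote_op(f, types...) !== Union{}

vec_ok = maybe_callable(double, ToyVec)
vec_missing = maybe_callable(scalar_only, ToyVec)
```

The appeal of this approach is that it needs no opt-in trait per function; the cost is depending on inference results, which can change between Julia versions.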
Thanks, adding a kwarg and using can_avx sounds like a good option! Will try it out.
Made a PR over in #431