LoopVectorization.jl
Macro(s) for vectorizing loops.
Can I use this package with a while loop?
First of all, great package! I was trying to implement a slight alteration to the matrix multiplication example from the `Readme.md`: ```julia function custom_gemm!(C::Matrix{T}, A::Matrix{T}, B::Matrix{T}, b::Vector{T}) where { T...
The `mul_trace` function gives an uninformative error when called with `Float32` matrices. ``` mul_trace: Error During Test at /home/runner/work/ReactiveMP.jl/ReactiveMP.jl/test/algebra/test_helpers.jl:78 Test threw exception Expression: ReactiveMP.mul_trace(A, B) ≈ tr(A * B) UndefVarError: ####op#279__0...
This is a fantastic package. I've discovered that it, like Julia in general, comes with a hidden cost, "time to first plot." My [package](https://github.com/droodman/WildBootTests.jl) currently uses @tturbo in half a...
to reproduce ```julia function simpile_tm(A) A_dag = permutedims(A, (2, 1, 3)) O = reshape(Consts.O, ntuple(_->2, 8)) lhs = ntuple(_->size(A, 1), 4) rhs = ntuple(_->size(A, 2), 4) shape = (lhs..., rhs...,...
As far as I understand, LoopVectorization parses a Julia `Expr` into an internal IR, the `LoopSet` (or some of the other types in https://github.com/JuliaSIMD/LoopVectorization.jl/blob/f2e60386486074140c1eeb4fe5a87b2b330997b8/src/modeling/graphs.jl#L1). I'm currently trying to simplify...
I'm working on a packed array type, where multiple values are packed in a single byte like `BitArray`, but each element can be two or four bits. Each element is...
This MWE ```julia using LoopVectorization A = [iz*1e2 + iy*1e1 + ix for ix=1:7, iy=1:5, iz=1:6]; sendbuf = zeros(size(A,2), size(A,3)); ix, iy, iz = 2, 1:size(A,2), 1:size(A,3); dst = view(sendbuf,:);...
I am currently running Julia with Threads.nthreads()=10; however, @tturbo does not use more than one thread in the code below. The code gets a nice speedup using @turbo, however...
LV has been awesome for effortlessly speeding up code! Thanks! I've recently run into some errors I don't know enough about to fix, reproduced with errors below with version v0.12.96....