Chris Elrod

Results: 843 comments by Chris Elrod

Perhaps we should do this automatically instead, with `Base.promote_op(f, somesimdtypes...) !== Union{}`
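
A minimal sketch of what such an automatic check could look like. The helper name `supports` and the `ToyVec` type are hypothetical stand-ins, not part of LoopVectorization.jl or VectorizationBase.jl. `Base.promote_op` relies on type inference, so its result can be over-approximated and should be read as a heuristic rather than a guarantee.

```julia
# Hypothetical helper (not part of LoopVectorization.jl): treat `f` as
# applicable to the given argument types when inference does not conclude
# that the call always throws.
supports(f, argtypes::Type...) = Base.promote_op(f, argtypes...) !== Union{}

# Toy stand-in for a SIMD vector type, used only for this illustration.
struct ToyVec{T}
    data::NTuple{4,T}
end
Base.:+(a::ToyVec{T}, b::ToyVec{T}) where {T} = ToyVec{T}(map(+, a.data, b.data))

has_add = supports(+, ToyVec{Float64}, ToyVec{Float64})  # a `+` method exists
has_sin = supports(sin, ToyVec{Float64})                 # no `sin` method is defined
```

Here `has_add` should come out `true` and `has_sin` `false`, since a call with no matching method is inferred to return `Union{}`.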

The code here shouldn't be that hard to read. You're welcome to try and tackle it.

I may respond with more later, but you can get some ideas for more tricks here: https://github.com/PumasAI/SimpleChains.jl/blob/main/src/forwarddiff_matmul.jl
Long term, the rewrite should "just work" for duals/generic Julia code. However, it...

This may be a good first issue for someone who wants to give this a try.
```julia
julia> using LoopVectorization

julia> function swap_debug(v, w)
           LoopVectorization.@turbo_debug for i in axes(v, 1)...
```

Hmm, I cannot reproduce...
```julia
julia> @benchmark reflectorApply!($x, $τ, $y)
BenchmarkTools.Trial: 10000 samples with 171 evaluations.
 Range (min … max):  630.281 ns … 855.865 ns  ┊ GC (min … max):...
```

Mind showing me the
```julia
@code_typed reflectorApply!(x, τ, y)
```
and
```julia
@code_native debuginfo = :none syntax = :intel reflectorApply!(x, τ, y)
```
```asm
        .text
        .file   "reflectorApply!"
        .globl  "julia_reflectorApply!_4509" #...
```

There's basically only one change between 0.12.118 and 0.12.119: https://github.com/JuliaSIMD/LoopVectorization.jl/compare/v0.12.118...v0.12.119
So if the change were in LoopVectorization.jl itself instead of a dependency, there is only one place I have to...

From your code:
```asm
movabs rax, offset StrideIndex
vzeroupper
call rax
mov r13, qword ptr [r13]
mov rax, qword ptr [rsp + 96]
add rax, r13
mov qword ptr [rsp...
```

What do you get for `] st -m ArrayInterface`?
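
If the Pkg REPL mode is not convenient, the same query can be made through the Pkg API. This is a small sketch and assumes it runs in the environment where the problem shows up.

```julia
using Pkg

# Same query as `] st -m ArrayInterface`: show the version of
# ArrayInterface recorded in the active environment's manifest.
Pkg.status("ArrayInterface"; mode = Pkg.PKGMODE_MANIFEST)
```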

I'm guessing it is because `StrideIndex` is not being inlined.