Chris Elrod

840 comments by Chris Elrod

I got:

```julia
julia> test()
...
get_tmp on Vector{Dual} 0
...
get_tmp2 on Vector{Dual} 1168
...
my_get_tmp on Vector{Dual} 0
```

`get_tmp2`, which calls `ArrayInterface.restructure(::Array, ::ReinterpretArray)`, calls `convert(Array, ::ReinterpretArray)`, which...
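For context, here is a minimal sketch of why a `convert(Array, ::ReinterpretArray)` call would show up as allocations. It is not the code from the thread: plain `Float64`/`UInt64` element types stand in for the `Dual` types as an assumption, and the variable names are illustrative only. A `reinterpret` result is a lazy wrapper over the original memory, while `convert(Array, ...)` materializes a fresh `Array`.

```julia
# Illustrative stand-in types: Float64 and UInt64 instead of Dual numbers.
x = collect(1.0:4.0)            # Vector{Float64}
r = reinterpret(UInt64, x)      # lazy ReinterpretArray over x's memory, no copy
a = convert(Array, r)           # materializes a new Vector{UInt64} (a copy)

a[1] = zero(UInt64)             # writes only to the copy
copy_is_independent = (x[1] == 1.0)

r[2] = zero(UInt64)             # writes through the wrapper into x
view_aliases_x = (x[2] == 0.0)
```

Here `copy_is_independent` and `view_aliases_x` both come out `true`: the converted array no longer shares memory with `x`, whereas the reinterpreted wrapper still does.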

> Yes we know. But is it possible to solve? Don't make a copy? Why the need for convert? What's wrong with the reinterpret array? Where is restructure being called?...

> Removed the `map!` loop since it seems LLVM decided not to unroll the loop even though it perhaps was advantageous to do so. Could you try un-reverting that change...

Except that there is a compiler issue currently where the conversions are not no-ops, but actually involve separate (stack) allocations and a fully unrolled load/store between them.

I don't know if it makes any guarantees about sign, but it is correct with respect to `Q` being orthogonal, and `Q*R` approximately equaling the original matrix.

```julia
julia> m4b...
```

We can do much better at many different sizes! I benchmarked padded statically sized arrays vs [StaticArrays.jl](https://discourse.julialang.org/t/we-can-write-an-optimized-blas-library-in-pure-julia-please-skip-op-and-jump-to-post-4/11634/4?u=elrod) here. Padding allowed applying optimized matrix multiplication kernels, which are much faster....

IIRC, I actually started that Julia session with `--math-mode=fast`, to let StaticArrays use `muladd`. The benefit in my examples was due to better vectorization. LLVM often did a good job...

Here are three fairly simple possibilities. The first two are more or less identical, except one wraps items in VecElements. The third unsuccessfully tries to encourage the compiler to use...

For 7x7 matrices, the padded version requires 1.14 times as much memory (8 rows / 7 rows), but multiplying two of them is 7 times faster (61 ns / 8.6 ns). The...
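As a quick check of the two ratios quoted in that comment, using only the figures it gives:

$$
\frac{8\ \text{rows}}{7\ \text{rows}} \approx 1.14, \qquad \frac{61\ \text{ns}}{8.6\ \text{ns}} \approx 7.1
$$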

> The memory layout of Vector{SVector{7}} is not independent of the padding which goes in a SVector{7}. If our type SVector{7} has padding, that padding will end up in the...