Chris Elrod

Results 832 comments of Chris Elrod

On the local versions (I plan to push by the end of the day):

```julia
julia> using LoopVectorization, ForwardDiff, BenchmarkTools

julia> using ForwardDiff: Dual, Partials

julia> using LoopVectorization: SVec

julia>...
```

> And, here's what happens with `inv` on my computer (an i7-8700), `@code_native` differs at length 8 but not length 4:

Interesting. I see the same thing you do: ```julia...

It's insidious! It's one of the first things I look for when I run into surprising amounts of allocations.

> And thanks, deleting that unused `where Z` fixes this example...

Okay. You're welcome to make PRs there yourself. It may be a couple months before I work on it myself.

In terms of data layout, I think it is likely that a "struct of arrays" representation would work best. Preferably, rather than creating three separate arrays, the last axis of...
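To make the layout distinction concrete, here is a minimal sketch in plain Julia (no packages). The `RGBf` type and the `to_soa` helper are hypothetical names introduced only for illustration, and putting the channel on the last axis of a single matrix is just one reading of the truncated sentence above, not necessarily the layout that was proposed.

```julia
# Array-of-structs (AoS): the r, g, b values of one pixel sit next to each other.
# `RGBf` is a hypothetical stand-in type, not taken from any package.
struct RGBf
    r::Float32
    g::Float32
    b::Float32
end

# Struct-of-arrays (SoA) held in ONE array: rows index pixels, the last
# axis indexes the channel. Julia arrays are column-major, so each channel
# occupies a contiguous block of memory.
function to_soa(pixels::Vector{RGBf})
    soa = Matrix{Float32}(undef, length(pixels), 3)
    for (i, p) in enumerate(pixels)
        soa[i, 1] = p.r
        soa[i, 2] = p.g
        soa[i, 3] = p.b
    end
    return soa
end

aos = [RGBf(i, 2i, 3i) for i in 1:4]
soa = to_soa(aos)

# A whole channel is now a contiguous view, which is convenient for SIMD loops.
reds = @view soa[:, 1]
```

With this layout a loop over `reds` reads consecutive `Float32` values, whereas the same loop over `aos` would step across the interleaved `g` and `b` fields.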

> Interesting. We got away from this layout because I had the impression it would be better to have all the values localized, but it would be easy to get...

> This prompts me to realize the above should probably be 4 layers of nesting: we probably need something to say to the compiler that dimension 3 of `rgbchannels` has...

> Instead of specializing for RGB colors, how about a general interface for any `isbits` type convertible to `NTuple{N, Union{Float32, Float64}}`? That covers a lot of use cases (complex numbers,...
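One way to picture the suggestion in the quote is with Base's `reinterpret(reshape, T, A)`, available since Julia 1.6. The helper name `as_components` below is hypothetical, and this is only a sketch of the idea under that assumption, not an interface that LoopVectorization provides.

```julia
# Hypothetical helper: view an array of isbits elements as an array of its
# floating-point components, with the components on a new leading axis.
function as_components(::Type{F}, A::AbstractArray{T}) where {F<:Union{Float32,Float64},T}
    isbitstype(T) || throw(ArgumentError("element type must be isbits"))
    sizeof(T) % sizeof(F) == 0 || throw(ArgumentError("element size is not a multiple of sizeof($F)"))
    # `reinterpret(reshape, ...)` is in Base (Julia 1.6+); no data is copied.
    return reinterpret(reshape, F, A)
end

z = [1.0 + 2.0im, 3.0 + 4.0im, 5.0 + 6.0im]
parts = as_components(Float64, z)   # 2×3: row 1 holds real parts, row 2 imaginary parts
```

Because the result is a view of the original memory, writing to `parts` also updates `z`.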

I demoed rewriting the IR into loops performing a reverse pass last year. I think reviving that effort and adding support for some AD system would be another great use...

> I'm wondering, would you consider generating the loop IR from inferred Julia SSA IR? This would give you type information to let you handle composite types. LoopVectorization already has...