Chris Elrod

Results 837 comments of Chris Elrod

Ah, yeah, I was lazy in the implementation and just check if it works for `Vec{2,Int}`, not the type it will actually be used with. https://github.com/JuliaSIMD/LoopVectorization.jl/blob/cdd9ef1c0bb42a25e11b770d4a85c1324d9d2b7a/src/condense_loopset.jl#L997 Most of the functions...

You could get it to work with some manual effort. Or, you can make a PR to add explicit `StructArrays` support by wrapping `vmap`. Basically, you'd decompose the `StructArray` into...

It's not about the hardware, but that LoopVectorization.jl's implementation is bad/limited. This is why LoopModels would fix this.

That's its pipelined division method, which mixes divisions with fast inversions, multiplies, and Newton-Raphson. It's the latter that produces NaN.
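
For illustration, here is a minimal sketch of how a Newton-Raphson reciprocal refinement can turn an otherwise well-defined result into NaN. It is not the code path discussed in the comment: the function name `refine_reciprocal`, the starting guesses, and the choice of inputs are all assumptions, and the comment does not say which inputs triggered the NaN. The sketch only shows the general mechanism, in which the update `x * (2 - d * x)` evaluates `inf * 0` at the edge cases.

```python
import math


def refine_reciprocal(d, x0, steps=1):
    """Refine an approximate reciprocal x0 ~ 1/d with Newton-Raphson.

    Hypothetical helper for illustration only. Each step applies
    x <- x * (2 - d * x), which roughly squares the relative error.
    """
    x = x0
    for _ in range(steps):
        x = x * (2.0 - d * x)
    return x


# Ordinary input: a crude guess converges quickly toward 1/3.
good = refine_reciprocal(3.0, 0.33, steps=3)

# Edge case: 1/inf is exactly 0.0, but the refinement step computes
# inf * 0.0, which is NaN, and the NaN then propagates into the result.
bad_inf = refine_reciprocal(math.inf, 0.0)

# Edge case: the reciprocal of 0.0 is inf, and 0.0 * inf is NaN as well.
bad_zero = refine_reciprocal(0.0, math.inf)
```

A plain division returns 0.0 for `1.0 / math.inf`, so in this sketch the NaN comes from the refinement step and not from the division itself.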

> For a Navier-Stokes solve, this dispatches to Octavian.matmul! via https://github.com/JuliaSIMD/StrideArrays.jl/blob/d31fb6cf28b0374162202da8c81694f2b704e0f3/src/blas.jl#L6. Feel free to PR it to drop Octavian as a dependency. > For now, I'll just focus on the...

In my opinion, when you're comparing codes with different amounts of allocations, `mean` is the most appropriate. GC produces a skewed distribution, and `median` misses those heavy tails. If micro-optimizing...
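
A small sketch of the point about skew, using made-up timings (the numbers and variable names are assumptions chosen for illustration, not measurements). One variant never allocates; the other is slightly faster on most runs but occasionally pays for a long GC pause. The median ranks the allocating variant as faster, while the mean, which reflects total time spent, ranks it as slower.

```python
from statistics import mean, median

# Hypothetical per-run timings in nanoseconds.
# Non-allocating variant: a steady 1000 ns on every run.
no_alloc = [1000] * 100

# Allocating variant: 900 ns on most runs, but 5 of every 100 runs
# absorb a 20000 ns garbage-collection pause (the heavy tail).
alloc = [900] * 95 + [20000] * 5

median_no_alloc = median(no_alloc)
median_alloc = median(alloc)
mean_no_alloc = mean(no_alloc)
mean_alloc = mean(alloc)

print("median:", median_no_alloc, "vs", median_alloc)
print("mean:  ", mean_no_alloc, "vs", mean_alloc)
```

Because the pauses hit only a minority of runs, they never reach the middle of the sorted sample, so the median does not see them at all.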

> Yes, I accidentally used median, not mean. I meant to copy your idea as is. And I'm editing this blind. I've never made a macro or changed (or one...

A short summary: 1. `minimum` assumes all noise is positive. Wrong when you incur costs that periodically show up as time, as then you can have negative noise as in...

I do support adding `@bmean`. No need to change how any existing stuff works. > BenchmarkTools itself runs the GC explicitly (see gcscrub) No idea how successful my solution...

This works for me on niri.