Chris Elrod
Yeah, I'm still planning on adding support for `rand` through integrating `VectorizedRNG` someday. But, what exactly do you want to do here? That code doesn't work without `@avxt` either.
Do you need it to be along steps of size `1f-3`, or does this work (uniform on the interval)?
```julia
julia> using VectorizedRNG

julia> m2 = zeros(Float32, 4,4)
4×4 Matrix{Float32}:...
```
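To make the distinction in that question concrete, here is a minimal sketch using only Base and the `Random` standard library (not `VectorizedRNG`). The names `m_uniform`, `m_grid`, and `grid`, and the seed, are made up for illustration; the interval `[0, 1]` is an assumption, since the original code is not shown.

```julia
using Random

rng = MersenneTwister(0)          # arbitrary seed, only for reproducibility

# Uniform on the interval: any Float32 in [0, 1) can come out.
m_uniform = rand(rng, Float32, 4, 4)

# Steps of size 1f-3: values are drawn only from the grid 0, 0.001, ..., 1.
# The endpoints 0f0 and 1f0 are assumed here, not taken from the original.
grid = 0f0:1f-3:1f0
m_grid = rand(rng, grid, 4, 4)
```

If the uniform version is good enough, it is the simpler case to vectorize, since it needs no lookup into a discrete set of values.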
I get
```julia
julia> ba = BitsArray{2,2}(undef, 16, 5);

julia> stridedpointer(ba)
ERROR: conversion to pointer not defined for BitsArray{2, 2}
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] unsafe_convert(#unused#::Type{Ptr{UInt8}}, a::BitsArray{2, 2})...
```
It should probably still be `vmap!`, as someone may want a non-threaded temporal version? Otherwise, sounds good to me.
Related issue: https://github.com/chriselrod/LoopVectorization.jl/issues/102

I think that's a reasonable long-term plan: have `@avx` read functions like `map`, `mapreduce`, `dot`, etc. as loops, and let them mix with other such expressions. Currently,...
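As a rough illustration of what "reading `dot` as a loop" could mean, here is a plain-Julia sketch. It uses Base's `@simd` rather than `@avx`, and `dot_as_loop` is a hypothetical name; nothing here is LoopVectorization's actual lowering.

```julia
using LinearAlgebra

# Hypothetical helper: the explicit loop that a call like dot(a, b) stands for.
function dot_as_loop(a::AbstractVector{T}, b::AbstractVector{T}) where {T}
    s = zero(T)
    # eachindex(a, b) throws if the two vectors do not share the same indices
    @inbounds @simd for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

a = Float64[1, 2, 3]
b = Float64[4, 5, 6]
dot_as_loop(a, b) == dot(a, b)    # both compute 1*4 + 2*5 + 3*6
```

Once the call is viewed as a loop like this, it can in principle be fused with neighboring loop expressions instead of being treated as an opaque function call.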
The problem is the `f.s +=`, not the `ifelse`.
```julia
julia> mutable struct foo
           s
       end

julia> f = foo(0f0);

julia> m = rand(Float32, 4,4);

julia> s = 0f0;

julia>...
```
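One common way around accumulating into a struct field inside the loop is to accumulate into a local variable and write the field once afterwards. The sketch below shows that pattern with Base's `@simd`; the same shape is presumably what a vectorizing macro wants to see, but that is an assumption. `Foo`, `accumulate_above!`, and the `0.5f0` threshold are hypothetical, since the original loop body is not shown.

```julia
# Hypothetical, concretely typed stand-in for the struct in the example.
mutable struct Foo
    s::Float32
end

function accumulate_above!(f::Foo, m::AbstractMatrix{Float32})
    s = f.s                       # read the field into a local accumulator
    @inbounds @simd for i in eachindex(m)
        # ifelse evaluates both arguments and selects one, so no branch is needed
        s += ifelse(m[i] > 0.5f0, m[i], 0f0)
    end
    f.s = s                       # store back once, after the loop
    return f
end

f = Foo(0f0)
m = Float32[0.25 0.75; 1.0 0.5]
accumulate_above!(f, m)
```

With a local `s`, the reduction is a plain scalar accumulation that the compiler can keep in registers, instead of a load and store through the mutable object on every iteration.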
I don't see anything particularly problematic in [the diff between Octavian 0.3.5 and 0.3.6](https://github.com/JuliaLinearAlgebra/Octavian.jl/compare/v0.3.5...v0.3.6) compared to what's already there.
> `export JULIA_CPU_TARGET="generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"`

Out of curiosity, what is your host running? On certain systems, it's possible for LoopVectorization to produce code that is invalid/will crash when you try to run...
Thanks for getting a more minimal reproducer.
I get basically the same results on cascadelake as on tigerlake. For reference, `middle` is vectorizing `Ipre#1#`, i.e. it is vectorizing the tuple of length 3. This is pretty wasteful...