Chris Elrod
Anyway, your example compiled and ran for me for all 1000 iterations. I then reran it, and this time it hung on iteration 731.
```julia
Loop 731
^C^C^C^C^C^CWARNING: Force throwing...
```
That was on an 18 core CPU, but I also have a laptop with an i7-1165G7 so I'll try on that, too. I'll hopefully have time to look at this...
I also have an i7 1165G7:
```julia
julia> versioninfo()
Julia Version 1.7.0-DEV.1150
Commit a08a3ff1f9* (2021-05-22 21:10 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz...
```
Anyway, try upgrading to `ThreadingUtilities` version >= `v"0.4.3"`. It works/no longer hangs for me now on that version (`v"0.4.3"` [should be released](https://github.com/JuliaRegistries/General/pull/37426) within a few minutes of me making this...
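To make the `>= v"0.4.3"` condition concrete, here is a small sketch of the version comparison in plain Julia. The helper name `meets_minimum` is hypothetical and not part of ThreadingUtilities or Pkg; how you obtain the installed version is left out.

```julia
# Hypothetical helper: true when `installed` is at least the version
# mentioned above. VersionNumber values compare component by component.
meets_minimum(installed::VersionNumber) = installed >= v"0.4.3"

older  = meets_minimum(v"0.4.2")   # false
same   = meets_minimum(v"0.4.3")   # true
newer  = meets_minimum(v"0.4.10")  # true: 10 > 3 as numbers

# Comparing the same versions as strings gives the wrong answer,
# because '1' sorts before '3'.
as_text = "0.4.10" >= "0.4.3"      # false
```

Comparing `VersionNumber` values rather than strings avoids the lexicographic trap shown on the last line.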
Hi, thanks for the issue. Currently, it doesn't support multiple return values at all. That is, `sincos` won't work either, where the return value is `Tuple{Float64,Float64}`. Supporting `cis` and `sincos`...
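For reference, this is what "multiple return values" looks like for these two functions in plain Base Julia. It is only an illustration of the return types and does not use LoopVectorization.

```julia
# Plain Base Julia; nothing here uses LoopVectorization.
x = 0.5

# sincos returns both results at once as a tuple.
s, c = sincos(x)
sc_type = typeof(sincos(x))   # Tuple{Float64, Float64}

# cis(x) = cos(x) + im*sin(x), so it also carries two Float64 values,
# packed as the real and imaginary parts of a complex number.
z = cis(x)
z_type = typeof(z)            # ComplexF64
```

In both cases a single scalar input produces two scalar outputs, which is the situation described above.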
On LoopVectorization 0.8.9, you can now do: ```julia using StructArrays, LoopVectorization function vcis!(y::StructArray{ x = rand(1000); julia> vcis(x) ≈ map(cis, x) true julia> @benchmark vcis($x) BenchmarkTools.Trial: memory estimate: 15.88 KiB...
> Thanks, this is great! I had a look through the changes and I wanted to check my understanding. Is it accurate to say that LoopVectorization now supports multiple return...
Hmm, interesting. Unless I'm missing it, [LLVM doesn't seem to support it with an intrinsic](https://llvm.org/docs/LangRef.html) (as of the yet-to-be-released LLVM-10). But it can be implemented via shufflevectors and zext. Do...
> Even if each individual multiplication does not overflow, the sum does with @avx

Yes, like I said, currently `@avx` creates an accumulator of the same type as `promote_type(eltype(a), eltype(b))`....
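The effect of the accumulator type can be seen in a plain scalar loop, without `@avx`. This is a minimal sketch in Base Julia; the function names `dot_same_type` and `dot_widened` and the choice of `Int64` for the wider accumulator are assumptions made for illustration, not what LoopVectorization does internally.

```julia
# Each product is 10 * 1 = 10, which fits in Int8, but the sum of
# twenty of them is 200, which does not (Int8 tops out at 127).
a = fill(Int8(10), 20)
b = fill(Int8(1), 20)

# Accumulator has the promoted element type, here Int8, so the
# running sum wraps around.
function dot_same_type(a, b)
    acc = zero(promote_type(eltype(a), eltype(b)))
    for i in eachindex(a, b)
        acc += a[i] * b[i]
    end
    return acc
end

# Accumulator (and each product) widened to Int64 before summing.
function dot_widened(a, b)
    acc = zero(Int64)
    for i in eachindex(a, b)
        acc += Int64(a[i]) * Int64(b[i])
    end
    return acc
end

wrapped = dot_same_type(a, b)   # -56, i.e. 200 - 256
exact   = dot_widened(a, b)     # 200
```

Converting the inputs to a wider element type before the loop would have the same effect as `dot_widened`, at the cost of a copy.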
Here is one possible implementation: ```julia using SIMDPirates @generated function muladd(a::Vec{W1,T1}, b::Vec{W1,T1}, c::Vec{W2,T2}) where {W1,W2,T1