Chris Elrod
FWIW, it should look more like this:
```julia
julia> VectorizationBase.CACHE_COUNT
(18, 18, 1, 0)

julia> VectorizationBase.COUNTS
Dict{Symbol, Int64} with 19 entries:
  :L3Cache => 1
  :I2Cache => 0
  :Package => 1...
```
Thanks for filing a report there, but the contributors at Hwloc.jl may forward you here: https://github.com/open-mpi/hwloc

Note that VectorizationBase 0.12 doesn't support Julia 1.6. Julia 1.6 won't be out until...
Forgot to update to confirm that LoopVectorization's been using 64 as a default fallback for a while now.
Yes, that comment was a little out of date. You're right. It would be great to start testing it on ARM. I'll have to figure out how to do feature...
There's also this for feature detection:
```julia
using Libdl
llvmlib = Libdl.dlopen(only(filter(lib->occursin(r"LLVM\b", basename(lib)), Libdl.dllist())))
gethostcpufeatures = Libdl.dlsym(llvmlib, :LLVMGetHostCPUFeatures)
features_cstring = ccall(gethostcpufeatures, Cstring, ())
features = filter(ext -> (m = match(r"\d",...
```
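For illustration, here is a rough sketch of what could be done with such a feature string once it has been converted to a Julia `String`. It assumes the usual LLVM convention of a comma-separated list where `+name` marks an enabled feature and `-name` a disabled one. The `parse_features` helper and the sample string are hypothetical, and this is not the filter used in the snippet above.

```julia
# Sketch only: `parse_features` is a hypothetical helper, not part of
# VectorizationBase or LLVM. It assumes a "+name,-name,..." feature string.
function parse_features(s::AbstractString)
    enabled = Set{String}()
    disabled = Set{String}()
    for tok in split(s, ','; keepempty = false)
        if startswith(tok, '+')
            push!(enabled, String(tok[2:end]))   # drop the leading '+'
        elseif startswith(tok, '-')
            push!(disabled, String(tok[2:end]))  # drop the leading '-'
        end
    end
    return enabled, disabled
end

# Made-up sample; a real string would come from the ccall above.
sample = "+sse2,+avx2,-avx512f,+fma"
enabled, disabled = parse_features(sample)
has_avx2 = "avx2" in enabled
```

Keeping the parsing separate from the `ccall` means the logic can be checked on any machine, including ones where the LLVM symbol lookup behaves differently.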
I think hard coding the number of registers would be worth it. Currently, LoopVectorization only uses the number of vector registers and just hopes performance isn't hurt too badly by...
I shouldn't assume that. Lots of non-x86 CPUs only have 2 levels of cache, including the M1.
> I'd be happy to help fix this!

You're welcome to make a PR! You could set all the corresponding references to a missing cache level to `nothing` when it...
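As a minimal sketch of the `nothing` idea, assuming a machine with only two cache levels: the `CacheLevel` struct, the `CACHES` tuple, and `cache_size` below are hypothetical names rather than VectorizationBase's actual layout, and the sizes are placeholders.

```julia
# Hypothetical layout for illustration; not VectorizationBase's real API.
struct CacheLevel
    size_bytes::Int
    line_bytes::Int
end

# Two real levels; the third slot is `nothing` because that level is absent.
# The sizes are placeholder values, not measurements of any CPU.
const CACHES = (CacheLevel(64 * 1024, 64), CacheLevel(4 * 1024 * 1024, 64), nothing)

cache_size(level::Integer) = cache_size(CACHES[level])
cache_size(c::CacheLevel) = c.size_bytes
cache_size(::Nothing) = nothing

# Callers pick a fallback explicitly instead of reading a bogus zero.
l3_or_l2 = something(cache_size(3), cache_size(2))
```

With this shape, code that assumes a third cache level hits a `nothing` it has to handle, rather than silently computing with a size of zero.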
> The functions belong to Base, so obviously that's a no-go. The types belong to Static.jl and SIMDTypes.jl, but those don't really feel like appropriate places to be defining...
Do any `llvmcall`s work with the debugger / JuliaInterpreter?