Mathieu Poumeyrol
What this PR already does: * creates armv6vfp kernel set (sgemm only, adapted from an orphan semi-working kernel in armv7a) * creates arm32vfp configuration using that kernel What I...
I think one of the issues here is that in the x86 world, the floating point unit (or SIMD extensions) is part of the microarchitecture definition. When you say "haswell"...
I don't mind considering alternatives. I have only been a contributor for a short period of time, and there is so much I don't know... My proposal was mostly meant...
Hey @fgvanzee , sorry for the delay, I was trying to figure out what I really need. Ideally, what I would like to have in one single build, with cpuid-like...
Good to know that it makes sense for arm specialists. I'm pretty busy with other stuff right now and prefer not to lose too much focus, and as it looks...
@koallen Mmm, I'm not aware of a "single precision" limitation for armv7 neon. But it would not be vector anymore as the registers are 64 bits... which may just explain why...
@fgvanzee, as discussed, I added the armv7 neon kernel set renaming on top of the PR. I'm no longer 100% sure what adding an arm32neon family would accomplish, so I...
Well, I just fixed a remaining oopsie that should not have been committed, but I think it's now in good shape, and offers a net gain on armv6 platforms with...
@fgvanzee no problem.
Hey! Well, I've done as much as I can on this issue at this point. I am no arm expert, so having somebody review this would be great. The late...