Alex Wells
Historically with vectorizing compilers, a data type can have its members turned into scalars with a "scalar replacement of aggregates" optimization pass. Once a data type has become scalars, its...
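To make the "scalar replacement of aggregates" idea above concrete, here is a minimal C++ sketch. It is not OSL code: `Color3`, `luminance_aggregate`, and `luminance_scalarized` are hypothetical names chosen for illustration. The second function is written by hand in roughly the form such a pass would reduce the first to, with each struct member becoming an independent scalar local.

```cpp
// Hypothetical aggregate type, used only for this illustration.
struct Color3 {
    float r, g, b;
};

// Aggregate form: the locals are structs whose address is never taken,
// which makes them candidates for scalar replacement.
inline float luminance_aggregate(float r, float g, float b) {
    Color3 c{r, g, b};
    Color3 w{0.2126f, 0.7152f, 0.0722f};
    return c.r * w.r + c.g * w.g + c.b * w.b;
}

// Hand-scalarized form: every member is its own scalar variable,
// so no struct object needs to exist in memory.
inline float luminance_scalarized(float r, float g, float b) {
    float c_r = r, c_g = g, c_b = b;
    float w_r = 0.2126f, w_g = 0.7152f, w_b = 0.0722f;
    return c_r * w_r + c_g * w_g + c_b * w_b;
}
```

Both functions compute the same value; the difference is only in whether the data is expressed as an aggregate or as separate scalars.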
@chellmuth, as an experiment, could you add a no-inline attribute to the layer functions to prevent them from being inlined? Not sure you really want to do that, but...
Hold up on this; I'm going to update it so that the work to "borrow" a shading context for opt/jit is internal to ShadingContext and not a burden on its user...
Ok Larry, this is ready for review
@johnfea thanks. I think the question (possibly unanswered) was: with many of the noise and other functions doing internal SIMD using x,y,z or r,g,b,a to take advantage of SSE, would...
@johnfea great to hear it is speeding things up. Curious how Batched does with AVX on the same workloads vs Batched with SSE. As far as backfacing() not working, it...
The Files Changed all look good; it just adds a 4-wide path using the same approaches. I want to see the CI run it, though, and make sure...
@johnfea, can you elaborate on or provide an example of "old and new non-typical width batched code in llvm_util.c isn't covered by testsuite though"?
Looking at the CI action, I see [VFX2021 gcc9/C++17 llvm11 py3.7 exr2.5 oiio2.3 sse2 batch-b4sse2](https://github.com/AcademySoftwareFoundation/OpenShadingLanguage/actions/runs/10418827805/job/28866184078?pr=1825#logs), which successfully executed (with SSE2, batch width 4) 60 different *.regress.batched.opt tests comparing results against scalar...
@johnfea, I got it: CI doesn't execute all combinations of widths 4, 8, 16 and the SSE, AVX, AVX2, AVX512 ISAs, so portions of llvm_util.cpp may be untested. 1. I do think llvm_util.cpp...