Results 113 comments of aneshlya

Template functions in stdlib are not fully supported yet, but they should work after https://github.com/ispc/ispc/pull/2743 is merged.

> It looks like the code creating a LLVM Function for a template instantiation creates incorrect LLVM IR like this:

Have you tried it on top of #2743? It works...

I realized where the problem is: we're setting linkage too early: https://github.com/ispc/ispc/blob/main/src/func.cpp#L1260. At `createLLVMFunction` we can create a function declaration with ExternalLinkage only. The logic regarding function linkage should be...

The foreach index is a varying value, so when you do `values[i]` the result is exactly what the compiler says: a varying pointer to uniform float. I think you need a loop with...

The usual approach is to use a uniform counter like `baseAddr` below: ``` uniform float values[]; uniform uint32 baseAddr = 0; foreach(i = 0 ... W) { float v0; float...

The foreach loop counter is basically `programCount*i + programIndex`. `aos_to_soa2` loads `2 * programCount` values from the given array starting at the given offset, returning two varying results...
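To make the indexing concrete, here is a minimal scalar sketch in C++ of what the comment above describes. It is only a model, not the stdlib implementation: `kProgramCount`, `Varying`, `aos_to_soa2_model`, and `foreach_index` are hypothetical names, the gang width of 4 is an assumption, and returning a pair is a simplification of however the real `aos_to_soa2` hands back its two results.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical gang width for this scalar emulation; in ISPC this would be
// the built-in programCount of the compilation target.
constexpr std::size_t kProgramCount = 4;

// One float per program instance (lane), standing in for a varying float.
using Varying = std::array<float, kProgramCount>;

// Scalar model of the de-interleave: reads 2 * kProgramCount floats starting
// at `offset` and puts even-positioned elements in the first result and
// odd-positioned elements in the second.
std::pair<Varying, Varying> aos_to_soa2_model(const std::vector<float>& src,
                                              std::size_t offset) {
    Varying v0{};
    Varying v1{};
    for (std::size_t lane = 0; lane < kProgramCount; ++lane) {
        v0[lane] = src[offset + 2 * lane];
        v1[lane] = src[offset + 2 * lane + 1];
    }
    return {v0, v1};
}

// Index seen by lane `programIndex` on gang iteration `iter` of a foreach loop.
constexpr std::size_t foreach_index(std::size_t iter, std::size_t programIndex) {
    return kProgramCount * iter + programIndex;
}
```

Because each call consumes `2 * kProgramCount` elements, the offset passed in has to be the same for every lane, which is why a uniform base address is the natural fit rather than the varying foreach index.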

@iperov, do you have other questions about aos_to_soa, or can we close the issue?

ISPC doesn't use self-hosted runners for macOS anymore.

Yes, we need to add target-specific implementations for ARM (and Xe). We've implemented `dot*` functions to evaluate how users will interact with them, assess their convenience, and determine any additional...

I looked closely at the `UDOT`/`SDOT` instructions, and they use mixed-width vectors. For example: ``` @llvm.aarch64.neon.sdot.v4i32.v16i8( %a, %b, %c) ``` **Return Type**: The result is a 4-element vector of 32-bit...
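The mixed-width shape can be illustrated with a scalar reference model in C++. This is a sketch under assumptions, not ISPC or LLVM code: `sdot_model` is a hypothetical name, and it assumes the first operand is the 4 x i32 accumulator while the other two are the 16 x i8 inputs, with each 32-bit lane accumulating the dot product of four adjacent signed 8-bit pairs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar reference model of a 128-bit signed dot-product-accumulate.
// Lane i of the result is acc[i] plus the sum of a[4*i + k] * b[4*i + k]
// for k = 0..3, with each 8-bit input widened to 32 bits before multiplying.
std::array<std::int32_t, 4> sdot_model(std::array<std::int32_t, 4> acc,
                                       const std::array<std::int8_t, 16>& a,
                                       const std::array<std::int8_t, 16>& b) {
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t k = 0; k < 4; ++k) {
            acc[lane] += static_cast<std::int32_t>(a[4 * lane + k]) *
                         static_cast<std::int32_t>(b[4 * lane + k]);
        }
    }
    return acc;
}
```

The point of the model is the 4:1 ratio between input and output elements: a target-specific `dot*` implementation has to map sixteen 8-bit inputs onto four 32-bit accumulators, which does not line up with a one-element-per-lane varying type.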