Michal Babej
> So IIUC, the mapping between threads and accessed memory locations is not likely to stay constant across kernel launches no matter what, right?

With the current driver, yes that's...
@jrprice @inducer when you have time, perhaps try this branch on a NUMA machine: https://github.com/franz/pocl/tree/pthread_experimental_scheduler Affinity enabled by default, kernel WItems are split pre-launch and given to threads (instead of...
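To illustrate what "split pre-launch and given to threads" could look like, here is a minimal sketch, not the code from the branch above. It assumes the work is a flat range of work-group ids and that each thread gets one contiguous chunk; `split_work_groups` is a hypothetical helper name used only for this example.

```python
def split_work_groups(num_groups, num_threads):
    """Return one contiguous (start, end) range of work-group ids per thread.

    The split depends only on the two arguments, so every launch with the
    same sizes gives each thread the same ids (assumption: contiguous
    chunks; the real scheduler may split differently).
    """
    base, extra = divmod(num_groups, num_threads)
    ranges = []
    start = 0
    for thread_id in range(num_threads):
        # The first `extra` threads take one additional work-group.
        size = base + (1 if thread_id < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for thread_id, (start, end) in enumerate(split_work_groups(10, 4)):
        print(f"thread {thread_id}: work-groups [{start}, {end})")
```

Because the assignment is fixed before launch, a pinned thread would touch the same part of the index space on every launch, which is the property a NUMA-aware setup would want to exploit.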
@jrprice @inducer i realized the main problem with NUMA probably isn't the scheduling of threads. If i'm not mistaken, working on remote memory is expensive, and since pocl uses simple...
@pjaaskel the main problem i did not realize back then is that to make intelligent decisions about scheduling on a NUMA machine, the scheduler needs to have knowledge of the...
> Yes, you need to know which regions each WG access if you want to optimize the cache footprint in that case.

@pjaaskel what i meant is, for a NUMA...
@inducer threads can be pinned, but there is no deterministic behaviour right now - it's a simple "pull" model.
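A rough sketch of why a "pull" model is not deterministic, under the assumption that worker threads claim the next unclaimed work-group from a shared counter. This is an illustration only, not pocl's pthread driver; `run_pull_model` and its arguments are hypothetical names.

```python
import threading


def run_pull_model(num_groups, num_threads):
    """Let workers pull work-group ids from a shared counter.

    Returns a list where entry i is the id of the thread that ran
    work-group i. Which thread gets which id depends on timing.
    """
    next_group = 0
    lock = threading.Lock()
    executed_by = [None] * num_groups

    def worker(thread_id):
        nonlocal next_group
        while True:
            with lock:
                if next_group >= num_groups:
                    return
                group = next_group
                next_group += 1
            # Stand-in for actually executing the work-group.
            executed_by[group] = thread_id

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed_by


if __name__ == "__main__":
    print(run_pull_model(16, 4))
```

Every work-group runs exactly once, but the thread-to-work-group mapping can differ between runs, so pinning threads alone does not fix which memory each core touches.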
Pocl currently has Travis CI (though only MacOS X, and it's been quite unmaintained), and Drone CI (i try to maintain this one), which has AMD Ryzens and some ARM64...
I think nobody has tried cross-compiling for ages, it's probably broken. Native build on ARM(64) should just work these days. But you'll have to specify the LLVM CPU: ``cmake -DLLC_HOST_CPU=cortex-a53`` replace...
> The error seems to be with LLVM link.

That usually means you're using an LLVM which wasn't compiled for your system. Are you using LLVM downloaded from llvm.org?
@petecoup cross-compiling is likely entirely broken. @georgebola is slightly mistaken WRT cross-compiling, because -DOCS_AVAILABLE=0 has nothing to do with cross-compilation; it simply compiles pocl without LLVM (natively, not cross),...