Pekka Jääskeläinen
Hi, just wanted to check: what is the status/plan of this PR? It would be great to get this in sooner rather than later so it gets some testing.
Any updates on this? It would be nice to get this cleaned up and pulled in.
Thanks @isuruf for working on this useful feature. Did you check our comments in this thread and try to address them?
Fails the conformance_computeinfo conformance test:
```
...
Device extension mismatch
Extensions only in numeric:
Extensions only in string: cl_ext_device_side_abort
...
```
Was this fixed appropriately in upstream?
@MathiasMagnus I don't think proper subdevice support is implemented yet, but you can try limiting the number of threads used via POCL_MAX_PTHREAD_COUNT. The other environment variables are documented at http://portablecl.org/docs/html/env_variables.html @franz > but...
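As a rough illustration of the POCL_MAX_PTHREAD_COUNT suggestion above, the variable can be set in the environment of the process that runs the OpenCL application. The helper names `env_with_thread_limit` and `run_with_thread_limit` are hypothetical, and the child command is only a stand-in that echoes the variable. This is a minimal sketch and has not been checked against any particular PoCL version.

```python
import os
import subprocess
import sys


def env_with_thread_limit(max_threads, base_env=None):
    """Return a copy of the environment with POCL_MAX_PTHREAD_COUNT set.

    The helper name is hypothetical; only the variable name comes from
    the comment above.
    """
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["POCL_MAX_PTHREAD_COUNT"] = str(max_threads)
    return env


def run_with_thread_limit(command, max_threads):
    """Run `command` (a list of arguments) with the thread limit applied."""
    return subprocess.run(
        command,
        env=env_with_thread_limit(max_threads),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in child process that only echoes the variable; replace it
    # with the actual OpenCL application to be limited.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['POCL_MAX_PTHREAD_COUNT'])",
    ]
    print(run_with_thread_limit(child, 4).stdout.strip())
```

Setting the variable per child process keeps the limit from leaking into unrelated programs started from the same shell.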
> the main problem i did not realize back then is that to make intelligent decisions about scheduling on a NUMA machine, the scheduler needs to have knowledge of the...
So, you suggest that simply spreading the WGs across all of the, say, 64 cores of a dual-socket processor with 32 cores per socket is a fine default behavior and if...
> ...so as to minimize accesses of memory on other nodes. Cache concerns are far behind in significance. It's included in the data locality problem since "memory on other nodes"...
Not sure what the conclusion was here, but come to think of it, disabling any kind of WG load balancing (fixed WG id to core mapping proposed in this thread)...
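For readers skimming the thread, a fixed WG id to core mapping without load balancing could look roughly like the sketch below. The round-robin (modulo) rule and both function names are assumptions made for illustration only; the thread does not specify the mapping, and this is not PoCL's scheduler.

```python
def fixed_wg_to_core(wg_id, num_cores):
    """Map a work-group id to a core with a static round-robin rule.

    The modulo rule is an illustrative assumption, not the mapping
    proposed in the thread.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return wg_id % num_cores


def cores_for_workgroups(num_wgs, num_cores):
    """Return {core: [wg ids]} for a launch of num_wgs work-groups."""
    assignment = {core: [] for core in range(num_cores)}
    for wg_id in range(num_wgs):
        assignment[fixed_wg_to_core(wg_id, num_cores)].append(wg_id)
    return assignment


if __name__ == "__main__":
    # With a static mapping, the same WG id lands on the same core on
    # every launch, regardless of how long each work-group runs.
    print(cores_for_workgroups(6, 4))
```

Because the assignment never changes at run time, a core that receives long-running work-groups cannot hand work to idle cores, which is the trade-off against dynamic balancing.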