AutoKernel

Is multi-core supported? If so, how?

Open sumo2017 opened this issue 4 years ago • 1 comments

As the title says: if I want to compile an operator that runs on multiple cores, how should I do it?

sumo2017 avatar Aug 09 '21 08:08 sumo2017

You can do this with the parallel() scheduling primitive: https://autokernel-docs-en.readthedocs.io/en/latest/tutorials/halide/halide_schedule.html#schedules-within-stages-domain-order

parallel (x,size) | Splits the dimension by size and parallelizes the outer dimension.
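
To make that description concrete, here is a rough plain C++ sketch of the idea behind parallel(x, size): the x loop is split into chunks of `size`, the outer loop over chunks runs concurrently, and the inner loop within each chunk stays serial. This is not Halide or AutoKernel code. The function name `scale_parallel` is made up for illustration, and it spawns one thread per chunk for simplicity, whereas a real runtime would hand the chunks to a thread pool.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Conceptual stand-in for a parallel(x, size) schedule (illustration only):
// multiply every element by `factor`, processing chunks of `size` in parallel.
std::vector<int> scale_parallel(const std::vector<int>& in, int factor, std::size_t size) {
    std::vector<int> out(in.size());
    if (size == 0) size = 1;  // guard against an empty chunk size

    std::vector<std::thread> workers;
    // Outer dimension: one task per chunk, executed concurrently.
    for (std::size_t start = 0; start < in.size(); start += size) {
        const std::size_t end = std::min(start + size, in.size());
        workers.emplace_back([&in, &out, factor, start, end] {
            // Inner dimension: a serial loop over the elements of this chunk.
            for (std::size_t x = start; x < end; ++x) {
                out[x] = in[x] * factor;
            }
        });
    }
    // Wait for every chunk to finish before returning the result.
    for (auto& w : workers) w.join();
    return out;
}
```

Each chunk writes to a disjoint range of `out`, so no locking is needed. That is the same property that makes a pure per-element stage safe to parallelize.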

To control the exact number of threads, you can use the HL_NUM_THREADS environment variable:

HL_NUM_THREADS=... specifies the number of threads to create for the thread pool. When the async scheduling directive is used, more threads than this number may be required and thus allocated. A maximum of 256 threads is allowed. (By default, the number of cores on the host is used.)
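
As a sketch of the rule quoted above, the thread count could be resolved roughly like this in plain C++. The helper names `resolve_num_threads` and `num_threads_from_environment` are hypothetical and are not part of Halide or AutoKernel. Clamping to 256 and falling back to the core count on an unparsable value are assumptions about how the limit and the default might be applied. The extra threads needed for async scheduling are not modeled.

```cpp
#include <cstdlib>
#include <thread>

// Hypothetical helper: pick a thread count from an HL_NUM_THREADS-style value.
// env_value  - raw string from the environment, or nullptr if unset
// host_cores - number of cores detected on the host (0 if unknown)
int resolve_num_threads(const char* env_value, unsigned host_cores) {
    const int kMaxThreads = 256;  // maximum stated in the quoted documentation
    int n = 0;
    if (env_value != nullptr) {
        n = std::atoi(env_value);  // yields 0 for non-numeric input
    }
    if (n <= 0) {
        n = static_cast<int>(host_cores);  // default: number of host cores
    }
    if (n <= 0) {
        n = 1;  // assumption: always keep at least one worker thread
    }
    return n > kMaxThreads ? kMaxThreads : n;  // assumption: clamp, do not fail
}

// Convenience wrapper that reads the real environment and the host core count.
int num_threads_from_environment() {
    return resolve_num_threads(std::getenv("HL_NUM_THREADS"),
                               std::thread::hardware_concurrency());
}
```

In practice you would usually just set the variable when launching the program, for example `HL_NUM_THREADS=4 ./your_program`, where `your_program` is a placeholder for your own binary.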

lyuchuny3 avatar Aug 13 '21 06:08 lyuchuny3