How to implement custom multi-threaded CPU kernels?

Open ZelinMa557 opened this issue 7 months ago • 2 comments

I read the documentation at https://ml-explore.github.io/mlx/build/html/dev/extensions.html#implementing-the-cpu-back-end, and I need to develop a new primitive for my project. The example custom CPU kernel there is single-threaded. How can we implement a multi-threaded CPU kernel? Is there an example?

ZelinMa557, May 15 '25 11:05

As you mention, there is currently no out-of-the-box multi-threading support in the CPU backend. You can deal with this in one of the following two ways:

  1. Use multiple streams and handle multi-threading at the op level.
  2. Create a static thread pool for your primitive, submit tasks to it yourself, and have the primitive wait for their completion.
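
For readers who land here, the second option might look roughly like the sketch below. It uses only the C++ standard library. `ThreadPool`, `pool()`, `parallel_for` and `axpby_parallel` are hypothetical names chosen for illustration, not MLX APIs, and the element-wise `alpha * x + beta * y` loop is only a stand-in for whatever work the primitive's CPU implementation would do. The sketch assumes the loop body does not throw and that `parallel_for` is never called from inside a pool task (that would deadlock).

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper (not part of MLX): a fixed-size pool of worker threads.
class ThreadPool {
 public:
  explicit ThreadPool(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      workers_.emplace_back([this] { work(); });
    }
  }
  ~ThreadPool() {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      stop_ = true;
    }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }
  std::size_t size() const { return workers_.size(); }
  void submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void work() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(mtx_);
        cv_.wait(lk, [this] { return stop_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stop requested and queue drained
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }
  std::mutex mtx_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

// One pool per process, created on first use (the "static thread pool").
ThreadPool& pool() {
  static ThreadPool p(std::max(1u, std::thread::hardware_concurrency()));
  return p;
}

// Split [0, n) into contiguous chunks, run them on the pool, and block
// until every chunk has finished.
void parallel_for(
    std::size_t n,
    const std::function<void(std::size_t, std::size_t)>& body) {
  if (n == 0) return;
  ThreadPool& p = pool();
  std::size_t step = (n + p.size() - 1) / p.size();
  std::mutex m;
  std::condition_variable done;
  std::size_t pending = (n + step - 1) / step;  // number of chunks
  for (std::size_t begin = 0; begin < n; begin += step) {
    std::size_t end = std::min(n, begin + step);
    p.submit([&, begin, end] {
      body(begin, end);
      // Notify while holding the lock so the waiter cannot return (and
      // destroy m/done) before this task is finished with them.
      std::lock_guard<std::mutex> lk(m);
      if (--pending == 0) done.notify_one();
    });
  }
  std::unique_lock<std::mutex> lk(m);
  done.wait(lk, [&] { return pending == 0; });
}

// Stand-in for the body of a primitive's CPU kernel.
void axpby_parallel(const std::vector<float>& x, const std::vector<float>& y,
                    float alpha, float beta, std::vector<float>& out) {
  assert(x.size() == y.size());
  out.resize(x.size());
  parallel_for(x.size(), [&](std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
      out[i] = alpha * x[i] + beta * y[i];
    }
  });
}
```

In a real primitive, the call to something like `parallel_for` would presumably sit inside the CPU evaluation function and operate on the raw input and output buffers. Because it blocks until all chunks are done, the rest of the framework still sees a kernel that has completed when it returns.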

Let us know if you need more help on how to do either.

angeloskath, May 16 '25 05:05

@angeloskath Thanks for your reply! I think a static thread pool will solve my problem. I'm curious how MLX itself supports multi-threaded CPU kernels. I tried reading the code at https://github.com/ml-explore/mlx/tree/main/mlx/backend/cpu, but most ops seem to run in a single thread.

ZelinMa557, May 16 '25 06:05