
[Request] Can't add new parameters to the kernel generated by 6x16-aarch64-neonfp16arith-cortex-a75.S.in

Open WeiMa01 opened this issue 7 months ago • 1 comment

We are trying to add 3 parameters to the 6x16-aarch64-neonfp16arith-cortex-a75.S.in template, so that the generated f16-gemm-6x16-minmax-asm-aarch64-neonfp16arith-cortex-a75.S kernel receives them. The 3 parameters are:

  • size_t index,
  • size_t tile,
  • void* w_head

We also made the following modifications to the code:

  • added a new function, xnn_compute_gemm_fp16, in operator-run.c, modeled on the existing xnn_compute_gemm function,
  • added a bool flag, is_fp16_kernel, to the reshape_fully_connected_nc function in fully-connected-nc.c; if the flag is true, we set fully_connected_op->compute[0].task_2d_tile_2d = (pthreadpool_task_2d_tile_2d_t) xnn_compute_gemm_fp16;

Then we pass the index, tile, and w_head parameters through the xnn_compute_gemm_fp16 function to the f16-gemm-6x16-minmax-asm-aarch64-neonfp16arith-cortex-a75.S kernel.
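For the extra arguments to reach the kernel at all, the function pointer type through which the kernel is called has to declare them too; a call made through a ten-argument type passes only ten arguments. The following is a minimal sketch of that idea, not XNNPACK code. It assumes the stock kernel takes ten arguments (mr, nc, kc, a, a_stride, w, c, cm_stride, cn_stride, params) and that the three new ones are appended at the end. Every name ending in _sketch, the mock kernel, and the use of const void* for params are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical extended kernel type: the ten assumed stock arguments
// followed by the three new ones. This is not an XNNPACK typedef.
typedef void (*f16_gemm_ext_ukernel_fn_sketch)(
    size_t mr, size_t nc, size_t kc,
    const void* a, size_t a_stride,
    const void* w, void* c,
    size_t cm_stride, size_t cn_stride,
    const void* params,  // stand-in for the real params pointer type
    size_t index, size_t tile, void* w_head);

// Records what the mock kernel received, so the call can be checked.
struct captured_args_sketch {
  size_t index;
  size_t tile;
  void* w_head;
};
struct captured_args_sketch g_last_sketch;

// Stand-in for the assembly kernel: it only stores the three extra arguments.
void mock_kernel_sketch(
    size_t mr, size_t nc, size_t kc,
    const void* a, size_t a_stride,
    const void* w, void* c,
    size_t cm_stride, size_t cn_stride,
    const void* params,
    size_t index, size_t tile, void* w_head) {
  (void) mr; (void) nc; (void) kc; (void) a; (void) a_stride;
  (void) w; (void) c; (void) cm_stride; (void) cn_stride; (void) params;
  g_last_sketch.index = index;
  g_last_sketch.tile = tile;
  g_last_sketch.w_head = w_head;
}

// Stand-in for a compute function such as xnn_compute_gemm_fp16: it calls
// the kernel through the extended type, so all thirteen arguments are passed.
void compute_gemm_fp16_sketch(f16_gemm_ext_ukernel_fn_sketch ukernel,
                              size_t index, size_t tile, void* w_head) {
  ukernel(/*mr=*/1, /*nc=*/16, /*kc=*/2,
          /*a=*/NULL, /*a_stride=*/0,
          /*w=*/NULL, /*c=*/NULL,
          /*cm_stride=*/0, /*cn_stride=*/0,
          /*params=*/NULL,
          index, tile, w_head);
}
```

Calling compute_gemm_fp16_sketch(mock_kernel_sketch, 3, 6, ptr) leaves 3, 6 and ptr in g_last_sketch. If the call site still used a ten-argument pointer type, the kernel would read whatever happened to be in those stack slots.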

  • In the assembly kernel, we load the parameters with the following code:

        LDR x8, [sp, 8]     // load index
        MOV x28, x8
        LDR x8, [sp, 16]    // load tile
        MOV x19, x8
        LDR x8, [sp, 24]    // load w_head
        MOV x21, x8

  • We also tried the LDP instruction, but the result is the same.
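Under the AArch64 procedure call standard (AAPCS64), the first eight integer or pointer arguments travel in x0 to x7 and the remaining ones are placed on the stack in order, 8 bytes each for size_t and pointer types. The sketch below computes the resulting offsets. It assumes that the stock kernel has ten arguments with cn_stride at [sp] and params at [sp, 8], that the three new parameters are appended after params, and that sp has not yet been moved by the kernel prologue. The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Returns the byte offset from sp at function entry for the argument at
// zero-based position arg_pos, or -1 if that argument is passed in a
// register (x0..x7). Assumes every argument is an 8-byte integer or
// pointer, which holds for size_t and void* on AArch64.
long stack_offset_at_entry_sketch(size_t arg_pos) {
  const size_t num_int_arg_regs = 8;  // x0..x7
  const size_t slot_size = 8;         // bytes per stack argument slot
  if (arg_pos < num_int_arg_regs) {
    return -1;
  }
  return (long) ((arg_pos - num_int_arg_regs) * slot_size);
}
```

Under those assumptions, positions 10, 11 and 12 (index, tile, w_head) map to [sp, 16], [sp, 24] and [sp, 32], while [sp, 8] still holds params. Any stack adjustment made in the prologue before the loads shifts these offsets by the adjusted amount. Note also that x19 to x28 are callee-saved under AAPCS64, so a kernel that writes x19, x21 or x28 needs to save and restore them.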

With the above modifications, we cannot get the correct values of the 3 parameters, and we suspect that something is missing. Any help would be appreciated; thank you very much for your time.

WeiMa01, May 23 '25 06:05

Hi,

Could you print a stack trace at the start of your kernel (or run in a debugger and set a breakpoint there) to see how your kernel is being called, e.g. from where?

Cheers, Pedro

gonnet, May 23 '25 08:05