Illegal Instruction Warning in riscv_v_first_use_handler

Open mrbilandi opened this issue 1 year ago • 1 comments

Hello

When I try to offload an operation to Ara (in Cheshire+Ara), the kernel repeatedly logs a warning about an illegal instruction exception in riscv_v_first_use_handler.

This is the kernel log:

[ 749.660683] WARNING: CPU: 0 PID: 93 at arch/riscv/kernel/vector.c:158 riscv_v_first_use_handler+0x16a/0x16e
[ 749.663520] Modules linked in:
[ 749.664911] CPU: 0 PID: 93 Comm: hal_cpu_server_ Tainted: G W 6.5.0 #3
[ 749.666311] Hardware name: eth,cheshire (DT)
[ 749.667110] epc : riscv_v_first_use_handler+0x16a/0x16e
[ 749.669199] ra : riscv_v_first_use_handler+0x1c/0x16e
[ 749.671176] epc : ffffffff800064ac ra : ffffffff8000635e sp : ffffffc80015be90
[ 749.672374] gp : ffffffff818bb618 tp : ffffffd801e2a700 t0 : ffffffff804f7170
[ 749.673539] t1 : ffffffff814001f8 t2 : ffffffff81400278 s0 : ffffffc80015bec0
[ 749.674610] s1 : ffffffc80015bee0 a0 : 000000000020112d a1 : 0000000000000003
[ 749.675714] a2 : 000000005208a457 a3 : 0000000000000057 a4 : 0000000000000057
[ 749.676397] a5 : ffffffd801fc0000 a6 : 0000003facf59080 a7 : 0000003facf59080
[ 749.676759] s2 : 000000005208a457 s3 : 0000003facce3ea0 s4 : 0000000000000002
[ 749.677110] s5 : 0000003fad522ea0 s6 : 0000000000005a00 s7 : 0000000000006000
[ 749.677477] s8 : 0000000000005d00 s9 : 0000000000006300 s10: 0000000000006600
[ 749.678460] s11: 0000000000006900 t3 : 0000003facf59080 t4 : 0000003facf59080
[ 749.679527] t5 : 0000003facf59080 t6 : 0000000000000000
[ 749.680553] status: 0000000200000120 badaddr: 0000000000009002 cause: 0000000000000003
[ 749.681675] [] riscv_v_first_use_handler+0x16a/0x16e
[ 749.683897] [] do_trap_insn_illegal+0x28/0xe8
[ 749.686023] [] ret_from_exception+0x0/0x64

Could this issue be related to the number of elements used during vectorization? If so, is there a limit or configuration needed for Ara? Are there diagnostics or debugging steps I can follow to identify the root cause?

Thanks for your attention

mrbilandi, Jan 24 '25 16:01

I printed out the instruction that caused the trap, and it seems to be 0x5208A457. If I am not mistaken, based on the RISC-V vector instruction format, it decodes as follows:

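| Bits | Field Name | Value |
|-------|--------|---------|
| 31:26 | funct6 | 010100 |
| 25 | vm | 1 |
| 24:20 | rs2 | 00000 |
| 19:15 | rs1 | 10001 |
| 14:12 | funct3 | 010 |
| 11:7 | rd | 01000 |
| 6:0 | opcode | 1010111 |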

Opcode 1010111 (OP-V) together with funct3 010 selects the OPMVV category. Is Ara supposed to support this instruction?
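The field extraction can be double-checked with a few lines of Python. This is a minimal sketch, assuming the RVV 1.0 arithmetic instruction layout, in which funct6 occupies bits 31:26 and bit 25 is the mask bit vm. The helper names `decode_opv` and `describe` are hypothetical, and only the one encoding relevant here is named: as far as I can tell from the RVV 1.0 encoding tables, funct6 010100 under OPMVV is the VMUNARY0 group, where the vs1 field 10001 selects `vid.v`. Under that reading the trapping word would be an unmasked `vid.v v8`.

```python
OP_V = 0b1010111  # major opcode shared by RVV arithmetic/config instructions

# funct3 -> operand category, as listed in the RVV 1.0 spec
FUNCT3_CATEGORY = {
    0b000: "OPIVV", 0b001: "OPFVV", 0b010: "OPMVV", 0b011: "OPIVI",
    0b100: "OPIVX", 0b101: "OPFVF", 0b110: "OPMVX", 0b111: "OPCFG",
}


def decode_opv(insn: int) -> dict:
    """Split a 32-bit OP-V instruction word into its raw bit fields."""
    return {
        "funct6": (insn >> 26) & 0x3F,  # bits 31:26
        "vm":     (insn >> 25) & 0x1,   # bit 25, 1 = unmasked
        "vs2":    (insn >> 20) & 0x1F,  # bits 24:20
        "vs1":    (insn >> 15) & 0x1F,  # bits 19:15
        "funct3": (insn >> 12) & 0x7,   # bits 14:12
        "vd":     (insn >> 7) & 0x1F,   # bits 11:7
        "opcode": insn & 0x7F,          # bits 6:0
    }


def describe(insn: int) -> str:
    """Name the instruction if this sketch knows it, else report the category."""
    f = decode_opv(insn)
    if f["opcode"] != OP_V:
        return "not an OP-V instruction"
    category = FUNCT3_CATEGORY[f["funct3"]]
    # Assumption: only the VMUNARY0 / vid.v encoding is recognised here.
    if category == "OPMVV" and f["funct6"] == 0b010100 and f["vs1"] == 0b10001:
        mask = "" if f["vm"] else ", v0.t"
        return f"vid.v v{f['vd']}{mask}"
    return f"{category} funct6={f['funct6']:06b} (not decoded by this sketch)"


if __name__ == "__main__":
    word = 0x5208A457  # value reported for the trapping instruction
    for name, value in decode_opv(word).items():
        print(f"{name:7s} {value:b}")
    print(describe(word))
```

Note that the same word also appears in registers a2 and s2 of the register dump, which is consistent with it being the instruction the handler was inspecting.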

Additionally, in our compiled code, the assembly output contains the "vmerge.vvm" instruction. Is it supported by Ara?
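For comparison, `vmerge.vvm` sits in a different part of the encoding space: in RVV 1.0 it is an OPIVV instruction (funct3 000) with funct6 010111 and vm = 0. A word with funct3 010, such as 0x5208A457, therefore cannot be a `vmerge.vvm`, so the trap reported above seems to come from some other vector instruction in the binary. The sketch below encodes `vmerge.vvm` and checks an arbitrary word against that pattern; the helper names and the register numbers in the example are hypothetical.

```python
OP_V = 0b1010111          # major opcode for RVV arithmetic instructions
VMERGE_FUNCT6 = 0b010111  # funct6 shared by vmerge/vmv in RVV 1.0
OPIVV = 0b000             # funct3 for vector-vector integer operands

# Fixed bits of vmerge.vvm: funct6 (31:26), vm (25), funct3 (14:12), opcode (6:0)
FIXED_MASK = 0xFE00707F


def encode_vmerge_vvm(vd: int, vs2: int, vs1: int) -> int:
    """Encode `vmerge.vvm vd, vs2, vs1, v0` (vm bit is 0: the mask is always used)."""
    return ((VMERGE_FUNCT6 << 26) | (0 << 25) | (vs2 << 20) | (vs1 << 15)
            | (OPIVV << 12) | (vd << 7) | OP_V)


def is_vmerge_vvm(insn: int) -> bool:
    """True if the fixed bits of `insn` match the vmerge.vvm pattern."""
    return (insn & FIXED_MASK) == encode_vmerge_vvm(0, 0, 0)


if __name__ == "__main__":
    example = encode_vmerge_vvm(vd=8, vs2=16, vs1=24)  # hypothetical registers
    print(hex(example), is_vmerge_vvm(example))
    print(hex(0x5208A457), is_vmerge_vvm(0x5208A457))
```

Whether Ara implements either instruction is a separate question that the encoding alone does not answer.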

Thanks

mrbilandi, Jan 31 '25 11:01