
Why does Signal 7 report an error on the armv7a platform in TEST(F16_VCMUL_NEONFP16ARITH_U8, batch_lt_8)?

Open JamesWang2007 opened this issue 11 months ago • 3 comments

Hello, on the armv7a platform the f16-vcmul-test test case currently crashes with a Signal 7 error in some test functions, but I am not sure whether it is a bug. Please help me. The affected tests are:

   TEST(F16_VCMUL__NEONFP16ARITH_U8, batch_lt_8)
   TEST(F16_VCMUL__NEONFP16ARITH_U8, batch_gt_8)
   TEST(F16_VCMUL__NEONFP16ARITH_U8, inplace_a)
   TEST(F16_VCMUL__NEONFP16ARITH_U8, inplace_b)
   TEST(F16_VCMUL__NEONFP16ARITH_U8, inplace_a_and_b)
   TEST(F16_VCMUL__NEONFP16ARITH_U16, batch_lt_16)
   TEST(F16_VCMUL__NEONFP16ARITH_U16, batch_gt_16)
   TEST(F16_VCMUL__NEONFP16ARITH_U16, inplace_a)
   TEST(F16_VCMUL__NEONFP16ARITH_U16, inplace_b)
   TEST(F16_VCMUL__NEONFP16ARITH_U16, inplace_a_and_b)
   TEST(F16_VCMUL__NEONFP16ARITH_U32, batch_lt_32)
   TEST(F16_VCMUL__NEONFP16ARITH_U32, batch_gt_32)
   TEST(F16_VCMUL__NEONFP16ARITH_U32, inplace_a)
   TEST(F16_VCMUL__NEONFP16ARITH_U32, inplace_b)
   TEST(F16_VCMUL__NEONFP16ARITH_U32, inplace_a_and_b)

Log analysis: in the XNNPACK source, the xnn_f16_vcmul_ukernel__neonfp16arith_u8 function casts a (uint16_t *) address to (uint32_t *), and the resulting access requires a 4-byte-aligned address (a memory address that is a multiple of 4). When the address is only 2-byte aligned, the misaligned access faults and the process exits immediately with a Signal 7 error.
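The hazard can be sketched in plain C (a minimal illustration, not XNNPACK code; the two helper names are hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* The problematic pattern: the (uint32_t*) cast promises 4-byte
 * alignment that odd fp16 batch sizes cannot guarantee.  On armv7a
 * the store then faults with SIGBUS (Signal 7). */
void store_2xfp16_unsafe(uint16_t* out, uint32_t packed_pair) {
  *((uint32_t*) out) = packed_pair;  /* may fault if out is only 2-byte aligned */
}

/* An alignment-safe alternative: memcpy carries no alignment
 * assumption, and compilers lower it to a single unaligned-capable
 * store on targets that support one. */
void store_2xfp16_safe(uint16_t* out, uint32_t packed_pair) {
  memcpy(out, &packed_pair, sizeof(packed_pair));
}
```

The safe variant works even when `out` points at an odd fp16 element, i.e. an address that is a multiple of 2 but not of 4.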

1. Related source file paths:

   XNNPACK/src/f16-vcmul/gen/f16-vcmul-neonfp16arith-u8.c
   XNNPACK/src/f16-vcmul/gen/f16-vcmul-neonfp16arith-u16.c
   XNNPACK/src/f16-vcmul/gen/f16-vcmul-neonfp16arith-u32.c

2. Declarations of the failing functions:

   void xnn_f16_vcmul_ukernel__neonfp16arith_u8(
       size_t batch,
       const void* input_a,
       const void* input_b,
       void* output,
       const union xnn_f16_default_params params[restrict XNN_MIN_ELEMENTS(1)]) XNN_OOB_READS

   void xnn_f16_vcmul_ukernel__neonfp16arith_u16(
       size_t batch,
       const void* input_a,
       const void* input_b,
       void* output,
       const union xnn_f16_default_params params[restrict XNN_MIN_ELEMENTS(1)]) XNN_OOB_READS

   void xnn_f16_vcmul_ukernel__neonfp16arith_u32(
       size_t batch,
       const void* input_a,
       const void* input_b,
       void* output,
       const union xnn_f16_default_params params[restrict XNN_MIN_ELEMENTS(1)]) XNN_OOB_READS

3. Call logic for the failing functions:

   // Call optimized micro-kernel.
   vcmul(batch_size() * sizeof(uint16_t), a_data, b_data, y.data(),
         init_params != nullptr ? &params : nullptr);

   // Call optimized micro-kernel.
   vcmul(batch_size() * sizeof(float), a_data, b_data, y.data(),
         init_params != nullptr ? &params : nullptr);
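Since `batch` is passed in bytes and each fp16 element is 2 bytes, an odd element count leaves the tail pointer only 2-byte aligned, which is exactly what the batch_lt_8 cases exercise. A sketch of that arithmetic (an illustration under assumed layout, not the actual test code; `tail_alignment` is a hypothetical helper):

```c
#include <stddef.h>
#include <stdint.h>

/* After a loop has consumed `done` fp16 elements from a 4-byte-aligned
 * base, the tail pointer has advanced by done * sizeof(uint16_t) bytes.
 * Returns the pointer's misalignment relative to 4 bytes:
 * 0 when 4-byte aligned, 2 when only 2-byte aligned. */
size_t tail_alignment(const uint16_t* base, size_t done) {
  const uint16_t* tail = base + done;
  return (size_t) ((uintptr_t) tail % 4);
}
```

For example, a batch of 5 elements leaves the tail at base + 10 bytes, so a subsequent store through a (uint32_t*) cast is misaligned.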

JamesWang2007 avatar Mar 21 '24 02:03 JamesWang2007

Now that I have changed the cast to (void *), the test cases run normally. Could this really be a bug?


JamesWang2007 avatar Mar 21 '24 08:03 JamesWang2007

Hi, thanks for reporting this. Fix incoming

alankelly avatar Mar 27 '24 06:03 alankelly

Thanks for catching the alignment issue.

Before the fix, the store cast the output pointer to (uint32_t*):

   vst1_lane_u32((uint32_t*) or, vreinterpret_u32_f16(vaccr_lo), 0); or += 2;

Note the :32 alignment qualifier on the generated stores below ([r3:32], [r12:32]); it raises an alignment fault (Signal 7) whenever the address is not 4-byte aligned:
   9d740: 04 00 10 e3    tst  r0, #4
   9d744: 03 00 00 0a    beq  0x9d758 <xnn_f16_vcmul_ukernel__neonfp16arith_u8+0xe4> @ imm = #12
   9d748: 3d 38 c3 f4    vst1.32  {d19[0]}, [r3:32]!
   9d74c: a3 34 f3 f2    vext.32  d19, d19, d19, #1
   9d750: 3d 18 cc f4    vst1.32  {d17[0]}, [r12:32]!
   9d754: a1 14 f1 f2    vext.32  d17, d17, d17, #1
   9d758: 02 00 10 e3    tst  r0, #2
   9d75c: 10 80 bd 08    popeq  {r4, pc}

After the fix, the cast is (void*) and the generated stores no longer carry the :32 qualifier:

   vst1_lane_u32((void*) or, vreinterpret_u32_f16(vaccr_lo), 0); or += 2;
   9d740: 04 00 10 e3    tst  r0, #4
   9d744: 03 00 00 0a    beq  0x9d758 <xnn_f16_vcmul_ukernel__neonfp16arith_u8+0xe4> @ imm = #12
   9d748: 0d 38 c3 f4    vst1.32  {d19[0]}, [r3]!
   9d74c: a3 34 f3 f2    vext.32  d19, d19, d19, #1
   9d750: 0d 18 cc f4    vst1.32  {d17[0]}, [r12]!
   9d754: a1 14 f1 f2    vext.32  d17, d17, d17, #1
   9d758: 02 00 10 e3    tst  r0, #2
   9d75c: 10 80 bd 08    popeq  {r4, pc}
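The difference between the two stores can be modeled in portable C (an illustration of the :32 qualifier's semantics only, not the committed fix; function names are hypothetical): the hinted store has a 4-byte-alignment precondition, modeled here as an assert, while the unhinted store has none.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models vst1.32 {dN[0]}, [rM:32]! -- the :32 qualifier makes the
 * hardware fault unless the address is a multiple of 4 (modeled as
 * an assert here). */
void store_with_align_hint(void* p, uint32_t v) {
  assert(((uintptr_t) p % 4) == 0);  /* models the :32 alignment check */
  memcpy(p, &v, sizeof(v));
}

/* Models vst1.32 {dN[0]}, [rM]! -- no alignment precondition, so a
 * 2-byte-aligned fp16 tail pointer is fine. */
void store_no_hint(void* p, uint32_t v) {
  memcpy(p, &v, sizeof(v));
}
```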

fbarchard avatar Mar 27 '24 08:03 fbarchard

Thanks for the report. This was fixed back in March in https://github.com/google/XNNPACK/commit/e5c870c1309a7857ceb9e4f5a2be30087d485874

alankelly avatar Jul 11 '24 13:07 alankelly