ncnn
milkv-duo segmentation fault in proxylessnasnet F0_expand
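Hi nihui,
I ported ncnn to the milkv-duo. When I run benchncnn, a segmentation fault occurs in proxylessnasnet.
I printed the name of each layer as it executed and confirmed that the F0_expand layer is the one that crashes: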
[root@milkv-duo]/home/github/benchmark# ../bin/benchncnn
syscall error -1
loop_count = 4
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 1
A0_linear
B0_expand
B0_linear
B1_linear
C0_expand
C0_linear
C1_linear
C2_linear
C3_linear
D0_expand
D0_linear
D1_linear
D2_linear
D3_linear
E0_expand
E0_linear
E1_linear
E2_linear
E3_linear
F0_expand
Segmentation fault
I then debugged with GDB and traced the fault to the convolution_sgemm_packn function in convolution_sgemm_packn.h:
for (int j = 0; j < nn; j++)
{
    float val0 = *tmpptr++;
    float val1 = *tmpptr++;
    float val2 = *tmpptr++;
    float val3 = *tmpptr++;
    float val4 = *tmpptr++;
    float val5 = *tmpptr++;
    float val6 = *tmpptr++;
    float val7 = *tmpptr++;
    // the segmentation fault occurs on the line below
    vfloat32m1_t _w0 = vle32_v_f32m1(kptr0, vl);
    _sum0 = vfmacc_vf_f32m1(_sum0, val0, _w0, vl);
    _sum1 = vfmacc_vf_f32m1(_sum1, val1, _w0, vl);
    _sum2 = vfmacc_vf_f32m1(_sum2, val2, _w0, vl);
    _sum3 = vfmacc_vf_f32m1(_sum3, val3, _w0, vl);
    _sum4 = vfmacc_vf_f32m1(_sum4, val4, _w0, vl);
    _sum5 = vfmacc_vf_f32m1(_sum5, val5, _w0, vl);
    _sum6 = vfmacc_vf_f32m1(_sum6, val6, _w0, vl);
    _sum7 = vfmacc_vf_f32m1(_sum7, val7, _w0, vl);
    kptr0 += packn;
}
I had hoped to solve this myself, but after spending a long time on it I still have not made any progress.
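For reference, the loop quoted above can be read as the following portable scalar sketch (plain C++, no RVV intrinsics). It is only an illustration of the access pattern, not ncnn code: the function name `sgemm_tile8_scalar`, the explicit buffer lengths, and the zero-initialised accumulators are all hypothetical, and `packn` is assumed to be the number of 32-bit floats in one vector register (4 for a 128-bit register). The explicit length checks show how many floats each iteration reads from `tmpptr` and `kptr0`, which may help when checking whether the faulting load runs past the end of the weight buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the quoted inner loop (not part of ncnn).
// Per iteration it reads 8 floats from tmpptr and packn floats from kptr0,
// and accumulates val[c] * w[l] into 8 accumulators of packn lanes each.
// Returns false instead of reading out of bounds.
bool sgemm_tile8_scalar(const float* tmpptr, std::size_t tmp_len,
                        const float* kptr0, std::size_t k_len,
                        int nn, int packn,
                        std::vector<float>& sums)
{
    if (nn < 0 || packn <= 0) return false;
    // Total reads over the whole loop: nn * 8 inputs, nn * packn weights.
    if (tmp_len < static_cast<std::size_t>(nn) * 8) return false;
    if (k_len < static_cast<std::size_t>(nn) * static_cast<std::size_t>(packn)) return false;

    // Assumption: accumulators start at zero (the real kernel may start from bias).
    sums.assign(static_cast<std::size_t>(8) * packn, 0.f);

    for (int j = 0; j < nn; j++)
    {
        for (int c = 0; c < 8; c++)
        {
            const float val = tmpptr[c];        // val0 .. val7
            for (int l = 0; l < packn; l++)     // one vector lane per l
                sums[static_cast<std::size_t>(c) * packn + l] += val * kptr0[l];
        }
        tmpptr += 8;
        kptr0 += packn;
    }
    return true;
}
```

Comparing `nn * packn` against the size of the buffer that `kptr0` actually points into on the device is one way to tell an out-of-bounds read apart from other causes, such as an unsupported vector instruction or a misaligned load.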