TNN
The non-AVX implementation of the x86 HardSwish layer is incorrect
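The HardSwish layer in TNN models comes in two forms. In the first, input0 and input1 have exactly the same shape, (b, c, h, w). In the second, input0 has shape (b, c, h, w) and input1 has shape (b, c, 1, 1). The non-AVX x86 implementation is: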
`for (int b = 0; b < batch; b++) {
    for (int c = 0; c < channel; c++) {
        auto input_data0 = input_ptr0 + (b * channel + c) * channel_size;
        auto input_data1 = input_ptr1 + (b * channel + c) * channel_size;
        auto output_data = output_ptr + (b * channel + c) * channel_size;
        for (int index = 0; index < channel_size; index++) {
            float tmp = input_data1[index] * alpha + beta;
            tmp = std::min(tmp, 1.f);
            tmp = std::max(tmp, 0.f);
            output_data[index] = input_data0[index] * tmp;
        }
    }
}`
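This code handles only the first case. In the second case input1 holds a single value per (b, c) pair, so offsetting `input_ptr1` by `(b * channel + c) * channel_size` and indexing it with `index` reads past the end of the input1 buffer.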
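One possible fix is sketched below; it is an illustration, not the actual TNN patch. It assumes the caller can tell which case applies and passes that in as a flag. The function name `HardSwishNaive` and the parameter `input1_broadcast` are hypothetical; the other names are taken from the snippet above.

```cpp
#include <algorithm>

// Hypothetical standalone helper, not part of TNN's API.
// input1_broadcast == true means input1 has shape (b, c, 1, 1);
// otherwise input1 has the same shape as input0, (b, c, h, w).
void HardSwishNaive(const float* input_ptr0, const float* input_ptr1, float* output_ptr,
                    int batch, int channel, int channel_size,
                    bool input1_broadcast, float alpha, float beta) {
    for (int b = 0; b < batch; b++) {
        for (int c = 0; c < channel; c++) {
            const int plane = b * channel + c;
            const float* input_data0 = input_ptr0 + plane * channel_size;
            // A broadcast input1 stores one value per (b, c) plane,
            // so its offset is the plane index, not plane * channel_size.
            const float* input_data1 =
                input_ptr1 + (input1_broadcast ? plane : plane * channel_size);
            float* output_data = output_ptr + plane * channel_size;
            for (int index = 0; index < channel_size; index++) {
                // Reuse element 0 for every position when input1 is broadcast.
                float tmp = input_data1[input1_broadcast ? 0 : index] * alpha + beta;
                tmp = std::min(tmp, 1.f);
                tmp = std::max(tmp, 0.f);
                output_data[index] = input_data0[index] * tmp;
            }
        }
    }
}
```

In the broadcast case the clamped factor is the same for a whole plane, so it could also be computed once outside the inner loop; the sketch keeps it inside to stay close to the original code.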