MNN
opt(RVV): Optimize max and min float functions with intrinsics
Summary
Optimize MNNMaxFloat and MNNMinFloat using RVV intrinsics.
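For readers unfamiliar with this kind of kernel, here is a minimal sketch of how a max reduction could be written with RVV intrinsics. It is not the code from this PR. The names `max_float_scalar`, `max_float_rvv_sketch`, `sketch_matches_scalar`, and `PACK` are hypothetical, and the data layout (8 floats per unit, folded into 4 running maxima) is an assumption about the reference kernel, not something stated above. On non-RVV targets the sketch falls back to the scalar loop so it still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

#define PACK 4 /* assumed pack size: 4 output lanes, 2 inputs per lane */

/* Scalar reference under the assumed layout: unit i holds PACK * 2 floats,
 * and elements j*2 and j*2+1 both fold into maxBuffer[j]. */
void max_float_scalar(const float *input, float *maxBuffer, int32_t inputCountUnit) {
    assert(inputCountUnit >= 0);
    for (int32_t i = 0; i < inputCountUnit; i++) {
        for (int j = 0; j < PACK; j++) {
            for (int m = 0; m < 2; m++) {
                float v = input[(size_t)i * PACK * 2 + j * 2 + m];
                if (v > maxBuffer[j]) maxBuffer[j] = v;
            }
        }
    }
}

/* RVV sketch: for each output lane, walk the units with a strided load and
 * fold every chunk into a one-element accumulator with vfredmax. */
void max_float_rvv_sketch(const float *input, float *maxBuffer, int32_t inputCountUnit) {
    assert(inputCountUnit >= 0);
#if defined(__riscv_vector)
    const ptrdiff_t stride = PACK * 2 * sizeof(float); /* bytes between units */
    for (int j = 0; j < PACK; j++) {
        vfloat32m1_t acc = __riscv_vfmv_s_f_f32m1(maxBuffer[j], 1);
        for (int m = 0; m < 2; m++) {
            const float *p = input + j * 2 + m;
            size_t n = (size_t)inputCountUnit;
            while (n > 0) {
                size_t vl = __riscv_vsetvl_e32m8(n); /* elements handled this pass */
                vfloat32m8_t v = __riscv_vlse32_v_f32m8(p, stride, vl);
                acc = __riscv_vfredmax_vs_f32m8_f32m1(v, acc, vl);
                p += vl * PACK * 2;
                n -= vl;
            }
        }
        maxBuffer[j] = __riscv_vfmv_f_s_f32m1_f32(acc);
    }
#else
    /* No vector extension available: keep the sketch buildable. */
    max_float_scalar(input, maxBuffer, inputCountUnit);
#endif
}

/* Hypothetical self-check: returns 1 when both versions agree. */
int sketch_matches_scalar(int32_t units) {
    float in[64 * PACK * 2];
    float a[PACK], b[PACK];
    assert(units >= 0 && units <= 64);
    for (int k = 0; k < units * PACK * 2; k++) {
        in[k] = (float)((k * 37) % 101) - 50.0f; /* deterministic test data */
    }
    for (int j = 0; j < PACK; j++) a[j] = b[j] = -1.0e30f;
    max_float_scalar(in, a, units);
    max_float_rvv_sketch(in, b, units);
    for (int j = 0; j < PACK; j++) {
        if (a[j] != b[j]) return 0;
    }
    return 1;
}
```

A min variant would presumably swap in `__riscv_vfredmin_vs_f32m8_f32m1` and flip the scalar comparison. The per-chunk reduction keeps tail handling trivial; a tuned kernel would more likely accumulate with `vfmax` and reduce once at the end.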
Environment
- Platform: SG2044
- OS: EulixOS 3.0
Benchmark
<details>
<summary>Click to expand full test logs</summary>
[root@openeuler-riscv64 hebo]# ./test_max_float
inputCountUnit=4
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 0.04x
Test inputCountUnit=4: PASSED
inputCountUnit=1
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 0.00x
Test inputCountUnit=1: PASSED
inputCountUnit=3
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 0.00x
Test inputCountUnit=3: PASSED
inputCountUnit=65536
Scalar time: 0.0086 sec
RVV time : 0.0017 sec
Speedup : 5.02x
Test inputCountUnit=65536: PASSED
inputCountUnit=1000000
Scalar time: 0.1321 sec
RVV time : 0.0338 sec
Speedup : 3.91x
Test inputCountUnit=1000000: PASSED
inputCountUnit=10000000
Scalar time: 1.3309 sec
RVV time : 0.3729 sec
Speedup : 3.57x
Test inputCountUnit=10000000: PASSED
All tests PASSED
[root@openeuler-riscv64 hebo]# ./test_min_float
inputCountUnit=4
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 0.08x
Test inputCountUnit=4: PASSED
inputCountUnit=1
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 0.00x
Test inputCountUnit=1: PASSED
inputCountUnit=3
Scalar time: 0.0000 sec
RVV time : 0.0000 sec
Speedup : 1.00x
Test inputCountUnit=3: PASSED
inputCountUnit=65536
Scalar time: 0.0105 sec
RVV time : 0.0017 sec
Speedup : 6.34x
Test inputCountUnit=65536: PASSED
inputCountUnit=1000000
Scalar time: 0.1587 sec
RVV time : 0.0340 sec
Speedup : 4.67x
Test inputCountUnit=1000000: PASSED
inputCountUnit=10000000
Scalar time: 1.5811 sec
RVV time : 0.3776 sec
Speedup : 4.19x
Test inputCountUnit=10000000: PASSED
All tests PASSED
</details>
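The test programs themselves are not included here, so the following is only a sketch of how the "Scalar time", "RVV time", and "Speedup" figures in the logs could be derived. The names `kernel_fn`, `time_kernel`, and `speedup_ratio` are hypothetical, and `clock()` is an assumed timer; the real harness may use something else.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical callback type: one invocation runs the kernel once. */
typedef void (*kernel_fn)(void *ctx);

/* Average CPU time in seconds of one call, measured over `repeats` runs. */
double time_kernel(kernel_fn fn, void *ctx, int repeats) {
    assert(repeats > 0);
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        fn(ctx);
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}

/* Speedup as scalar time divided by RVV time; returns 0.0 when the RVV
 * time is too small to measure, so the ratio is not meaningful. */
double speedup_ratio(double scalar_sec, double rvv_sec) {
    if (rvv_sec <= 0.0) {
        return 0.0;
    }
    return scalar_sec / rvv_sec;
}
```

With the logged times for `inputCountUnit=10000000` in `test_max_float` (1.3309 s and 0.3729 s), this ratio comes out at about 3.57, matching the log. For `inputCountUnit` values of 1, 3, and 4, both times print as 0.0000 sec, so those speedup figures are probably dominated by timer resolution and should not be read as real slowdowns.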