suricata
suricata copied to clipboard
Implement Memcmp SIMD for arm64 NEON and SVE
Make sure these boxes are signed before submitting your Pull Request -- thank you.
- [X] I have read the contributing guide lines at https://suricata.readthedocs.io/en/latest/devguide/codebase/contributing/contribution-process.html
- [X] I have signed the Open Information Security Foundation contribution agreement at https://suricata.io/about/contribution-agreement/ A request has been sent to add me to our corporate contribution agreemen
- [X] I have updated the user guide (in doc/userguide/) to reflect the changes made (if applicable) NA
Link to redmine ticket:
Describe changes:
- lengthen memcmp test input length to exercise simd instructions
- add timing assembly for arm64
- implement Memcmp SIMD for arm64 NEON and SVE
Provide values to any of the below to override the defaults.
To use a pull request use a branch name like pr/N
where N
is the pull request number.
SV_REPO=
SV_BRANCH=
SU_REPO=
SU_BRANCH=
LIBHTP_REPO=
LIBHTP_BRANCH=
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (
6896a93
) 82.12% compared to head (4158a9b
) 82.14%.
Additional details and impacted files
@@ Coverage Diff @@
## master #8764 +/- ##
==========================================
+ Coverage 82.12% 82.14% +0.02%
==========================================
Files 975 975
Lines 271724 271736 +12
==========================================
+ Hits 223151 223222 +71
+ Misses 48573 48514 -59
Flag | Coverage Δ | |
---|---|---|
fuzzcorpus | 62.79% <ø> (+0.07%) |
:arrow_up: |
suricata-verify | 61.41% <ø> (-0.01%) |
:arrow_down: |
unittests | 62.84% <100.00%> (+<0.01%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Could you also update the SIMD support
line in --build-info
?
E.g.
./src/suricata --build-info|grep SSE
SIMD support: SSE_4_2 SSE_4_1 SSE_3
Did you test this with suricata-verify
? We have no arm64 running in our public CI, and I think my personal arm64 devices have no NEON/SVE, although I'm not quite sure. My most modern runner (rock 5b) has A76 cores. Not sure if that is compatible.
Formatting will need to be fixed.
Additionally, I would love to see to benchmarks on what the effect of using the SIMD is here.
Thanks for looking @victorjulien
Did you test this with suricata-verify?
PASSED: 1200
FAILED: 0
SKIPPED: 68
Formatting will need to be fixed.
Done
I think my personal arm64 devices have no NEON/SVE
A76 has NEON, but doesn't have SVE
Could you also update the SIMD support line in --build-info?
./src/suricata --build-info | grep SIMD
SIMD support: NEON SVE
Performance
It's obviously input dependent. If the strings are always going to fail on the first character the scalar implementation is great while the SIMD implementation has done extra work for 15 bytes. I'm partly making the assumption that this was a worthwhile optimization for SSE/AVX so it should be for NEON since i don't have a good set of data to throw at the problem. I have created a little micro-benchmark of 20 different strings some that fail early, some late, and some matches. This is part of the reason you don't see a SCMemcmpLowercase SVE implementation, as i can't make that faster than the NEON one even the the code is far more elegant. On my tests the NEON implementation of SCMemcmpLowercase s about 40% faster. The NEON implementation of SCMemcmp is just memcmp() since that is what we'd do anyway and the SVE SCMemcmp impl is around 10% faster.
victorjulien@ anything else you’d like to see?
thanks for the ping. I'll improve them after the holidays and resubmit.
NOTE: This PR may contain new authors.
NOTE: This PR may contain new authors.
Could you please resubmit as a new rebased PR ? This is Suricata's Github workflow cf https://docs.suricata.io/en/latest/devguide/contributing/code-submission-process.html#pull-requests
Closing as stale, please feel free to reopen a new rebased PR with the suggestions.