suricata icon indicating copy to clipboard operation
suricata copied to clipboard

Implement Memcmp SIMD for arm64 NEON and SVE

Open AGSaidi opened this issue 1 year ago • 9 comments

Make sure these boxes are signed before submitting your Pull Request -- thank you.

  • [X] I have read the contributing guide lines at https://suricata.readthedocs.io/en/latest/devguide/codebase/contributing/contribution-process.html
  • [X] I have signed the Open Information Security Foundation contribution agreement at https://suricata.io/about/contribution-agreement/ A request has been sent to add me to our corporate contribution agreemen
  • [X] I have updated the user guide (in doc/userguide/) to reflect the changes made (if applicable) NA

Link to redmine ticket:

Describe changes:

  • lengthen memcmp test input length to exercise simd instructions
  • add timing assembly for arm64
  • implement Memcmp SIMD for arm64 NEON and SVE

Provide values to any of the below to override the defaults.

To use a pull request use a branch name like pr/N where N is the pull request number.

SV_REPO=
SV_BRANCH=
SU_REPO=
SU_BRANCH=
LIBHTP_REPO=
LIBHTP_BRANCH=

AGSaidi avatar Apr 25 '23 20:04 AGSaidi

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (6896a93) 82.12% compared to head (4158a9b) 82.14%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8764      +/-   ##
==========================================
+ Coverage   82.12%   82.14%   +0.02%     
==========================================
  Files         975      975              
  Lines      271724   271736      +12     
==========================================
+ Hits       223151   223222      +71     
+ Misses      48573    48514      -59     
Flag Coverage Δ
fuzzcorpus 62.79% <ø> (+0.07%) :arrow_up:
suricata-verify 61.41% <ø> (-0.01%) :arrow_down:
unittests 62.84% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

codecov[bot] avatar Apr 26 '23 05:04 codecov[bot]

Could you also update the SIMD support line in --build-info?

E.g.

./src/suricata --build-info|grep SSE
SIMD support: SSE_4_2 SSE_4_1 SSE_3

Did you test this with suricata-verify? We have no arm64 running in our public CI, and I think my personal arm64 devices have no NEON/SVE, although I'm not quite sure. My most modern runner (rock 5b) has A76 cores. Not sure if that is compatible.

Formatting will need to be fixed.

victorjulien avatar Apr 26 '23 05:04 victorjulien

Additionally, I would love to see to benchmarks on what the effect of using the SIMD is here.

victorjulien avatar Apr 26 '23 06:04 victorjulien

Thanks for looking @victorjulien

Did you test this with suricata-verify?

PASSED:  1200
FAILED:  0
SKIPPED: 68

Formatting will need to be fixed.

Done

I think my personal arm64 devices have no NEON/SVE

A76 has NEON, but doesn't have SVE

Could you also update the SIMD support line in --build-info?

./src/suricata --build-info | grep SIMD
SIMD support: NEON SVE

Performance

It's obviously input dependent. If the strings are always going to fail on the first character the scalar implementation is great while the SIMD implementation has done extra work for 15 bytes. I'm partly making the assumption that this was a worthwhile optimization for SSE/AVX so it should be for NEON since i don't have a good set of data to throw at the problem. I have created a little micro-benchmark of 20 different strings some that fail early, some late, and some matches. This is part of the reason you don't see a SCMemcmpLowercase SVE implementation, as i can't make that faster than the NEON one even the the code is far more elegant. On my tests the NEON implementation of SCMemcmpLowercase s about 40% faster. The NEON implementation of SCMemcmp is just memcmp() since that is what we'd do anyway and the SVE SCMemcmp impl is around 10% faster.

AGSaidi avatar Apr 26 '23 15:04 AGSaidi

victorjulien@ anything else you’d like to see?

AGSaidi avatar May 05 '23 00:05 AGSaidi

thanks for the ping. I'll improve them after the holidays and resubmit.

AGSaidi avatar Dec 21 '23 20:12 AGSaidi

NOTE: This PR may contain new authors.

github-actions[bot] avatar Jan 18 '24 10:01 github-actions[bot]

NOTE: This PR may contain new authors.

github-actions[bot] avatar Jan 18 '24 20:01 github-actions[bot]

Could you please resubmit as a new rebased PR ? This is Suricata's Github workflow cf https://docs.suricata.io/en/latest/devguide/contributing/code-submission-process.html#pull-requests

catenacyber avatar Mar 01 '24 21:03 catenacyber

Closing as stale, please feel free to reopen a new rebased PR with the suggestions.

catenacyber avatar May 14 '24 12:05 catenacyber