crypto
crypto copied to clipboard
crypto/chacha20, crypto/poly1305: add MIPSLE assembly version
Add assembly optimized versions for ChaCha20 and Poly1305 crypto algorithms for MIPSLE.
The algorithms have been ported from other ASM implementations, both of which are dual licensed under “GPL-2.0 OR MIT”
- https://github.com/torvalds/linux/blob/1b294a1f35616977caddaddf3e9d28e576a1adbc/arch/mips/crypto/chacha-core.S
- https://github.com/WireGuard/wireguard-monolithic-historical/blob/edad0d6e99e5133b1e8e865d727a25fff6399cb4/src/crypto/zinc/poly1305/poly1305-mips.S
The following are benchmarks done on a MT7688. It compares the base go implementation with the assembly version, once with a MIPS32r1 IS and once with MIPS32r2 IS.
goos: linux goarch: mipsle pkg: golang.org/x/crypto/chacha20 │ old.txt │ asm.txt │ asm-mips32r2.txt │ │ B/s │ B/s vs base │ B/s vs base │ ChaCha20/64 4.015Mi ± 1% 10.376Mi ± 1% +158.43% (p=0.000 n=10) 13.485Mi ± 2% +235.87% (p=0.000 n=10) ChaCha20/256 4.473Mi ± 1% 12.846Mi ± 1% +187.21% (p=0.000 n=10) 18.859Mi ± 3% +321.64% (p=0.000 n=10) ChaCha20/10x25 3.119Mi ± 1% 6.104Mi ± 2% +95.72% (p=0.000 n=10) 7.181Mi ± 3% +130.28% (p=0.000 n=10) ChaCha20/4096 4.659Mi ± 4% 13.609Mi ± 4% +192.12% (p=0.000 n=10) 20.270Mi ± 5% +335.11% (p=0.000 n=10) ChaCha20/100x40 4.020Mi ± 2% 9.918Mi ± 3% +146.74% (p=0.000 n=10) 13.433Mi ± 5% +234.16% (p=0.000 n=10) ChaCha20/65536 4.301Mi ± 1% 9.727Mi ± 1% +126.16% (p=0.000 n=10) 12.393Mi ± 0% +188.14% (p=0.000 n=10) ChaCha20/1000x65 4.187Mi ± 1% 10.076Mi ± 2% +140.66% (p=0.000 n=10) 13.032Mi ± 2% +211.28% (p=0.000 n=10) geomean 4.082Mi 10.11Mi +147.56% 13.47Mi +229.90%
pkg: golang.org/x/crypto/internal/poly1305 │ old.txt │ asm.txt │ asm-mips32r2.txt │ │ B/s │ B/s vs base │ B/s vs base │ 64 5.307Mi ± 0% 21.009Mi ± 0% +295.87% (p=0.000 n=10) 20.938Mi ± 0% +294.52% (p=0.000 n=10) 1K 6.566Mi ± 1% 66.676Mi ± 0% +915.47% (p=0.000 n=10) 66.042Mi ± 0% +905.81% (p=0.000 n=10) 2M 5.140Mi ± 1% 47.135Mi ± 0% +816.98% (p=0.000 n=10) 47.016Mi ± 0% +814.66% (p=0.000 n=10) 64Unaligned 5.322Mi ± 1% 21.024Mi ± 0% +295.07% (p=0.000 n=10) 20.871Mi ± 1% +292.20% (p=0.000 n=10) 1KUnaligned 6.561Mi ± 0% 66.614Mi ± 0% +915.26% (p=0.000 n=10) 66.333Mi ± 0% +910.97% (p=0.000 n=10) 2MUnaligned 5.140Mi ± 1% 47.197Mi ± 1% +818.18% (p=0.000 n=10) 47.126Mi ± 0% +816.79% (p=0.000 n=10) Write64 6.599Mi ± 0% 57.268Mi ± 0% +767.77% (p=0.000 n=10) 57.368Mi ± 0% +769.29% (p=0.000 n=10) Write1K 6.819Mi ± 0% 79.408Mi ± 0% +1064.55% (p=0.000 n=10) 79.246Mi ± 0% +1062.17% (p=0.000 n=10) Write2M 5.140Mi ± 0% 47.169Mi ± 0% +817.63% (p=0.000 n=10) 47.116Mi ± 0% +816.60% (p=0.000 n=10) Write64Unaligned 6.428Mi ± 3% 56.992Mi ± 1% +786.65% (p=0.000 n=10) 56.424Mi ± 1% +777.82% (p=0.000 n=10) Write1KUnaligned 6.814Mi ± 2% 79.293Mi ± 0% +1063.68% (p=0.000 n=10) 79.513Mi ± 0% +1066.90% (p=0.000 n=10) Write2MUnaligned 5.016Mi ± 2% 47.183Mi ± 1% +840.59% (p=0.000 n=10) 47.183Mi ± 0% +840.59% (p=0.000 n=10) geomean 5.858Mi 49.17Mi +739.29% 49.02Mi +736.70%
pkg: golang.org/x/crypto/chacha20poly1305 │ old.txt │ asm.txt │ asm-mips32r2.txt │ │ B/s │ B/s vs base │ B/s vs base │ Chacha20Poly1305/Open-64 1.230Mi ± 4% 3.042Mi ± 1% +147.29% (p=0.000 n=10) 3.548Mi ± 2% +188.37% (p=0.000 n=10) Chacha20Poly1305/Seal-64 1.144Mi ± 1% 3.462Mi ± 1% +202.50% (p=0.000 n=10) 3.810Mi ± 1% +232.92% (p=0.000 n=10) Chacha20Poly1305/Open-64-X 908.2Ki ± 1% 1718.8Ki ± 2% +89.25% (p=0.000 n=10) 1840.8Ki ± 2% +102.69% (p=0.000 n=10) Chacha20Poly1305/Seal-64-X 839.8Ki ± 1% 1894.5Ki ± 2% +125.58% (p=0.000 n=10) 2006.8Ki ± 2% +138.95% (p=0.000 n=10) Chacha20Poly1305/Open-1024 2.594Mi ± 3% 9.975Mi ± 1% +284.56% (p=0.000 n=10) 13.208Mi ± 3% +409.19% (p=0.000 n=10) Chacha20Poly1305/Seal-1024 2.551Mi ± 1% 10.600Mi ± 2% +315.51% (p=0.000 n=10) 14.353Mi ± 3% +462.62% (p=0.000 n=10) Chacha20Poly1305/Open-1024-X 2.470Mi ± 0% 8.569Mi ± 0% +246.91% (p=0.000 n=10) 10.705Mi ± 2% +333.40% (p=0.000 n=10) Chacha20Poly1305/Seal-1024-X 2.413Mi ± 1% 9.036Mi ± 1% +274.51% (p=0.000 n=10) 11.330Mi ± 1% +369.57% (p=0.000 n=10) Chacha20Poly1305/Open-1350 2.594Mi ± 3% 9.899Mi ± 2% +281.62% (p=0.000 n=10) 13.237Mi ± 2% +410.29% (p=0.000 n=10) Chacha20Poly1305/Seal-1350 2.556Mi ± 1% 10.471Mi ± 1% +309.70% (p=0.000 n=10) 13.452Mi ± 1% +426.31% (p=0.000 n=10) Chacha20Poly1305/Open-1350-X 2.503Mi ± 2% 8.817Mi ± 1% +252.19% (p=0.000 n=10) 11.382Mi ± 1% +354.67% (p=0.000 n=10) Chacha20Poly1305/Seal-1350-X 2.460Mi ± 0% 9.093Mi ± 1% +269.57% (p=0.000 n=10) 11.873Mi ± 2% +382.56% (p=0.000 n=10) Chacha20Poly1305/Open-2048 2.694Mi ± 2% 11.024Mi ± 2% +309.20% (p=0.000 n=10) 14.963Mi ± 1% +455.40% (p=0.000 n=10) Chacha20Poly1305/Seal-2048 2.699Mi ± 0% 11.477Mi ± 2% +325.27% (p=0.000 n=10) 15.240Mi ± 1% +464.66% (p=0.000 n=10) Chacha20Poly1305/Open-2048-X 2.637Mi ± 1% 10.056Mi ± 1% +281.37% (p=0.000 n=10) 13.375Mi ± 1% +407.23% (p=0.000 n=10) Chacha20Poly1305/Seal-2048-X 2.627Mi ± 1% 10.328Mi ± 2% +293.10% (p=0.000 n=10) 13.819Mi ± 2% +425.95% (p=0.000 n=10) Chacha20Poly1305/Open-4096 2.732Mi ± 5% 11.225Mi ± 4% +310.82% (p=0.000 n=10) 16.041Mi ± 4% +487.09% (p=0.000 n=10) Chacha20Poly1305/Seal-4096 2.704Mi ± 2% 10.839Mi ± 7% +300.88% (p=0.000 n=10) 15.693Mi ± 7% +480.42% (p=0.000 n=10) Chacha20Poly1305/Open-4096-X 2.670Mi ± 1% 10.381Mi ± 4% +288.75% (p=0.000 n=10) 15.035Mi ± 4% +463.04% (p=0.000 n=10) Chacha20Poly1305/Seal-4096-X 2.680Mi ± 1% 10.867Mi ± 5% +305.52% (p=0.000 n=10) 15.421Mi ± 7% +475.44% (p=0.000 n=10) Chacha20Poly1305/Open-8192 2.708Mi ± 2% 11.053Mi ± 3% +308.10% (p=0.000 n=10) 15.926Mi ± 5% +488.03% (p=0.000 n=10) Chacha20Poly1305/Seal-8192 2.632Mi ± 4% 10.896Mi ± 6% +313.95% (p=0.000 n=10) 16.031Mi ± 5% +509.06% (p=0.000 n=10) Chacha20Poly1305/Open-8192-X 2.666Mi ± 4% 10.948Mi ± 4% +310.73% (p=0.000 n=10) 15.855Mi ± 3% +494.81% (p=0.000 n=10) Chacha20Poly1305/Seal-8192-X 2.637Mi ± 2% 10.805Mi ± 2% +309.76% (p=0.000 n=10) 14.725Mi ± 6% +458.41% (p=0.000 n=10) Chacha20Poly1305/Open-16384 2.499Mi ± 4% 10.405Mi ± 13% +316.41% (p=0.000 n=10) 13.628Mi ± 7% +445.42% (p=0.000 n=10) Chacha20Poly1305/Seal-16384 2.484Mi ± 4% 9.069Mi ± 4% +265.07% (p=0.000 n=10) 12.131Mi ± 3% +388.29% (p=0.000 n=10) Chacha20Poly1305/Open-16384-X 2.389Mi ± 7% 10.028Mi ± 5% +319.76% (p=0.000 n=10) 14.472Mi ± 3% +505.79% (p=0.000 n=10) Chacha20Poly1305/Seal-16384-X 2.475Mi ± 4% 9.084Mi ± 2% +267.05% (p=0.000 n=10) 12.212Mi ± 6% +393.45% (p=0.000 n=10) geomean 2.259Mi 8.271Mi +266.21% 10.90Mi +382.79%
Fixes golang/go#39139
This PR (HEAD: 2de561af3f503613e18d4cfdfc2b17b07d79dff3) has been imported to Gerrit for code review.
Please visit Gerrit at https://go-review.googlesource.com/c/crypto/+/585755.
Important tips:
- Don't comment on this PR. All discussion takes place in Gerrit.
- You need a Gmail or other Google account to log in to Gerrit.
- To change your code in response to feedback:
- Push a new commit to the branch used by your GitHub PR.
- A new "patch set" will then appear in Gerrit.
- Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
- Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
- Multiple commits in the PR will be squashed by GerritBot.
- The title and description of the GitHub PR are used to construct the final commit message.
- Edit these as needed via the GitHub web interface (not via Gerrit or git).
- You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
- See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.
Message from Gopher Robot:
Patch Set 1:
(1 comment)
Please don’t reply on this GitHub thread. Visit golang.org/cl/585755. After addressing review feedback, remember to publish your drafts!
Message from --:
Patch Set 2:
(1 comment)
Please don’t reply on this GitHub thread. Visit golang.org/cl/585755. After addressing review feedback, remember to publish your drafts!
Message from --:
Patch Set 2:
(1 comment)
Please don’t reply on this GitHub thread. Visit golang.org/cl/585755. After addressing review feedback, remember to publish your drafts!