factor: Optimize factor command with advanced factorization algorithms
Implement modular factorization architecture with Pollard-Rho, ECM, and Fermat algorithms. Adds Montgomery arithmetic optimization and u64 fast-path for improved u128 performance. Dynamic algorithm selection based on input size routes through unified entry point.
Fixes #1456
--- Testing: 12 ---
uutils: 12: 2 2 3
GNU: 12: 2 2 3
uutils time: real 0m0.007s
GNU time: real 0m0.007s
--- Testing: 1000000007 ---
uutils: 1000000007: 1000000007
GNU: 1000000007: 1000000007
uutils time: real 0m0.005s
GNU time: real 0m0.006s
--- Testing: 9999999999999999999 ---
uutils: 9999999999999999999: 3 3 1111111111111111111
GNU: 9999999999999999999: 3 3 1111111111111111111
uutils time: real 0m0.004s
GNU time: real 0m0.007s
--- Testing: 18446744073709551557 ---
uutils: 18446744073709551557: 18446744073709551557
GNU: 18446744073709551557: 18446744073709551557
uutils time: real 0m0.004s
GNU time: real 0m0.007s
--- Testing: 123456789012345678901234567890 ---
uutils: 123456789012345678901234567890: 2 3 3 3 5 7 13 31 37 211 241 2161 3607 3803 2906161
GNU: 123456789012345678901234567890: 2 3 3 3 5 7 13 31 37 211 241 2161 3607 3803 2906161
uutils time: real 0m0.004s
GNU time: real 0m0.006s
--- Testing: 2305843009213693951 ---
uutils: 2305843009213693951: 2305843009213693951
GNU: 2305843009213693951: 2305843009213693951
uutils time: real 0m0.004s
GNU time: real 0m0.007s
--- Testing: 6469693230 ---
uutils: 6469693230: 2 3 5 7 11 13 17 19 23 29
GNU: 6469693230: 2 3 5 7 11 13 17 19 23 29
uutils time: real 0m0.004s
GNU time: real 0m0.006s
--- Testing: 340282366920938463463374607431768211297 ---
uutils: 340282366920938463463374607431768211297: 340282366920938463463374607431768211297
GNU: 340282366920938463463374607431768211297: 340282366920938463463374607431768211297
uutils time: real 0m0.004s
GNU time: real 0m0.006s
--- Testing: 9223372036854775783 ---
uutils: 9223372036854775783: 9223372036854775783
GNU: 9223372036854775783: 9223372036854775783
uutils time: real 0m0.004s
GNU time: real 0m0.007s
--- Testing: 18446744073709551615 ---
uutils: 18446744073709551615: 3 5 17 257 641 65537 6700417
GNU: 18446744073709551615: 3 5 17 257 641 65537 6700417
uutils time: real 0m0.010s
GNU time: real 0m0.006s
Benchmark 1: ./target/release/factor 18446744073709551557 (64-bit prime near 2^64)
Time (mean ± σ): 35.1 ms ± 1.8 ms [User: 25.7 ms, System: 8.1 ms]
Output: 18446744073709551557 (prime)
Benchmark 2: ./target/release/factor 9999999999999999999 (large u64)
Time (mean ± σ): 34.8 ms ± 1.2 ms [User: 25.7 ms, System: 7.9 ms]
Output: 3 3 1111111111111111111
Benchmark 3: ./target/release/factor 123456789012345678901234567890 (97-bit u128)
Time (mean ± σ): 36.3 ms ± 7.0 ms [User: 26.0 ms, System: 8.1 ms]
Output: 2 3 3 3 5 7 13 31 37 211 241 2161 3607 3803 2906161
Benchmark 4: ./target/release/factor 18446743979220271189 (64-bit semiprime)
Time (mean ± σ): 36.3 ms ± 8.1 ms [User: 25.9 ms, System: 8.1 ms]
Output: 4294967279 4294967291
All inputs processed in ~35-36ms consistently
@asder8215 👀
Can this close #1456 ?
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Wow, this looks awesome to see a highly optimized version of the factor command.
I won't have time to look at it today (most likely tomorrow afternoon), but this is good stuff right here
GNU testsuite comparison:
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
cool
a few things:
- some jobs are failing
- maybe add some benchmarks?!
- maybe it is why codspeed isn't showing improvements
- in comment #0 - i would prefer to have hyperfine output for benchmarks
CodSpeed Performance Report
Merging #9587 will not alter performance
Comparing naoNao89:perf/u128-performance-only (fe6b0e5) with main (415d01c)
Summary
✅ 126 untouched
🆕 12 new
⏩ 7 skipped[^skipped]
Benchmarks breakdown
| Benchmark | BASE |
HEAD |
Change | |
|---|---|---|---|---|
| 🆕 | factor_64bit_semiprime |
N/A | 176.7 µs | N/A |
| 🆕 | factorize_64bit_semiprime |
N/A | 20.7 µs | N/A |
| 🆕 | factor_medium_u64 |
N/A | 353.6 µs | N/A |
| 🆕 | factorize_96bit_composite |
N/A | 97.3 µs | N/A |
| 🆕 | factorize_large_prime |
N/A | 17.5 µs | N/A |
| 🆕 | factorize_small_u64 |
N/A | 105 µs | N/A |
| 🆕 | factorize_close_factors |
N/A | 10.9 µs | N/A |
| 🆕 | factor_batch_mixed |
N/A | 348.6 µs | N/A |
| 🆕 | factor_large_u64_prime |
N/A | 279.1 µs | N/A |
| 🆕 | factor_small_u64 |
N/A | 43.3 ms | N/A |
| 🆕 | factorize_32bit_semiprime |
N/A | 10.5 µs | N/A |
| 🆕 | factorize_120bit_mixed |
N/A | 20.7 ms | N/A |
| [^skipped]: 7 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. |
GNU testsuite comparison:
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Congrats! The gnu test tests/stty/stty-invalid is now passing!
Congrats! The gnu test tests/stty/stty-pairs is now passing!
Note: The gnu test tests/mv/i-3 was skipped on 'main' but is now failing.
Note: The gnu test tests/stty/stty was skipped on 'main' but is now failing.
Note: The gnu test tests/stty/stty-row-col was skipped on 'main' but is now failing.
GNU testsuite comparison:
Congrats! The gnu test tests/stty/stty-invalid is now passing!
Congrats! The gnu test tests/stty/stty-pairs is now passing!
Note: The gnu test tests/mv/i-3 was skipped on 'main' but is now failing.
Note: The gnu test tests/stty/stty was skipped on 'main' but is now failing.
Note: The gnu test tests/stty/stty-row-col was skipped on 'main' but is now failing.
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/b2sum is no longer failing!
GNU testsuite comparison:
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
next phase for 400bit
time target/release/factor
18042293517206497542309510624433271319838885652990797220840253733663197676853651276247377312
DEBUG: Attempting yamaquasi SIQS for 287-bit number
DEBUG: yamaquasi found factor: 73 bits
18042293517206497542309510624433271319838885652990797220840253733663197676853651276247377312: 2 2 2 2 2 11 293 7189204372506229734671 24333278690998283582226208220580048953269361177999035383847458677
target/release/factor 1.72s user 0.03s system 378% cpu 0.463 total