Potential performance improvements for `findIterations()`
The current `findIterations` method is too slow for my use case. I intend to re-evaluate the number of iterations at random intervals to tailor the parameters to my hardware, and the current implementation is too inefficient and wastes a lot of resources when it runs multiple times during the application's lifetime.
I have written a custom implementation that takes advantage of the fact that the hashing time grows roughly linearly with the number of iterations.
Hashing time on my laptop for $$m = 8192$$ and $$p = 4$$ with an increasing number of iterations:
My custom `findIterations` method:
```java
private int measure(long start) {
    final long end = System.nanoTime();
    return (int) ((end - start) / 1_000_000);
}

protected int findIterations(Argon2Factory.Argon2Types type, int maxMillis, int m, int p) {
    final var argon = Argon2Factory.create(type);
    warmup(argon, "password".toCharArray());
    // first do a single iteration and see where we're at
    int iterations = 1;
    long start = System.nanoTime();
    argon.hash(iterations, m, p, "password".toCharArray());
    int took = measure(start);
    // if one iteration already takes more than maxMillis, use one iteration
    if (took > maxMillis) return 1;
    // if one iteration takes less than a third of maxMillis, bump iterations to 3 to get more accurate results
    if (took < (maxMillis / 3)) iterations = 3;
    // do five rounds of hashing with those iterations and measure the execution time
    int[] measurements = new int[5];
    for (int i = 0; i < measurements.length; i++) {
        start = System.nanoTime();
        argon.hash(iterations, m, p, "password".toCharArray());
        // divide the measurement by the number of iterations to approximate the time one iteration takes
        measurements[i] = measure(start) / iterations;
    }
    // get the average time one iteration took, clamped to at least 1 ms to avoid division by zero below
    var avg = Math.max(1.0, Arrays.stream(measurements).average().orElse(1.0));
    // extrapolate the number of iterations that stays within maxMillis, with a lower bound of 1 iteration
    return Math.max(1, (int) (maxMillis / avg));
}
```
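The extrapolation at the end can be sanity-checked in isolation. Below is a minimal, self-contained sketch of just that arithmetic; the Argon2 calls are replaced by a hypothetical per-iteration time, so nothing here is the library's API:

```java
public class IterationEstimate {
    // extrapolate the iteration count from an average per-iteration time,
    // using the same rounding and lower bound as the method above
    static int estimateIterations(int maxMillis, double avgMillisPerIteration) {
        // clamp the average to at least 1 ms to avoid division by zero
        double avg = Math.max(1.0, avgMillisPerIteration);
        return Math.max(1, (int) (maxMillis / avg));
    }

    public static void main(String[] args) {
        // hypothetical numbers: ~6 ms per iteration, 1000 ms budget -> 166 iterations
        System.out.println(estimateIterations(1000, 6.0));
        // one iteration already slower than the budget -> lower bound of 1
        System.out.println(estimateIterations(1000, 1500.0));
    }
}
```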
My custom implementation vastly outperforms the library method, and the resulting $$t$$ values are within a very similar range. Especially for large numbers of iterations, the method as provided in the library takes a long time. See the comparison below.
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 8192 kiB | 1 | $$t = 129$$ | 848 ms | 54997 ms |
| my method | 1000 | 8192 kiB | 1 | $$t = 166$$ | 1005 ms | 99 ms |
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 65536 kiB | 1 | $$t = 18$$ | 967 ms | 10751 ms |
| my method | 1000 | 65536 kiB | 1 | $$t = 18$$ | 1028 ms | 875 ms |
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 1048576 kiB | 1 | $$t = 1$$ | 1135 ms | 1131 ms |
| my method | 1000 | 1048576 kiB | 1 | $$t = 1$$ | 1136 ms | 1135 ms |
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 8192 kiB | 4 | $$t = 303$$ | 939 ms | 145418 ms |
| my method | 1000 | 8192 kiB | 4 | $$t = 333$$ | 1002 ms | 57 ms |
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 65536 kiB | 4 | $$t = 42$$ | 920 ms | 20730 ms |
| my method | 1000 | 65536 kiB | 4 | $$t = 47$$ | 977 ms | 351 ms |
| variant | max millis | memory | parallelism | result | hash time (avg of 5) | $$t$$ found in |
|---|---|---|---|---|---|---|
| library method | 1000 | 1048576 kiB | 4 | $$t = 2$$ | 872 ms | 2623 ms |
| my method | 1000 | 1048576 kiB | 4 | $$t = 2$$ | 791 ms | 2411 ms |
> [!NOTE]
> The second-to-last column shows the time a hash with the evaluated $$t$$ parameter takes, measured by hashing 5 times and taking the average. The last column shows how long it took to find the number of iterations.
Hello,
thanks for the report! I agree, the current implementation is very naive. It's not intended to be run all the time, just once on new hardware, with the iteration count then put in a config file somewhere.
But looking at your results, I think we can use your algorithm, as I don't see any downsides. Would you mind creating a PR?
Yeah, that was how I was running this until now: figuring out the iterations once and then keeping them fixed. However, I found that in some real-world scenarios this led to imperfect results, which is why I'm shifting to a more dynamic approach.
- Sometimes the method ran while the system was otherwise under high load (cron jobs, backups, other applications, ...), which led to weaker hashes. I'm now taking multiple samples at random times and using the mean value
- Some applications run for months at a time, and the circumstances on the system may change during that lifetime (more applications get installed on the same hardware, competing for resources); in some cases this led to bad UX because of long authentication times
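To illustrate the sampling idea from the two points above, here is a minimal, self-contained sketch. The actual timing source is stubbed out as a hypothetical supplier (in the real application each sample would be a fresh `findIterations` run triggered at a random point in time); only the averaging across samples is shown:

```java
import java.util.function.IntSupplier;

public class IterationSampler {
    // average a number of samples; each sample is one iteration-count estimate,
    // taken at a random time to smooth out temporary load spikes
    static double meanOfSamples(IntSupplier sampleSource, int samples) {
        long sum = 0;
        for (int i = 0; i < samples; i++) {
            sum += sampleSource.getAsInt();
        }
        return (double) sum / samples;
    }

    public static void main(String[] args) {
        // hypothetical results of five runs at random times:
        // one run happened under load and produced a lower count
        int[] sampled = {166, 160, 120, 170, 164};
        int[] idx = {0};
        double mean = meanOfSamples(() -> sampled[idx[0]++], sampled.length);
        System.out.println(Math.round(mean)); // prints 156
    }
}
```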
Sure, my method above has a few quirks; I can patch them up and provide a PR sometime :+1: