incorrect ETA after early skipped rules

Open solardiz opened this issue 3 years ago • 3 comments

Running with --rules-skip-nop makes john skip the : rule, which is commonly the first in a rule set. This results in a slightly incorrect progress percentage (it starts at 0.07% with our new default rules) and in an extremely incorrect ETA early on (it says the job will complete almost instantly, then in days, then in months, when in reality it might not complete in years). Apparently, john thinks it can process a rule in almost zero time, so even once it has processed another rule (which can sometimes take a very long time), the speed estimate for ETA is still off by a factor of two.
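To illustrate the factor of two, here is a toy model of an ETA that extrapolates from the average time per rule processed so far. This is not john's actual status code; the function name `naive_eta` and all the numbers (1000 rules, one hour per real rule, a millisecond for the skipped rule) are assumptions chosen only to make the effect visible.

```python
def naive_eta(rules_done, total_rules, elapsed_seconds):
    """Estimate remaining seconds from the average time per rule so far.

    Hypothetical helper for illustration; not taken from john's source.
    """
    if rules_done == 0:
        return float("inf")  # nothing to extrapolate from yet
    per_rule = elapsed_seconds / rules_done
    return per_rule * (total_rules - rules_done)


# Assumed toy numbers: 1000 rules, each real rule costs one hour,
# and the skipped ':' rule "completes" in a millisecond.
TOTAL_RULES = 1000
RULE_COST = 3600.0
SKIP_COST = 0.001

# Right after the first rule is skipped: the job looks almost finished.
eta_after_skip = naive_eta(1, TOTAL_RULES, SKIP_COST)

# After one real rule: two rules are counted as done, but only one cost time.
eta_after_second = naive_eta(2, TOTAL_RULES, SKIP_COST + RULE_COST)

# What is actually left if every remaining rule costs RULE_COST.
true_remaining = RULE_COST * (TOTAL_RULES - 2)

print(f"ETA after skipped rule: {eta_after_skip:.3f} s")
print(f"ETA after second rule:  {eta_after_second / 86400:.1f} days")
print(f"True remaining:         {true_remaining / 86400:.1f} days")
```

In this model the skipped rule halves the measured per-rule cost once a single real rule has run, and the error only shrinks as more real rules dilute the average.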

solardiz • Feb 04 '22 20:02

I guess a similar problem would be seen if the first rule is (almost) skipped for any other reason - e.g., if it rejects most words quickly, then the second rule preserves them and the hash is slow. So not sure if we should treat --rules-skip-nop specially, but we may.

solardiz • Feb 04 '22 20:02

So any rule-reject flags (that do reject) skew the statistics similarly (and ones that also have lots of pp expansion do it a lot).

In some cases (first phase of loopback mode, and PRINCE mode), the rules are pre-processed into a list, so the number of rejected rules is known beforehand. This issue doesn't seem to exist in those cases.

For the other cases, perhaps we could "trim" the rules count as we go? Alternatively, we could do that pre-process in all cases, but it's suboptimal for low memory and huge rulesets... unless we do something similar but never put them in a list.
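A minimal sketch of what "trimming" the rules count as we go could look like, assuming progress is tracked as accepted rules done over total rules. The class `RuleProgress` and its methods are hypothetical names for illustration and do not correspond to anything in john's source.

```python
class RuleProgress:
    """Toy progress tracker that trims rejected rules out of the total.

    Hypothetical illustration only; not john's real bookkeeping.
    """

    def __init__(self, total_rules):
        self.total = total_rules  # starts as the raw rule count
        self.done = 0             # accepted rules fully processed

    def rejected(self):
        # A rejected rule never does any work, so remove it from the
        # denominator instead of counting it as completed.
        self.total -= 1

    def accepted(self):
        self.done += 1

    def fraction(self):
        return self.done / self.total if self.total else 1.0


progress = RuleProgress(4)
progress.rejected()               # e.g. ':' skipped by --rules-skip-nop
print(progress.fraction())        # still 0.0: no real work done yet
progress.accepted()               # first real rule finished
print(progress.fraction())        # one of three remaining rules done
```

With this bookkeeping a skipped rule no longer registers as instant progress, so an ETA derived from the fraction has no near-zero sample to extrapolate from. The total is still too high until later rejections are seen, so the estimate only converges as the run proceeds.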

BTW then there's the "pipe mode" situation, which is a combo of the above: We run a batch of words through all rules, then reset to first rule. Oh wait, we don't have any ETA then anyway.

magnumripper • Feb 07 '22 13:02

I confirmed that --rules-stack works fine: It counts the accepted rules beforehand.

It's very trivial to have rules_check() and/or rules_count() return the number of accepted rules even when not pre-loading rules to a list, but then we can't say things like Rule #2: '-s x**' rejected... or maybe we can, but keep saying #2 until we get an accepted one:

0:00:00:00 - Rule #1: ':' accepted as ''
0:00:00:00 - Rule #2: '-s x**' rejected
0:00:00:00 - Rule #2: '<* $1' accepted as '<*$1'

or, for --rules-skip-nop:

0:00:00:00 - Rule #1: ':' rejected
0:00:00:00 - Rule #1: '-s x**' rejected
0:00:00:00 - Rule #1: '<* $1' accepted as '<*$1'
0:00:00:00 - Rule #2: '-c (?a c Q' accepted as '(?acQ'

Or we can leave the number out for "rejected":

0:00:00:00 - Rule #1: ':' accepted as ''
0:00:00:00 - Rule '-s x**' rejected
0:00:00:00 - Rule #2: '<* $1' accepted as '<*$1'

We might break session resume if we're not careful here though.
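The last variant (no number on rejected rules) can be sketched as follows. This is a toy formatter under assumptions: the function `format_rule_log` is a hypothetical name, the input is a pre-made list of (rule, result) pairs rather than anything returned by rules_check(), and the timestamp prefix is omitted.

```python
def format_rule_log(rules):
    """Build log lines where only accepted rules consume a rule number.

    `rules` is a list of (rule_text, accepted_as) pairs, with accepted_as
    set to None for a rejected rule. Hypothetical helper for illustration.
    """
    lines = []
    number = 0  # counts accepted rules only
    for rule_text, accepted_as in rules:
        if accepted_as is None:
            lines.append(f"Rule '{rule_text}' rejected")
        else:
            number += 1
            lines.append(
                f"Rule #{number}: '{rule_text}' accepted as '{accepted_as}'"
            )
    return lines


example = [
    (":", ""),
    ("-s x**", None),
    ("<* $1", "<*$1"),
]
for line in format_rule_log(example):
    print(line)
```

Because the counter then tracks accepted rules only, any saved rule number would have to be interpreted the same way on restore, which is where the session resume concern comes in.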

magnumripper • Apr 08 '22 14:04