Smart wordlist+rules mode
We currently support words-first (with --rules), rules-first (with --rules-stack), or a mix of the two with two rule sets, one applied on top of the other. That's kind of up to 3 nested loops. Simple to understand, but not optimal.
We could (in a new mode?) be simultaneously changing the number of words and rules in use. As I wrote in #4952, "A reason to support multiple wordlists there would be to run with more rules against a tiny wordlist, then fewer rules against a medium-sized wordlist, and the fewest against the largest." However, we could have an automated version of the same built into a cracking mode. It'd support finer granularity (down to 1 rule and maybe even 1 word), and its ordering would likely end up more optimal than what we'd do manually. It also would not terminate (or could be continued) while there are still more things to try (progressively less likely combinations of words and rules - theoretically eventually completing the largest wordlist with the largest rule set).
This is loosely similar to how our incremental mode changes the password length and character count (and other parameters, but to illustrate the similarity two are sufficient), but with word count and rule count. This does mean that either one of these two counts can temporarily decrease if the other increases. We then need logic, similar to what incremental mode has, to search just the portions not overlapping with what was already searched (so e.g., if rule 10 was already applied to words 1 to 100, but is later revisited, we should restart it at word 101, yet e.g. rule 11 could restart at word 51 if it had only been applied to 1 to 50 before).
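To illustrate the non-overlap bookkeeping described above, here is a minimal Python sketch. It assumes the word and rule orderings are fixed and that each rule is always applied to a prefix of the wordlist, so one "words done" counter per rule is enough state. The name `pending_spans` and the dict layout are hypothetical, not anything in the current code.

```python
def pending_spans(done, rule_count, word_count):
    """Return the (rule, first_word, last_word) spans still to be searched.

    done maps a 1-based rule number to how many leading words that rule
    has already been applied to (hypothetical bookkeeping for this sketch).
    Spans are 1-based and inclusive; done is updated in place.
    """
    spans = []
    for rule in range(1, rule_count + 1):
        covered = done.get(rule, 0)
        if covered < word_count:
            # Only the part of the wordlist this rule has not seen yet
            spans.append((rule, covered + 1, word_count))
            done[rule] = word_count
    return spans


# Rules 1..10 were already applied to words 1..100, rule 11 to words 1..50
done = {rule: 100 for rule in range(1, 11)}
done[11] = 50
spans = pending_spans(done, rule_count=11, word_count=200)
# Rule 10 resumes at word 101, rule 11 at word 51
```

A later visit with a smaller word count (because the rule count went up) simply yields nothing for rules that are already past that point.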
The {word count, rule count} combinations could be sorted by increasing effort per combination (excluding effort that would overlap with what was done before, so with hopefully-converging re-sorting similar to what we have in charset.c), or by decreasing estimated successful guess rate per candidate. We have the latter in incremental mode, where charset.c has access to statistics from john.pot. Could we also have a wordlist+rules ordering optimizer somehow enabled with suitable statistics, considering that john.pot alone would not suffice in this case?
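The first option (increasing effort, not counting overlap) could be sketched as a greedy re-sort like the one below. This is only an illustration: it is not a port of what charset.c does, it assumes a small fixed grid of candidate counts, and it measures effort simply as the number of new word and rule pairs. The function names are hypothetical.

```python
def new_effort(done, word_count, rule_count):
    """Count word/rule pairs in this combination that were not tried yet.

    done maps a 1-based rule number to the number of leading words
    already covered by that rule.
    """
    return sum(max(0, word_count - done.get(rule, 0))
               for rule in range(1, rule_count + 1))


def order_by_new_effort(word_counts, rule_counts):
    """Greedily order {word count, rule count} combinations.

    After every pick, the remaining combinations are re-costed, since
    part of their work may now overlap with what was already scheduled.
    Returns a list of ((word_count, rule_count), new_effort) tuples.
    """
    done = {}
    remaining = {(w, r) for w in word_counts for r in rule_counts}
    order = []
    while remaining:
        # Cheapest remaining combination; ties broken by the tuple itself
        best = min(remaining, key=lambda c: (new_effort(done, *c), c))
        cost = new_effort(done, *best)
        remaining.discard(best)
        if cost == 0:
            continue  # fully covered by earlier combinations
        order.append((best, cost))
        words, rules = best
        for rule in range(1, rules + 1):
            done[rule] = max(done.get(rule, 0), words)
    return order


schedule = order_by_new_effort(word_counts=[10, 100], rule_counts=[1, 5])
```

With these inputs the schedule visits (10, 1), (10, 5), (100, 1), then (100, 5). The new-effort figures add up to 100 * 5, so nothing is searched twice. The second option would replace `new_effort` with an estimated guess rate, which needs statistics this sketch does not have.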