
Effectively-duplicate rule suppression

Open solardiz opened this issue 8 months ago • 1 comments

We already suppress literally duplicate wordlist rules (although there may be bugs with it, see #5011). However, we do not yet suppress rules that look different but are effectively duplicate (redundant).

https://github.com/mhasbini/duprule is one external project that does this, so we could consider their approach (vs. or as well as other ideas):

How does it work?

TL;DR: Each rule change is mapped, and a unique id is generated for each rule with functions count.

The mechanism is like this:

    A blank map is created with N (from 1 to 36) slots.
    Each rule change is applied to the map. For example, the rule 'u' changes the case of all characters from '?' (unknown) to 'u' (upper case), and 'sab' adds {'a' -> 'b'} to the map. The same logic applies to the other rules.
    An id is generated from the map.
    The ids are compared to detect duplicate rules.
    The rule with the fewest functions is selected.
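
To make the quoted steps concrete, here is a minimal sketch of the idea in Python. It is not duprule's actual code and not a proposal for our implementation. The names `parse`, `rule_id` and `dedupe` are hypothetical, the slot representation (source position plus a per-character output table) is my assumption, and only `:`, `l`, `u`, `r` and `sXY` are modeled, over ASCII letters and a small maximum length.

```python
import string

ALPHABET = string.ascii_letters  # assumption: only ASCII letters are modeled
SIMPLE = ":lur"                  # assumption: tiny subset of rule functions, plus sXY


def parse(rule):
    """Split a rule string into its functions (hypothetical helper)."""
    funcs, i = [], 0
    while i < len(rule):
        ch = rule[i]
        if ch == "s":
            if i + 3 > len(rule):
                raise ValueError(f"truncated sXY in {rule!r}")
            funcs.append(rule[i:i + 3])
            i += 3
        elif ch in SIMPLE:
            funcs.append(ch)
            i += 1
        else:
            raise ValueError(f"unsupported function {ch!r} in {rule!r}")
    return funcs


def rule_id(rule, max_len=4):
    """Apply the rule to a symbolic map for each length 1..max_len.

    Each slot is (source position, table), where table[k] is the output
    character produced when the input character is ALPHABET[k].
    """
    funcs = parse(rule)
    ids = []
    for n in range(1, max_len + 1):
        slots = [(i, ALPHABET) for i in range(n)]  # the blank map
        for f in funcs:
            if f == "l":
                slots = [(src, t.lower()) for src, t in slots]
            elif f == "u":
                slots = [(src, t.upper()) for src, t in slots]
            elif f == "r":
                slots.reverse()
            elif f[0] == "s":
                slots = [(src, t.replace(f[1], f[2])) for src, t in slots]
            # ':' is a no-op and leaves the map unchanged
        ids.append(tuple(slots))
    return tuple(ids)


def dedupe(rules, max_len=4):
    """Keep one rule per id, preferring the one with the fewest functions."""
    best = {}  # id -> best rule so far; dicts preserve first-seen order
    for rule in rules:
        key = rule_id(rule, max_len)
        if key not in best or len(parse(rule)) < len(parse(best[key])):
            best[key] = rule
    return list(best.values())


print(dedupe(["ll", "l", "sabsbc", "sacsbc", "rr", ":", "r"]))
# prints ['l', 'sabsbc', ':', 'r']
```

In this sketch `sabsbc` and `sacsbc` get the same id because both end up mapping `a` and `b` to `c`, and `rr` collapses to `:`. Equal ids only imply equivalence within the modeled alphabet and lengths, so a real implementation would need a wider model or a conservative fallback for functions it cannot represent.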

solardiz avatar Mar 16 '25 17:03 solardiz

The referenced duprule tool is now in Rust, but it was in Perl until commit:

commit 05e73e913a3899931c2e46ca2ec65201ebadbfec
Author: mhasbini <[email protected]>
Date:   Mon Sep 11 22:27:48 2017 +0300

    remove L, R, + & - rules; Rewrite in Rust

solardiz avatar Mar 18 '25 00:03 solardiz