Issue with Length data when training
It appears that the length data saved in LN.level has an off-by-one error when training: passwords of length 4 are being saved as length 5, passwords of length 5 as length 6, etc., with junk data being saved for length 4 (using ngrams = 4).
For example, consider the following training set. Note that this one does not produce "junk" data for length 4, but I've seen that appear with larger training sets like the RockYou list:
test test1 test1 test12 test12 test12 test123 test123 test123 test123
So there is 1 password of length 4, 2 of length 5, 3 of length 6, and 4 of length 7. I used the following command for training:
./createNG -F -v -n 4 --iPwdList test.txt
The following is my LN.count file: ... 0 1 0 2 0 3 0 4 1 5 2 6 3 7 4 8 0 9 0 10 0 11 ....
*Smacks head* Looks like the value stored under 4 is actually the count for length 3 passwords, so every count is stored one entry higher than the true length. For example, if I modify the training set to:
tes test test1 test1 test12 test12 test12 test123 test123 test123 test123
I get the following in LN.count:
0 1 0 2 0 3 1 4 1 5 2 6 3 7 4 8 0 9
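To double-check the offset, here is a rough Python sketch that counts the true password lengths in the modified training set and compares them to the LN.count output above. It assumes LN.count reads as "count label" pairs, which is only my reading of the output shown here and not a documented format. The helper names (length_histogram, parse_pairs) are made up for this post.

```python
from collections import Counter


def length_histogram(passwords, max_len=9):
    """Return (count, length) pairs for lengths 1..max_len."""
    counts = Counter(len(p) for p in passwords)
    return [(counts.get(n, 0), n) for n in range(1, max_len + 1)]


def parse_pairs(text):
    """Parse a whitespace-separated run of 'count label' pairs (assumed layout)."""
    nums = [int(tok) for tok in text.split()]
    return list(zip(nums[0::2], nums[1::2]))


# Modified training set from this post, one password per token.
training = ("tes test test1 test1 test12 test12 test12 "
            "test123 test123 test123 test123").split()

# LN.count output copied from above.
observed = parse_pairs("0 1 0 2 0 3 1 4 1 5 2 6 3 7 4 8 0 9")
expected = length_histogram(training)

# Move every observed label down by one and compare with the true lengths.
shifted = {label - 1: count for count, label in observed}
for count, length in expected:
    print(length, count, shifted.get(length))
```

For every length that has a shifted entry, the shifted count matches the true count, which is consistent with each count being stored under (true length + 1).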