Password length limits: reporting, documentation, potential increase
We use a minimum password length of 18, and many of our passwords are 32 characters or more. Please either update the code to allow for passwords of up to 64 characters, or post instructions on how I can update the source on Windows to support up to 64.
Maximum password lengths vary by JtR format and cracking mode. Currently, lengths up to 125 are supported. We might want to add documentation on the various length limits (right now, there's an option to print them out, but it's rather obscure to most users) and on whether/how they can be avoided in specific cases (e.g., the md5crypt format has a limit of 15, but md5crypt-long goes all the way up to 125 - we don't have this documented). We might want to keep this issue open as a documentation request, or just close it as an end-user support request (which is off-topic given our use of GitHub issues). As to "update code to allow for" longer passwords: with our hundreds of supported formats and more to add, this is a never-ending task - there's always that one new speed-optimized format whose limit is too low for some user - so there's no point in keeping a generic issue like that open in here.
@SquirrelAssassin For user support, please use the john-users mailing list, not GitHub issues. You'd want to post to that list information on what JtR formats and cracking modes you use and what length limit you think you bump into. https://www.openwall.com/lists/john-users/
Based on the issue @SquirrelAssassin opened with hashcat as well, mentioning length 27 there, this probably refers to NTLM hashes. As far as I'm aware, hashcat currently supports up to 256 by default (and incurs a performance penalty for that), unless the "-O" option is used. I doubt we care to support longer than 27 for NTLM enough to bother implementing that soon, if at all - that's not a limit as low as md5crypt's 15 was.
Yeah, 15 is a bit short, while 27 is plenty. The biggest threat against length 15 with md5crypt is UTF-8: even a fairly short password can become way longer than 15 bytes.
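To illustrate the point (a quick Python sketch, not JtR code; the example password is made up), a password of only 10 characters can already blow past md5crypt's 15-byte limit once encoded as UTF-8:

```python
# A 10-character password whose UTF-8 encoding exceeds md5crypt's
# 15-byte limit: each Cyrillic letter takes 2 bytes in UTF-8.
password = "пароль2024"  # hypothetical example password

print(len(password))                  # 10 characters
print(len(password.encode("utf-8")))  # 16 bytes - over the 15-byte limit
```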
I am puzzled by these printing 81:
$ ./john --list=format-details --format=nt
NT 81 12 96 0002000f 43 MD4 128/128 AVX 4x3 0x107 16 0 0 b7e4b9022cd45f275334bbdb83bb5be5
$ ./john --list=format-details --format=nt-opencl
NT-opencl 81 1 1 0042002f 44 MD4 OpenCL 0x107 16 0 0 8846f7eaee8fb117ad06bdd830b7586c
whereas --list=format-all-details only includes:
Max. password length 27
[...]
Converts internally to UTF-16/UCS-2 yes
and no mention of 81 anywhere. I think we used to print a range of values for single-byte vs. multi-byte chars? Wasn't that more correct and more useful than the current output?
We actually do "print a range of values for single-byte vs. multi-byte chars" e.g. for md5crypt, but apparently not for NTLM, perhaps precisely because it "Converts internally to UTF-16/UCS-2". So the "27" is probably correct, for any chars, but the "81" is probably wrong (and should be fixed as a bug). Even if it were "bytes" rather than "chars", our NTLM code isn't capable of 81 since NTLM works with 2-byte chars max (not 3-byte). Right?
Yep, moved over to hashcat.
@SquirrelAssassin It isn't up to you to decide when to close the issue - we're now using it to discuss what we can do better on this topic in general, and I've just pointed out what I think is a (maximum length reporting) bug for us to fix. Providing user support on your one specific request would have been off-topic for our use of GitHub issues anyway, so the fact that you no longer need support on this is irrelevant to this issue's status.
Yes sir!
If I opened a rabbit hole, y'all can go here to explain the 81. Makes sense once you read it.
https://www.notsosecure.com/maximum-password-length-reached/
Thanks. Good information at that link, but it doesn't really explain the 81 beyond the obvious, and doesn't convince me it's not a bug for us to fix. I think the correct range of maximums for NTLM is 27 to 54 if we talk bytes, and just 27 if we talk chars. We should probably report 27 in place of 81.
Even if it were "bytes" rather than "chars", our NTLM code isn't capable of 81 since NTLM works with 2-byte chars max (not 3-byte). Right?
Not sure what you mean. 3-byte UTF-8 covers UTF-16 unless my memory fails me.
3-byte UTF-8 covers UTF-16 unless my memory fails me.
Hmm, you're right. So we're in fact able to process a string of up to 81 bytes if it consists solely of 3-byte UTF-8 characters. The question then is whether this is what we want to report there. I find it weird that the brief output says 81, but the detailed output doesn't say 81 anywhere. If we want to report this, then perhaps it should be in detailed output (as well) and with some reasonable wording.
I think we report it very thoroughly in the log file (Hash type: sha512crypt-opencl, crypt(3) $6$ (min-len 0, max-len 7 [worst case UTF-8] to 23 [ASCII])), so maybe similar code should be used for --list=format*details.
Hmmm, according to that article, --list=format-all-details should list it just like the log file already does. So what was the problem here? The shorter --list=format-details? We can't put anything but a single number in that field.
@solardiz were you using the same encodings in your examples of --list=format-all-details and --list=format-details? If so, we have a bug. It should just print 27 unless UTF-8 is actually the selected input encoding, in which case it should give the long silly story with "worst case" and so on.
On another note, if we're actually talking worst-case UTF-8, the max-len 7 [worst case UTF-8] to 23 [ASCII] for sha512crypt isn't correct. Worst case is really 4 bytes per character, so 23 bytes give us just 5 characters (using only 20 of the bytes, because the remaining three don't cut it). But to reach that level of worst case, the user would have had to pick a password containing, like, fictional languages invented by the mighty J. R. R. Tolkien or something equally remote.
For Microsoft formats, it gets trickier. The UTF-16 (as opposed to UTF-32) limit is 27. If we use 4x8-bit UTF-8 characters, each will result in 2x16-bit UTF-16 code units, using surrogates. So we can do at most 13 of them (with one code unit remaining as loss), using 4x13 (52) bytes - or we can do 27 three-byte ones using up to 3x27 (81) bytes. But as far as I know, we do report 81 bytes / 27 characters (and BTW the surrogate support is even configurable at build time for the faster formats like NT... this is really complicated stuff!).
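The arithmetic above can be checked with a small sketch (Python, not JtR code; '€' and the emoji are just stand-ins for 3- and 4-byte UTF-8 characters):

```python
# NT hashes UTF-16; the limit discussed here is 27 UTF-16 code units.
# A 3-byte UTF-8 char (BMP) costs 1 code unit: 27 chars = 81 UTF-8 bytes.
# A 4-byte UTF-8 char (outside the BMP) costs a surrogate pair (2 units):
# only 13 fit, for 52 UTF-8 bytes, with one code unit left over as loss.
def utf16_units(s):
    return len(s.encode("utf-16-le")) // 2

bmp = "\u20ac" * 27         # '€': 3 bytes in UTF-8, 1 UTF-16 code unit
astral = "\U0001f600" * 13  # an emoji: 4 UTF-8 bytes, a surrogate pair

print(len(bmp.encode("utf-8")), utf16_units(bmp))        # 81 27
print(len(astral.encode("utf-8")), utf16_units(astral))  # 52 26
```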
Now possibly you may start to understand why Jumbo isn't super clear about this in all situations: It simply is a heck of a complicated matter!
To print really, really correct messages, we'd need to drop the FMT_UNICODE format flag and replace it with three: FMT_UCS2, FMT_UTF16 and FMT_UTF32. We should actually do this, sooner or later. UCS-2 is basically UTF-16 without support for surrogates, BTW (you can't write a Tolkien-invented language in UCS-2).
Last time I checked (years ago, methinks) no emoticon was using 4-byte UTF-8. I think it's reasonable to assume a user could pick emoticons in a (non-login) password, but not likely characters from a fictional language. In fact, for the life of me I can't understand why fictional languages even appear in Unicode at all. Perhaps they had a Star Trek fan in that working party.
But to reach that level of worst case
So now we have levels of worst case? Perhaps we should add a Jumbo option --worst-case-level=N.
were you using the same encodings in your examples of --list=format-all-details and --list=format-details?
I think so - I didn't specify any encoding with either command.
We really need some user-friendly reporting of the length limits affecting actual cracking, because it is hard to realize that john cannot crack something due to its limits. It is even harder to realize the problem when some hashes in a set get cracked and only some fail.
If something does not get cracked by an attack, the user will think that the attack is not good and does not contain the password. The user will then move on to other attacks and waste time. (The same applies to other uncrackable hashes: a bug in a format, a choice of an inappropriate format with informal limits (e.g. dynamic_1551 vs. dynamic_1552), having a set of hashes with a mix of indistinguishable hash types (e.g. bare raw-md5 vs. iterated variants), or, to some extent, sporadic misses on ztex boards.)
The length limit hit really hard with the test hashes (nt) for CMIYC 2021. We had 4 active crackers working together then. All of us had non-zero experience and were aware of length limits, but we had not been cracking often enough to keep in mind the practical consequences of the limits. We cracked some of the nt hashes, then moved on to passphrases for the rest. We got the correct wordlist quickly, but it did not give the password. We did not have an idea, or even a guess, that the length limit was preventing us from getting the crack. So we started to get additional wordlists and create more sophisticated attacks. We spent a whole week this way. Then one of us tried the same first wordlist again with hashcat without -O, and we got the crack. So we switched to dynamic=md4(utf16($p)). (hashcat was used from the start, but -O was always in place.)
(Off-topic consequence: bugs resulting in uncrackable hashes are serious problems, causing wasted time and cycles.)
I guess it might be helpful to show number/percent of candidates rejected by length in status line.
A user with some assumptions might try to use the --max-length= option to increase the limit (based on a real story). There is a check for that in john, but for nt, 81 is used for UTF-8. Maybe a warning should be emitted when the value for --max-length= is above the worst-case limit.
$ echo '$NT$7bb6ff238316dcaa5adc80c9c4561114' > nt.pw
$ # á x 32
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --encoding=cp1252 --max-length=44
Can't set max length larger than 27 for NT format
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --max-length=44
[...]
0g 0:00:00:00 0g/s 0p/s 0c/s 0C/s
Session completed.
$ # á x 32, got truncated to 27
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin
[...]
ááááááááááááááááááááááááááá (?)
1g 0:00:00:00 5.882g/s 5.882p/s 5.882c/s 5.882C/s ááááááááááááááááááááááááááá
Session completed.
$ ./john/run/john --format=nt --list=format-all-details
[...]
Truncates at max. length no
[...]
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --max-length=82
Can't set max length larger than 81 for NT format
There is a bug with truncation vs. the reported info.
I guess it might be helpful to show number/percent of candidates rejected by length in status line.
The same might be needed for truncated candidates.
I extracted the problem with truncation in NT as #5144.
I guess it might be helpful to show number/percent of candidates rejected by length in status line.
NT truncates some candidates inside the format, so it might be a problem to implement such stats.