john Password length limits: reporting, documentation, potential increase

We use a minimum password length of 18; many have 32 or more. Please either update the code to allow passwords of up to 64 characters, or post how I can update the source for Windows to use up to 64.

Jun 18 '19 14:06 SquirrelAssassin

Maximum password lengths vary by JtR format and cracking mode. Currently supported are lengths up to 125. We might want to add documentation on the various length limits (right now, there's an option to print them out, but it's rather obscure to most users) and whether/how they may be avoided in specific cases (e.g., md5crypt format has a limit of 15, but md5crypt-long goes all the way up to 125, but we don't have this documented). We might want to keep this issue open as a documentation request, or just close it as end-user support request (which is off-topic given our use of GitHub issues). As to "update code to allow for" longer passwords, with our hundreds of supported formats and more to add this is a never-ending task - there's always that one new speed-optimized format still having a limit that's too low for some user - so there's no point in keeping a generic issue like that open in here.

@SquirrelAssassin For user support, please use the john-users mailing list, not GitHub issues. You'd want to post to that list information on what JtR formats and cracking modes you use and what length limit you think you bump into. https://www.openwall.com/lists/john-users/

Jun 18 '19 14:06 solardiz

Based on the issue @SquirrelAssassin opened with hashcat as well, mentioning length 27 there, this probably refers to NTLM hashes. As far as I'm aware, hashcat currently supports up to 256 by default (and incurs a performance penalty for that), unless the "-O" option is used. I doubt we care to support longer than 27 for NTLM enough to bother implementing that soon, if at all - that's not a limit as low as md5crypt's 15 was.

Jun 18 '19 15:06 solardiz

Yeah 15 is a bit short while 27 is plenty. The biggest threat against length 15 with md5crypt is UTF-8 - even a fairly short password can become way longer than 15 bytes.
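
To illustrate the UTF-8 point, here is a minimal sketch (not JtR code) comparing character count with UTF-8 byte count. The constant name and the sample passwords are made up for illustration; the 15-byte figure is the md5crypt limit quoted above.

```python
# Illustration only: a limit expressed in bytes is reached much sooner
# when the password contains non-ASCII characters.
MD5CRYPT_MAX_BYTES = 15  # limit quoted in this thread for the md5crypt format


def utf8_len(password: str) -> int:
    """Number of bytes the password occupies when encoded as UTF-8."""
    return len(password.encode("utf-8"))


if __name__ == "__main__":
    for pw in ["password123", "пароль1234", "密码密码密码"]:
        n = utf8_len(pw)
        verdict = "fits" if n <= MD5CRYPT_MAX_BYTES else "too long"
        print(f"{pw!r}: {len(pw)} chars, {n} bytes -> {verdict}")
```

The Cyrillic sample is only 10 characters and the Chinese one only 6, yet both exceed 15 bytes.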

Jun 19 '19 01:06 magnumripper

I am puzzled by these printing 81:

$ ./john --list=format-details --format=nt
NT	81	12	96	0002000f	43	MD4 128/128 AVX 4x3			0x107	16	0		0	b7e4b9022cd45f275334bbdb83bb5be5
$ ./john --list=format-details --format=nt-opencl
NT-opencl	81	1	1	0042002f	44	MD4 OpenCL			0x107	16	0		0	8846f7eaee8fb117ad06bdd830b7586c

whereas --list=format-all-details only includes:

Max. password length                 27
[...]
 Converts internally to UTF-16/UCS-2 yes

and no mention of 81 anywhere. I think we used to print a range of values for single-byte vs. multi-byte chars? Wasn't that more correct and more useful than the current output?

Jun 19 '19 09:06 solardiz

We actually do "print a range of values for single-byte vs. multi-byte chars" e.g. for md5crypt, but apparently not for NTLM, perhaps precisely because it "Converts internally to UTF-16/UCS-2". So the "27" is probably correct, for any chars, but the "81" is probably wrong (and should be fixed as a bug). Even if it were "bytes" rather than "chars", our NTLM code isn't capable of 81 since NTLM works with 2-byte chars max (not 3-byte). Right?

Jun 19 '19 09:06 solardiz

Yep moved over to hash.

Jun 19 '19 12:06 SquirrelAssassin

@SquirrelAssassin It isn't up to you to decide when to close the issue - we're now using it to discuss what we can do better on this topic in general, and I've just pointed out what I think is a (maximum length reporting) bug for us to fix. Providing user support on your one specific request would have been off-topic for our use of GitHub issues anyway, so the fact that you no longer need support on this is irrelevant to this issue's status.

Jun 19 '19 12:06 solardiz

Yes sir!

Jun 19 '19 12:06 SquirrelAssassin

If I opened a rabbit hole y’all can go here to explain the 81. Makes sense once ya read it.

https://www.notsosecure.com/maximum-password-length-reached/

Jun 19 '19 12:06 SquirrelAssassin

Thanks. Good information at that link, but it doesn't really explain the 81 beyond the obvious, and doesn't convince me it's not a bug for us to fix. I think the correct range of maximums for NTLM is 27 to 54 if we talk bytes, and just 27 if we talk chars. We should probably report 27 in place of 81.

Jun 19 '19 13:06 solardiz

Even if it were "bytes" rather than "chars", our NTLM code isn't capable of 81 since NTLM works with 2-byte chars max (not 3-byte). Right?

Not sure what you mean. 3-byte UTF-8 covers UTF-16 unless my memory fails me.

Jun 20 '19 01:06 magnumripper

3-byte UTF-8 covers UTF-16 unless my memory fails me.

Hmm, you're right. So we're in fact able to process a string of up to 81 bytes if it consists solely of 3-byte UTF-8 characters. The question then is whether this is what we want to report there. I find it weird that the brief output says 81, but the detailed output doesn't say 81 anywhere. If we want to report this, then perhaps it should be in detailed output (as well) and with some reasonable wording.
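
A quick way to see where 27, 54 and 81 come from is to measure the same 27-character string in both encodings. This is a sketch for illustration, not JtR's implementation; the helper name is hypothetical.

```python
# Illustration only: 27 characters always occupy 27 UTF-16 code units here,
# but 27, 54 or 81 bytes of UTF-8 depending on the character used.
def lengths(s: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 code unit count) for s."""
    utf8_bytes = len(s.encode("utf-8"))
    utf16_units = len(s.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


if __name__ == "__main__":
    samples = {
        "ASCII 'a'": "a",
        "2-byte U+00E1": "\u00e1",
        "3-byte U+20AC": "\u20ac",
    }
    for label, ch in samples.items():
        print(label, lengths(ch * 27))
```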

Jun 20 '19 09:06 solardiz

I think we report it very thoroughly in the log file (Hash type: sha512crypt-opencl, crypt(3) $6$ (min-len 0, max-len 7 [worst case UTF-8] to 23 [ASCII])) so maybe similar code should be used for --list=format*details.

Jun 20 '19 22:06 magnumripper

Hmmm according to that article, --list=format-all-details should list it just like the log file already. So what was the problem here? The shorter --list=format-details? We can't put anything but a single number in that field.

Jun 20 '19 22:06 magnumripper

@solardiz were you using the same encodings in your examples of --list=format-all-details and --list=format-details? If so, we have a bug. It should just print 27 unless UTF-8 is actually the selected input encoding, in which case it should give the long silly story with "worst case" and so on.

Jun 20 '19 22:06 magnumripper

On another note, if we're actually talking worst-case UTF-8, the max-len 7 [worst case UTF-8] to 23 [ASCII] for sha512crypt isn't correct. Worst case is really 4 bytes per character, so 23 bytes give us (using just 20 bytes because the remaining three don't cut it) just 5 characters. But to reach that level of worst case, the user would have to pick a password containing, like, fictional languages invented by the mighty J. R. Tolkien or something equally remote.
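
The arithmetic behind the 7 and the 5 is plain integer division of the byte limit by the bytes needed per character. A minimal sketch, with a hypothetical helper name and the 23-byte figure taken from the log line quoted earlier:

```python
# Illustration only: how many whole characters fit in a byte budget when
# every character needs the same number of UTF-8 bytes.
def max_chars(byte_limit: int, bytes_per_char: int) -> int:
    """Whole characters that fit; leftover bytes are simply unused."""
    return byte_limit // bytes_per_char


if __name__ == "__main__":
    limit = 23  # byte limit quoted for sha512crypt-opencl in this thread
    for width in (1, 2, 3, 4):
        print(f"{width} byte(s) per char: {max_chars(limit, width)} chars")
```

With 3 bytes per character this gives the 7 that the log reports, and with 4 bytes per character it gives 5, with 3 bytes left over.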

For Microsoft formats, it gets trickier. The UTF-16 (as opposed to UTF-32) limit is 27. If we use 4x8-bit UTF-8, they will result in 2x16-bit UTF-16 each, using surrogates. So we can do at most 13 of them (with one remaining as loss), using 4x13 (52) bytes - or we can do 27 of them using up to 3x27 (81) bytes. But as far as I know, we do report 81 bytes / 27 characters (and BTW the surrogate support is even configurable at build time for the faster formats like NT... this is really complicated stuff!).
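
The 13-versus-27 figures can be checked with a short sketch that counts UTF-16 code units the way Python's own codec produces them. This is not JtR's code; the function and constant names are hypothetical, and 27 is the NT limit quoted in this thread.

```python
# Illustration only: how many copies of one character fit into a limit of
# 27 UTF-16 code units, and how many UTF-8 bytes those copies occupy.
NT_MAX_UTF16_UNITS = 27  # NT limit quoted in this thread


def utf16_units(s: str) -> int:
    """UTF-16 code units needed for s (2 per character above U+FFFF)."""
    return len(s.encode("utf-16-le")) // 2


def fit(ch: str, limit: int = NT_MAX_UTF16_UNITS) -> tuple[int, int]:
    """Return (copies of ch that fit in limit, their UTF-8 byte count)."""
    count = limit // utf16_units(ch)
    return count, len((ch * count).encode("utf-8"))


if __name__ == "__main__":
    print("U+20AC  (3-byte UTF-8):", fit("\u20ac"))
    print("U+1F600 (4-byte UTF-8):", fit("\U0001F600"))
```

A character above U+FFFF needs a surrogate pair, so only 13 of them fit and one code unit is left unused.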

Now possibly you may start to understand why Jumbo isn't super clear about this in all situations: It simply is a heck of a complicated matter!

To print really really correct messages, we'd need to drop the FMT_UNICODE format flag and replace it with three: FMT_UCS2, FMT_UTF16 and FMT_UTF32. We should actually do this, sooner or later. UCS-2 is basically UTF-16 without support for surrogates BTW (you can't write a Tolkien-invented language in UCS-2).
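
The UCS-2 versus UTF-16 distinction boils down to whether any character lies above U+FFFF. A minimal sketch of that check follows; the function name is hypothetical and is not related to the proposed FMT_* flags.

```python
# Illustration only: UCS-2 has no surrogate pairs, so it can only represent
# code points up to U+FFFF. Anything above that needs UTF-16 or UTF-32.
def fits_in_ucs2(password: str) -> bool:
    """True if every character is a single 16-bit unit (no surrogates needed)."""
    return all(ord(ch) <= 0xFFFF for ch in password)


if __name__ == "__main__":
    for pw in ["abc\u20ac", "abc\U0001F600"]:
        print(repr(pw), "fits in UCS-2:", fits_in_ucs2(pw))
```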

Last time I checked (years ago methinks) no emoticon was using 4-byte UTF-8. I think it's reasonable to assume a user could pick emoticons in a (non-login) password, but not likely characters from a fictional language. In fact, for the life of me I can't understand why fictional languages even appear in Unicode at all. Perhaps they had a Star Trek fan in that working party.

Jun 20 '19 23:06 magnumripper

But to reach that level of worst case

So now we have levels of worst case? Perhaps we should add a Jumbo option --worst-case-level=N 😂😂😂

Jun 20 '19 23:06 magnumripper

were you using the same encodings in your examples of --list=format-all-details and --list=format-details?

I think so - I didn't specify any encoding with either command.

Jun 21 '19 10:06 solardiz

We really need some user-friendly reporting of length limits affecting actual cracking, because it is hard to realize that john cannot crack something due to its limits. It is even harder to notice the problem when some hashes in a set get cracked and only some fail.

If something does not get cracked by an attack, the user would think that the attack is not good and does not contain the password. The user would then move on to other attacks and waste time. (The same applies to other causes of uncrackable hashes: a bug in a format, a choice of inappropriate format with informal limits (e.g. dynamic_1551 vs dynamic_1552), a set of hashes mixing indistinguishable hash types (e.g. bare raw-md5 vs iterated variants), or to some extent sporadic misses on ztex boards.)

Length limits hit us really hard with the test hashes (nt) for CMIYC 2021. There were 4 of us actively cracking then. All of us had some experience and were aware of length limits, but we had not been cracking often enough to keep the practical consequences of the limits in mind. We cracked some of the nt hashes, then moved on to passphrases for the rest. We got the correct wordlist quickly, but it did not yield the password. We had no idea, or even a guess, that the length limit was preventing the crack. So we started gathering additional wordlists and creating more sophisticated attacks, and spent a whole week this way. Then one of us tried the same first wordlist again with hashcat without -O, and we got the crack. So we switched to dynamic=md4(utf16($p)). (hashcat was used from the start, but always with -O.)

(Off-topic consequence: bugs resulting in uncrackable hashes are serious problems causing waste of time and cycles.)

I guess it might be helpful to show number/percent of candidates rejected by length in status line.
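
As a rough idea of what such a statistic could look like, here is a standalone sketch that counts wordlist candidates exceeding a length limit measured in UTF-16 code units. It is not JtR code and does not reflect how john's status line works; the function name, the default limit of 27 and the sample candidates are assumptions for illustration.

```python
# Illustration only: count how many candidates would exceed a format's
# length limit, measured in UTF-16 code units as for NT.
from typing import Iterable


def length_stats(candidates: Iterable[str], max_units: int = 27) -> tuple[int, int, float]:
    """Return (total, rejected, rejected percentage) for the given limit."""
    total = rejected = 0
    for cand in candidates:
        total += 1
        if len(cand.encode("utf-16-le")) // 2 > max_units:
            rejected += 1
    pct = 100.0 * rejected / total if total else 0.0
    return total, rejected, pct


if __name__ == "__main__":
    words = ["a" * 27, "a" * 28, "correct horse battery staple and more"]
    total, rejected, pct = length_stats(words)
    print(f"{rejected}/{total} candidates over the limit ({pct:.1f}%)")
```

Running something like this over a passphrase wordlist before an attack would show up front how much of the list a given limit excludes.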

May 10 '22 11:05 AlekseyCherepanov

A user with some assumptions might try to use the --max-length= option to increase the limit (based on a real story). There is a check for that in john, but for nt, 81 is used as the limit with UTF-8. Maybe a warning should be emitted when the value for --max-length= is above the worst-case limit.

$ echo '$NT$7bb6ff238316dcaa5adc80c9c4561114' > nt.pw

$ # á x 32
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --encoding=cp1252 --max-length=44
Can't set max length larger than 27 for NT format

$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --max-length=44
[...]
0g 0:00:00:00  0g/s 0p/s 0c/s 0C/s
Session completed. 

$ # á x 32, got truncated to 27
$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin
[...]
ááááááááááááááááááááááááááá (?)     
1g 0:00:00:00  5.882g/s 5.882p/s 5.882c/s 5.882C/s ááááááááááááááááááááááááááá
Session completed. 

$ ./john/run/john --format=nt --list=format-all-details
[...]
 Truncates at max. length            no
[...]

$ echo 'áááááááááááááááááááááááááááááááá' | ./john/run/john nt.pw --stdin --max-length=82
Can't set max length larger than 81 for NT format

There is a bug with truncation vs reported info.

I guess it might be helpful to show number/percent of candidates rejected by length in status line.

The same might be needed for truncated candidates.

May 10 '22 11:05 AlekseyCherepanov

I extracted the problem with truncation in NT as #5144.

I guess it might be helpful to show number/percent of candidates rejected by length in status line.

NT truncates some candidates inside the format. So it might be a problem implementing such stats.

May 21 '22 14:05 AlekseyCherepanov
