
Splitting brute force keyspace

sectroyer opened this issue 5 years ago • 67 comments

I have a small feature request (unless it's already implemented?). I use both john and hashcat in a custom solution for distributing password cracking. I get pretty good results (all nodes finish with a similar ETA) by splitting a dictionary by number of lines (as a percentage of each node's computational power). However, I have an issue when splitting the keyspace in a brute-force attack. I tried using simple letter ranges (for example, one node gets a-k and the other l-z), but that's pretty hard to implement and often unreliable. I noticed hashcat has some options for splitting the keyspace itself. It would be nice to have something like that in john. Relevant link: https://hashcat.net/forum/thread-3386.html

sectroyer avatar Feb 25 '21 08:02 sectroyer

@sectroyer We have the --node option, usable with any cracking mode (although under the hood it's implemented with separate code per cracking mode). If your nodes aren't the same speed, you can use a higher virtual node count and node number ranges. Please refer to doc/OPTIONS. Does this satisfy your needs?

solardiz avatar Feb 25 '21 12:02 solardiz
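The "higher virtual node count" idea above can be sketched in a few lines. The `node_ranges` helper is hypothetical (not part of JtR), and the percentage figures are just examples:

```python
# Hypothetical helper: convert per-node "percentage power" figures into
# JtR --node style ranges over a virtual node count (10000 for precision).
def node_ranges(powers, total=10000):
    """powers: list of fractional speeds summing to ~1.0.
    Returns (start, end, total) ranges exactly covering 1..total."""
    ranges = []
    start = 1
    acc = 0.0
    for i, p in enumerate(powers):
        acc += p
        # The last node absorbs rounding so the ranges tile 1..total.
        end = total if i == len(powers) - 1 else int(round(acc * total))
        ranges.append((start, end, total))
        start = end + 1
    return ranges

for s, e, t in node_ranges([0.0757, 0.0715, 0.2639, 0.5889]):
    print(f"--node={s}-{e}/{t}")
```

Each node then runs the same attack with its own `--node=start-end/total`, and faster nodes simply get wider ranges.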

Anyhow, there isn't anything in what you wrote that we'd currently want to track as an issue. So I'll close this. Please feel free to add further comments anyway, or to move this to the john-users mailing list (for a discussion on how to use the existing functionality best).

solardiz avatar Feb 25 '21 12:02 solardiz

I checked the --node option and it is usable for what I want. It's even easier to use than manually calculating keyspace numbers like in hashcat. So far I have been calculating each node's "percentage power", which (if I understand correctly) I could use with the virtual node count set to 100, treating the percentages I have calculated as the ranges you are describing. This would solve my issue with john nodes; however, for some of the computation I want to use hashcat nodes, and from what I can see they use a different approach. Unless there is a way to "translate" the john --node option to hashcat values? Then I could use john to split the work even with those nodes. I understand you have your own ways, and to be honest I prefer this --node option over the hashcat ones, as I don't have to calculate keyspace numbers, just give john "the percentage" and voila. However, I also noticed you try to keep at least some level of compatibility with hashcat (like enabling hashcat mode for rules), so making small changes to enable such compatibility might help :)

Just as a side note, I know some people would also appreciate similar options for dictionary-based attacks. Personally, I wouldn't use them in my setup anyway, as I split the work by cutting the dict into chunks with line counts corresponding to each node's computational power. However, I heard that one of the reasons hashtopolis isn't adding support for john nodes is the lack of these options:

-s, --skip  | Num | Skip X words from the start                  | -s 1000000
-l, --limit | Num | Limit X words from the start + skipped words | -l 1000000

I suspect implementing similar options (or, in the case of brute force, outputting "hashcat-compatible values") would help many people, especially since, if I understand correctly, you already support part of this (like skipping dict lines via the restore file, or the --node option for brute force) :)

sectroyer avatar Feb 25 '21 16:02 sectroyer

It should be silly simple to implement JtR support in Hashtopus (much simpler than hashcat); they are just too blinded by hashcat to see what course to take. I actually started to implement an alternative --node option for hashcat some time ago, because I despise their current options, but it never ended up in a PR. It should be fairly easy to implement, though.

magnumripper avatar Feb 25 '21 17:02 magnumripper

Then again, our --node option has its limits; it's really not meant for fine-grained stuff like -node=1355/1000000. It depends on what mode is used, though.

magnumripper avatar Feb 25 '21 17:02 magnumripper

Well, I only hope to use it as -node=1-13/100, -node=14-20/100, etc., as "percentage power", so I don't think there will be an issue with that. Also, as I explained, I have very good results splitting a dictionary in a similar manner, but I still have issues with brute force :( If I could use john to do -node=83-100/100 and use --stdout (or some other flag) to output "hashcat numbers", I could support all nodes using "john semantics". I understand you prefer your way of splitting tasks, and I can agree with that decision, as doing keyspace calculations for every task is pointless. Of course, I am doing a similar thing in the dictionary-based attack, but at least I don't have to send the whole (many-gig) dictionary to every node (if I understand correctly, that's what hashtopolis does, using -s and -l to specify lines within the whole dictionary). Splitting tasks by power between nodes is much more "user friendly", and now I think I will switch my system to --node, though I will still have to split the keyspace "by letters" for john nodes (and use percentage power) and split the second part of the keyspace using hashcat "magical numbers". Better than what I had until now, but if you don't want to support the "hashcat options" (which I can understand), it would be nice to "spit out" those numbers for specific use cases. You have a way of printing the candidates generated by rules and brute force, which can be used by any node anyway, so if it's possible(?) to add "--spit-out-hashcat-numbers", it would make my life easier 😄 Unless their keyspace numbers are NOT compatible with how you currently handle splitting by the node parameter?

sectroyer avatar Feb 25 '21 18:02 sectroyer

Unless their keyspace numbers are NOT compatible with how you currently handle splitting by the node parameter?

They almost certainly are not.

BTW, you mention "brute force" in multiple places, but to me this is something undefined, a non-term I refrain from using. We have several cracking modes. hashcat has several attack modes. In a sense, all of those (even wordlist-based modes) are "brute force": they test candidate passwords. hashcat has a mode called "brute force", which is actually mask mode in our terms. hashcat's variation of this mode tries to be smart by default by using Markov chains, and hashcat has an option to dumb it down (but you shouldn't) to what its help message calls "classic brute-force" (but I wouldn't).

solardiz avatar Feb 25 '21 22:02 solardiz

Better than what I had until now, but if you don't want to support the "hashcat options" (which I can understand), it would be nice to "spit out" those numbers for specific use cases. You have a way of printing the candidates generated by rules and brute force, which can be used by any node anyway, so if it's possible(?) to add "--spit-out-hashcat-numbers", it would make my life easier 😄 Unless their keyspace numbers are NOT compatible with how you currently handle splitting by the node parameter?

AFAIK, and if things haven't changed, hashcat keyspace numbers are often not counted in "passwords" but in "blocks" of unknown and varying size. You can get a keyspace figure from hashcat of e.g. 363,839 for an actual keyspace of 1,117,714,499 because the keyspace was counted in 3K blocks on that particular occasion. Running the very same keyspace calculation on some other rig (e.g. with other GPUs) could give you some other keyspace figure because the block size is larger or smaller (and I think it could depend on autotune or other things). This is all quite natural, and without it being this way you couldn't even handle really huge keyspaces, but as a result, there's no way an external tool can spit out hashcat keyspace numbers. Or did I misunderstand what you asked for? Also, again, please note that I may be wrong and/or things might have changed in hashcat.

magnumripper avatar Feb 26 '21 11:02 magnumripper

@solardiz yes, I meant what (in john terms) is called mask mode. I just got used to calling it "a brute" for short ;) @magnumripper yes, that is exactly what I was asking for. Though it's a shame there is no "standard" for it. I played around on a few machines that are part of my test environment, and I definitely see the benefits of mixing node types. In my test environment, around half (depending on the algo) of the hashing power comes from CPUs, and john performs significantly better than hc on those. I was also surprised that for dictionary-based attacks (usually with rules) john outperforms hc 2-3x; however, for mask-based attacks I still get better results (especially with a multi-GPU setup) with hc. Such results encourage me to try to "combine" all of them into a single "system". I want to think about it from basics and not have any wrong assumptions. Also, for me, the more heterogeneous the system is, the more interesting it gets 😄

sectroyer avatar Feb 26 '21 14:02 sectroyer

Okay, I was able to implement what we discussed and it looks to be working. However, on the fastest john node, the ETA and percentage progress disappear after the first few characters (I am doing a mask attack, lengths 1-8). Am I doing something wrong or is it a bug? Here is how I split (first by first letter, then by --node):

Total power percentage of john nodes: 41.11%
Total john keyspace: w-9

john1 power: 7.57%  node: 1-757/4111
john2 power: 7.15%  node: 758-1472/4111
john3 power: 26.39% node: 1473-4111/4111

The log looks like this:

0g 0:00:00:00 0.00% (5) (ETA: 10:26:59) 0g/s 13107Kp/s 65536Kc/s 65536KC/s zjxX..0syWh
0g 0:00:00:00 0.00% (6) (ETA: 10:26:59) 0g/s 24296Kp/s 121481Kc/s 121481KC/s xsmG7..1u5Ewh
0g 0:00:00:05 (6) 0g/s 21479Kp/s 107395Kc/s 107395KC/s 7m7K2j..zy0Q2j
0g 0:00:00:15 (7) 0g/s 23288Kp/s 116443Kc/s 116443KC/s 1jvW77..4erFiwh
0g 0:00:00:43 (7) 0g/s 24041Kp/s 120205Kc/s 120205KC/s 896Uf2u..041Zf2u
0g 0:00:01:21 (7) 0g/s 24170Kp/s 120869Kc/s 120869KC/s 5emGsbk..xneAtbk
0g 0:00:01:59 (7) 0g/s 24257Kp/s 121288Kc/s 121288KC/s zbgMoip..4gmKoip

sectroyer avatar Mar 01 '21 09:03 sectroyer

@sectroyer The disappearing progress percentage and ETA could be a bug or at least a shortcoming. If you can share a reduced and reproducible test case for this, then please open a separate issue for it. No real password hashes in the test case, please (you can generate similar fake ones). Thanks!

solardiz avatar Mar 01 '21 13:03 solardiz

@solardiz I have submitted bug report here: https://github.com/openwall/john/issues/4579 with "test" pw ;) Still not sure if my "fix" doesn't break anything else (in my current setup) will do more tests and report.

BTW, I would like to ask something about mask mode. Let's assume (for the sake of simplification ;)) that I want to crack a four-character pw which has one uppercase letter, one digit, one symbol, and one lowercase letter. A simple mask might look like this: ?u?d?s?l. However, that's only one possible configuration. In reality there will be (if I calculated correctly) 16 different configurations. I would like to specify that I am looking for a pw with those characteristics. Or, for example, an 8-character ?l?d one which has at most 2-3 uppercase letters. I think the only way to do that now is to prepare "16" (as in the simplified example) different masks and perform "16" different attacks. That makes it not only hard to "maintain" but also hard to compute any ETA and such. Is there any way to simplify that?

sectroyer avatar Mar 01 '21 16:03 sectroyer
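For what it's worth, the set of masks for the four-class example above can be enumerated mechanically (a quick sketch, not a john feature); four distinct classes actually give 4! = 24 orderings rather than 16:

```python
from itertools import permutations

# Enumerate every placement of the four character classes in a
# four-character mask (?u, ?d, ?s, ?l in some order).
masks = {''.join(p) for p in permutations(['?u', '?d', '?s', '?l'])}
print(len(masks))            # 24 distinct masks
print('?u?d?s?l' in masks)   # True: the "simple" mask is one of them
```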

@sectroyer Thank you for creating that other issue.

There isn't currently a good way to express those masks as one attack.

BTW, for attacks on a small number of descrypt hashes you'll likely find --fork significantly faster than OpenMP. Unfortunately, we currently require the number of forked processes to exactly match the number of nodes in the range. So you can't have e.g. --fork=32 --node=1473-4111/4111. You can have e.g. --fork=32 --node=200-231/333. This is a shortcoming we might want to address, but currently don't plan to.

solardiz avatar Mar 01 '21 17:03 solardiz
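The fork/node constraint described above can be captured in a small check; `fork_node_ok` is a hypothetical helper for illustration, not a JtR function:

```python
# With --fork=F --node=m-n/total, F must equal the number of nodes in
# the range, i.e. n - m + 1 (the constraint solardiz describes).
def fork_node_ok(fork, m, n, total):
    return 1 <= m <= n <= total and fork == n - m + 1

print(fork_node_ok(32, 200, 231, 333))    # True:  32 == 231 - 200 + 1
print(fork_node_ok(32, 1473, 4111, 4111)) # False: range holds 2639 nodes
```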

@solardiz Okay, I will test the fork option, though with OpenMP I get one status for the whole session, which is pretty useful. I will see how big an impact it has in my specific calculations and report back :) Though since it has such a strong correlation with --node, I don't know if I will be able to adopt it efficiently. Node "john3" has 32 cores, so it would require 32 nodes, whereas "john2" has 8 cores but a cracking speed of ~7.15% of the total. If I translated that to the same number of nodes, it would get 6.59% (26.39/4); with "john1" the difference becomes even more significant. Also, some nodes (with the same number of cores) have AVX, some AVX2, so I need to adjust the number of nodes to their cracking speed and not to the number of cores themselves... Well, I will have to check if the advantages outweigh the disadvantages ;) Anyway, just to explain: I am MUCH MORE interested in adjusting the system itself and evenly distributing tasks (as that's the main topic of my research) than in cracking specific hashes 😄

sectroyer avatar Mar 01 '21 18:03 sectroyer

There isn't currently a good way to express those masks as one attack.

I think providing some way of expressing such "constraints" would be a good idea, especially since for a computer "configuring" such an attack isn't really a complex task, but for humans/scripts it's really hard to grasp. In a real-world scenario, with passwords up to, let's say, 8 characters long, it's pretty common to have 1-3 digits, 1-3 special chars, and for the rest we can assume lowercase letters and maybe 1-2 uppercase letters. At least those are the results I have after analysing some IoT hashes I played around with. I am not a mathematician, but such a keyspace (at first glance) should be significantly smaller than doing a full ?l?d?u?s mask attack. But even if my "first glance estimation" is incorrect, there are still use cases where it might be useful and much faster than specifying the "full mask". Right now, just to check whether one character of an 8-character password is an uppercase letter, I have to run 8 separate sessions :)

sectroyer avatar Mar 01 '21 18:03 sectroyer
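The "first glance estimation" above can be checked with some multinomial counting. This is a sketch under assumed charset sizes (26 lower, 26 upper, 10 digits, 33 specials, i.e. 95 printable ASCII):

```python
from math import comb

# Constrained policy: length-8 passwords with 1-3 digits, 1-3 specials,
# 0-2 uppercase, rest lowercase, vs. the full 8-position ?l?d?u?s mask.
L, U, D, S = 26, 26, 10, 33
n = 8

full = (L + U + D + S) ** n  # every position can be any of 95 chars

constrained = 0
for d in range(1, 4):            # digits
    for s in range(1, 4):        # specials
        for u in range(0, 3):    # uppercase
            low = n - d - s - u  # remaining positions are lowercase
            if low < 0:
                continue
            # ways to place each class, times characters per position
            placements = comb(n, d) * comb(n - d, s) * comb(n - d - s, u)
            constrained += placements * D**d * S**s * U**u * L**low

print(f"full:        {full:,}")
print(f"constrained: {constrained:,}")
print(f"ratio:       {full / constrained:.1f}x")
```

The constrained count is a strict subset of the full keyspace, so the ratio quantifies how much work the policy saves.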

@sectroyer Now we're deep into topics that would be much better discussed on john-users than here.

I am MUCH MORE interested in adjusting the system itself and evenly distributing tasks (as that's main topic of my research)

If so, you could want to experiment with very different hash types, and see different strategies work best accordingly.

some nodes (with the same number of cores) have AVX, some AVX2

For AVX vs. AVX2, you could double the number of forks on the AVX boxes (overbooking them, which would hurt performance really bad with OpenMP but is tolerable with fork). Or halve them on the AVX2 ones, considering those "cores" aren't real (you probably have 2 hardware threads per core). You can try it both ways.

I am not a mathematician but such keyspace (at first glance) should be significantly smaller

See https://openwall.info/wiki/john/policy

just to check if one char of 8 characters long password is an uppercase letter I have to run 8 separate sessions

You can use a revision of [List.External:Policy] instead, but only for much slower hashes or/and higher different salt count.

solardiz avatar Mar 01 '21 18:03 solardiz

At least those are the results I have after analysing some IoT hashes I played around with.

It's a fallacy to focus on password patterns you see the most without considering the size of keyspaces for those patterns vs. simpler patterns that maybe don't occur so frequently. You do need to be a little bit of a mathematician there to calculate and sort the ratios.

solardiz avatar Mar 01 '21 18:03 solardiz
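The ratio calculation solardiz suggests can be done mechanically: rank patterns by cracks per candidate rather than by raw frequency. All figures below are made-up example data, not real crack counts:

```python
# Rank patterns by observed cracks divided by keyspace size.
patterns = {
    # mask-ish label: (observed cracked passwords, keyspace size)
    "?l?l?l?l?l?l?l?l": (120, 26**8),
    "?u?l?l?l?l?l?d?d": (300, 26**7 * 100),
    "?d?d?d?d?d?d":     (80,  10**6),
}
ranked = sorted(patterns.items(),
                key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
for name, (hits, space) in ranked:
    print(f"{name}: {hits / space:.3e} cracks per candidate")
```

With these example numbers the all-digit pattern wins despite the lowest raw hit count, because its keyspace is tiny, which is exactly the fallacy being pointed out.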

@solardiz I am considering such a "based on patterns" attack since I have already done all the dicts I could find with the best64 rule (btw, is there any other rule also worth checking?), the full 8 chars of ?l?d, and some other stuff. There are still some things I want to try, but such a "pattern-based" attack is the last thing that won't take "ages" :) I just noticed that some of the "harder to crack" hashes on IoT that I could crack were pretty "random", so those "patterns" are the only thing I could come up with :)

sectroyer avatar Mar 01 '21 20:03 sectroyer

@sectroyer More john-users material here. Please use the mailing list next time you bring something like this up.

best64 rule (btw is there any other rule also worth checking?)

Of course. --rules=jumbo is a must to run, at least with a tiny wordlist (such as RockYou top 10k to 1M). With a tiny enough wordlist or/and a fast hash, you'd also run --rules=all.

hashes on IoT that I could crack were pretty "random", so those "patterns" are the only thing I could come up with :)

You can instead use "incremental" mode and let it do the magic for you. It can take forever to complete, but that's actually an advantage: you don't have to start new attacks, you can stop any time you like. You'll likely find that it cracks more hashes than mask mode over the same amount of time. Mask mode is generally for cases when you know a part of a partially forgotten password or need to exclude certain patterns from further consideration. You can also use it along with other modes e.g. to split the workload by first letter - e.g., --incremental --mask='a?w' will prepend whatever incremental mode generates by letter "a" (hurting its efficiency somewhat, though, since the .chr files encode frequencies that assume character positions stay intact).

solardiz avatar Mar 01 '21 20:03 solardiz

@sectroyer Now we're deep into topics that would be much better discussed on john-users than here.

Well, for now I have too little data to really "discuss" it. I wanted to mention it just to explain my attitude towards this matter :)

If so, you could want to experiment with very different hash types, and see different strategies work best accordingly.

Yes, I already noticed that and am still playing around. For example, on one node I have a poor ("just for display") nvidia graphics card which is completely useless for des hashes (using it for cracking reduces john's speed more than I could gain), but it doubles the node's speed when used with md5. I just started with des, as it was the fastest way to gather some data.

some nodes (with same number of cores) have avx, some avx2

For AVX vs. AVX2, you could double the number of forks on the AVX boxes (overbooking them, which would hurt performance really bad with OpenMP but is tolerable with fork). Or halve them on the AVX2 ones, considering those "cores" aren't real (you probably have 2 hardware threads per core). You can try it both ways.

Yeah will try different settings and report back once I get some more results :)

I am not a mathematician but such keyspace (at first glance) should be significantly smaller

See https://openwall.info/wiki/john/policy

Great info!! If you don't mind, I would like to use it as a reference if I come up with something worth publishing.

just to check if one char of 8 characters long password is an uppercase letter I have to run 8 separate sessions

You can use a revision of [List.External:Policy] instead, but only for much slower hashes or/and higher different salt count.

Sorry for the dumb question, but just to clarify: I set up a mask attack and john uses Policy to filter out only the "matching pws"? Or is it something you put in john's code and recompile?

sectroyer avatar Mar 01 '21 21:03 sectroyer

AFAIK, and if things haven't changed, hashcat keyspace numbers are often not counted in "passwords" but in "blocks" of unknown and varying size. You can get a keyspace figure from hashcat of e.g. 363,839 for an actual keyspace of 1,117,714,499 because the keyspace was counted in 3K blocks on that particular occasion. Running the very same keyspace calculation on some other rig (e.g. with other GPUs) could give you some other keyspace figure because the block size is larger or smaller (and I think it could depend on autotune or other things). This is all quite natural, and without it being this way you couldn't even handle really huge keyspaces, but as a result, there's no way an external tool can spit out hashcat keyspace numbers. Or did I misunderstand what you asked for? Also, again, please note that I may be wrong and/or things might have changed in hashcat.

I did check on two completely different systems (both heterogeneous, with two GPUs each), and the result of the command was the same:

./hashcat.bin -m 1500 -a3 -1 ?l?d -2 'abcdefghijklmn' '?u?1?1?1?1?1?1?u' --keyspace                                                
43670016

Just wanted to leave it here if needed for reference or something.

sectroyer avatar Mar 10 '21 15:03 sectroyer

That proves my point, since the actual keyspace you defined in that mask is 1,471,504,859,136. The figure you got as an answer is - for whatever reason - divided by (or "in blocks of") 33,696. So where does that figure come from? I have no idea; it's not an even number even in binary terms.

magnumripper avatar Mar 10 '21 15:03 magnumripper

OK, so 33,696 is 26x36x36. I bet with other masks you may get other block sizes.

magnumripper avatar Mar 10 '21 16:03 magnumripper
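A quick check of the arithmetic, using the mask from the command above (with ?1 = ?l?d giving 36 characters per position):

```python
# ?u?1?1?1?1?1?1?u => 26 * 36^6 * 26 candidates; hashcat's reported
# --keyspace of 43,670,016 is that figure divided by 26*36*36 = 33,696.
true_keyspace = 26 * 36**6 * 26
block = 26 * 36 * 36
print(true_keyspace)           # 1471504859136
print(block)                   # 33696
print(true_keyspace // block)  # 43670016
print(true_keyspace % block)   # 0 -- divides evenly
```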

The only thing I need is a consistent keyspace size across a single mask to split the task, so that's doable now. However, I will keep your info in mind and not rely on it (I will try to compute it on each run). Though I was surprised that on two completely different systems this number was in fact consistent; they are even different brands, so there has to be "some logic" behind it :) I wonder if this will stay consistent and I will keep getting the same values on different systems...

sectroyer avatar Mar 10 '21 16:03 sectroyer

I noticed I get a different keyspace size for a different algorithm (the same command for md5crypt gives 56596340736). However, this is no longer the main issue. Now I had to 🤦 after seeing this error message: Use of --skip/--limit is not supported with --increment or mask files.

After thinking about it for more than 5 seconds it's understandable, but this doesn't change the fact that at this point hc lacks ANY (serious) distributed-system support. If this --skip/--limit is really what hashtopolis uses internally to divide tasks between nodes, it's plain stupid: computing this for every single password length and creating a separate session... @magnumripper now I really want your "hashcat node patches" 😄 I will leave it for the next day and try to figure out if there is "anything else" that I am missing, but I think I am starting to understand why hashtopolis doesn't show an ETA for nodes, doesn't support john, etc. This is a really big mess and probably tons of "hc-specific hacks". There should be a way to simply express (like you do with --node) a node's power for a single session. I understand I might have to set something up between different sessions, but not for such a simple use case... If I don't find anything, I will definitely report an issue.

sectroyer avatar Mar 10 '21 19:03 sectroyer

@magnumripper now I really want your "hashcat node patches" 😄

Unfortunately if I ever push such patches, they'll [likely] only add the alternative syntax of --node=m[-n]/total and would merely translate that into whatever --skip and --limit would be correct at the time. As long as there are restrictions against --increment and other things, it will still be a problem.

magnumripper avatar Mar 10 '21 19:03 magnumripper
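The translation magnumripper describes could look roughly like this. It's a sketch under the assumption that hashcat's reported --keyspace figure is used as-is (whatever units it happens to count in); `node_to_skip_limit` is a hypothetical helper:

```python
# Map --node=m-n/total onto hashcat-style --skip/--limit values,
# given the keyspace figure hashcat reports for the session.
def node_to_skip_limit(m, n, total, keyspace):
    skip = (m - 1) * keyspace // total
    limit = n * keyspace // total - skip
    return skip, limit

# e.g. the first 23% of a 43,670,016-unit keyspace (--node=1-23/100)
print(node_to_skip_limit(1, 23, 100, 43670016))
```

Computing the end of each range as `n * keyspace // total` (rather than scaling the width) makes adjacent ranges tile the keyspace with no gaps or overlaps.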

I noticed I get a different keyspace size for a different algorithm (the same command for md5crypt gives 56596340736)

That makes sense, as I believe what we're actually seeing here is sort of their version of our "internal mask multiplier", which varies a lot in JtR too - from one to tens of thousands, depending on format and/or GPU power - BUT we never expose it in terms of node numbers or keyspace. I'm not saying they did anything wrong, just that our syntax is a whole lot easier to work with IMHO.

magnumripper avatar Mar 10 '21 19:03 magnumripper

Also, in the hashcat world the --increment limitation is not nearly as bad as it would be for JtR: our incremental mode will, for example, try weaker candidates of length 9 earlier than less weak ones of length 4. Forcing the user to do one length at a time would be a seriously bad thing here.

magnumripper avatar Mar 10 '21 20:03 magnumripper

Well, I just noticed I made a "typo" in the mask: I didn't put ?2 (which I use to split between hc and john) anywhere in the mask, and the keyspace size was the same. It looks like even when I do this: ./hashcat.bin -m 1500 -a3 -1 ?l?d -2 'abcdefghijklmn' '?u?2?2?1?1?1?1?u' --keyspace I also get 43670016. Not to mention that the slower gpu node displays (with 10% already done) this ETA: Time.Estimated...: Next Big Bang, (50198 years, 198 days) 🤦 🤣

Forcing the user to do one length at a time would be a seriously bad thing here.

If I understand correctly (and am not missing anything), this is exactly what you have to do to split hc mask-based tasks, and I suspect this is what hashtopolis does... I made a quick fix that temporarily "resolved" this issue for me, but ATM I can't think of any approach that would be usable in the long run.

Unfortunately if I ever push such patches, they'll [likely] only add the alternative syntax of --node=m[-n]/total and would merely translate that into whatever --skip and --limit would be correct at the time. As long as there are restrictions against --increment and other things, it will still be a problem.

If they would "translate" those numbers for each pw length into correct --skip/--limit values, it would be a hack, but still something that can be used. We can argue that one approach is better than the other, and in such a discussion I would be on john's side; however, even such a hack is better than not having anything (serious), and this is how it looks for me right now...

sectroyer avatar Mar 10 '21 20:03 sectroyer

Well, I just noticed I made a "typo" in the mask: I didn't put ?2 (which I use to split between hc and john) anywhere in the mask, and the keyspace size was the same. It looks like even when I do this: ./hashcat.bin -m 1500 -a3 -1 ?l?d -2 'abcdefghijklmn' '?u?2?2?1?1?1?1?u' --keyspace I also get 43670016.

This is exactly how JtR's internal mask works, and I expect hashcat to be similar: we decide on a "target gpu-side mask multiplier" and then try to get close to it using the factors at hand (in your case 26x for ?u, 36x for ?1 and 14x for ?2), but we may pick them anywhere in the mask if needed. Sometimes even that doesn't end up very well, at least in JtR, but it's what we do.
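As a toy illustration of that factor-picking idea (this is not JtR's or hashcat's actual algorithm, just a greedy sketch of "get close to a target multiplier using the factors at hand"):

```python
# Greedily pick mask-position factors until their product reaches a
# target gpu-side multiplier.
def pick_multiplier(factors, target):
    chosen, product = [], 1
    for f in sorted(factors, reverse=True):
        if product >= target:
            break
        chosen.append(f)
        product *= f
    return chosen, product

# ?u=26x, ?1=36x, ?2=14x as in the mask discussed above
print(pick_multiplier([26, 36, 36, 36, 36, 14, 14, 26], 30000))
```

A real implementation would also care about which positions it consumes; the 26x36x36 = 33,696 block seen above suggests hashcat picked a different (position-aware) combination than this greedy sketch would.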

Not to mention that slower gpu node displays (with 10% already done) this eta: Time.Estimated...: Next Big Bang, (50198 years, 198 days) 🤦 🤣

I predict the next big bang some time later than that, but WW3 will predate it for sure, and there might not be much of a distinction between them for things like us 🙄

Forcing the user to do one length at a time would be a seriously bad thing here.

If I understand correctly (and am not missing anything), this is exactly what you have to do to split hc mask-based tasks, and I suspect this is what hashtopolis does... I made a quick fix that temporarily "resolved" this issue for me, but ATM I can't think of any approach that would be usable in the long run.

Unfortunately if I ever push such patches, they'll [likely] only add the alternative syntax of --node=m[-n]/total and would merely translate that into whatever --skip and --limit would be correct at the time. As long as there are restrictions against --increment and other things, it will still be a problem.

If they would "translate" those numbers for each pw length into correct --skip/--limit values, it would be a hack, but still something that can be used. We can argue that one approach is better than the other, and in such a discussion I would be on john's side; however, even such a hack is better than not having anything (serious), and this is how it looks for me right now...

Perhaps I'll have a look at it again. After all, if we said --node=1-23/100 (for doing the first 23% of the keyspace), maybe there's nothing really stopping us from re-doing the calculations for each length - but (IIRC) last time I checked, it wasn't trivial to get a new keyspace figure at will. The thing that speaks against this happening any time soon is that I find hashcat superior very infrequently. Actually, nowadays I think it's solely the case of "single NT hash", and those are so very fast anyway that it doesn't really matter what height within the exponential wall you're at - because it's, well, vertical. Believe it or not, there's ~~not much~~ virtually no difference between 100G p/s and 200G or even 800G p/s. Sure, that's 8x faster, but you will NOT crack 8x more passwords. More like 0.8% more or so.

magnumripper avatar Mar 10 '21 22:03 magnumripper