rexgen
Adding to librexgen the ability to resume from abort or crash
Have you considered adding something to librexgen that would, e.g., drop the first n generated lines?
IMHO things like that should be done by the next program in the pipe, while rexgen should concentrate on its purpose and do that well. It's not that I'm too worried about bloat; my concern is performance. Rexgen's biggest challenge is speed.
Whatever works is fine with me. Today, using librexgen for short lists is great: e.g. if you remember part of your password, then you can generate candidates using a regex. But starting from scratch when you use a dictionary and multiply it with a regex is hell.
Oh, right. Sorry, I did not see the issue title! Yes, resuming (which is a variant of skipping) is of course a good thing, especially when used in JtR (all our other modes can resume).
The speed is a challenge too.

    $ time ./john --stdout --wordlist=password.lst --rules=wordlist |tail
    Press 'q' or Ctrl-C to abort, almost any other key for status
    156846p 0:00:00:00 100.00% (2015-05-12 02:06) 560164p/s Sssing
    Tuling
    Unixing
    Xanthing
    Qwerting
    Alling
    Dirking
    Newcourting
    Niting
    Notuseding
    Sssing

    real 0m0.295s
    user 0m0.228s
    sys 0m0.000s

    $ time ./john --stdout --wordlist=password.lst --rules=wordlist --regex="\0" |tail
    Press 'q' or Ctrl-C to abort, almost any other key for status
    156846p 0:00:00:01 100.00% (2015-05-12 02:07) 87623p/s Sssing
    Tuling
    Unixing
    Xanthing
    Qwerting
    Alling
    Dirking
    Newcourting
    Niting
    Notuseding
    Sssing

    real 0m1.849s
    user 0m1.292s
    sys 0m0.448s

    $ time ./john --stdout --wordlist=password.lst --rules=wordlist --regex="\0(y|n)" |tail
    Press 'q' or Ctrl-C to abort, almost any other key for status
    313692p 0:00:00:03 100.00% (2015-05-12 02:10) 84325p/s Sssingn
    Dirkingy
    Dirkingn
    Newcourtingy
    Newcourtingn
    Nitingy
    Nitingn
    Notusedingy
    Notusedingn
    Sssingy
    Sssingn

    real 0m3.946s
    user 0m2.512s
    sys 0m1.228s
Skipping the first n lines is a really hard problem, but resuming after a crash is possible and makes sense to me. We already have suspend/resume code in librexgen, which rexgen currently does not use.
There is only one limitation: when rexgen reads from a stream that is not seekable, we won't be able to store the state of StreamRegexIterator.
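To illustrate the seekability limitation, here is a minimal sketch in Python (not librexgen's C++ and not its actual suspend/resume code). The helpers `read_words` and `resume` are hypothetical names made up for this example. The idea is that a checkpoint for a wordlist is just a byte offset, and such an offset can only be taken and restored on a seekable stream such as a regular file, not on a pipe.

```python
import io


def read_words(stream, limit):
    """Read up to `limit` lines from a binary stream.

    Returns the words plus a resume offset, or None as the offset
    when the stream is not seekable (e.g. a pipe).
    """
    words = []
    for _ in range(limit):
        line = stream.readline()
        if not line:
            break
        words.append(line.rstrip(b"\n"))
    # A byte offset is only a usable checkpoint on seekable streams.
    offset = stream.tell() if stream.seekable() else None
    return words, offset


def resume(stream, offset):
    """Reposition a freshly opened stream at a saved offset."""
    if offset is None or not stream.seekable():
        raise ValueError("cannot resume: stream is not seekable")
    stream.seek(offset)
    return stream


# BytesIO stands in for a wordlist file opened in binary mode.
wordlist = io.BytesIO(b"alpha\nbeta\ngamma\ndelta\n")
first, offset = read_words(wordlist, 2)

# Simulate a restart: reopen the same data and jump to the checkpoint.
restarted = resume(io.BytesIO(wordlist.getvalue()), offset)
rest, _ = read_words(restarted, 10)
```

After the simulated restart, `rest` holds only the words that had not been read before the checkpoint. With input arriving on a pipe, `seekable()` returns false and there is no offset to save, which is the case described above.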
How does this suspend/resume code work in librexgen? In general, resuming after an abort or crash requires remembering the last regular expression plus either the last generated line, its line number, or the internal state of the librexgen generator.

Skipping could be done by having the outer program wait until librexgen has generated the n-th (or n+1-th) line, but for large n that is far from perfect. Rexgen and librexgen should generate as fast as possible, so a better approach would be to send the last regular expression (e.g. the transformed word from the wordlist), then set the state of the generator in some way, and only then start the generator. The generator should then just generate words as fast as it can.

The key piece would be a function that translates either the number of lines already processed for a given regular expression, or the last processed line for that expression, into a generator state. This approach would require sending one regular expression at a time, with some feedback from librexgen to signal when to send the next one. Even starting from the last regular expression is still faster than starting from the first line of the wordlist. I'm just thinking aloud ;-)
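The "number of processed lines to generator state" function can be sketched for the simplest case: a regex that is a plain concatenation of independent alternative groups, such as `(Sss|Dirk)ing(y|n)`. This is only an illustration in Python, with a hypothetical list-of-lists representation and made-up function names. It is not how librexgen's iterators are implemented, and the ordering (rightmost group varies fastest) is an assumption that would have to match the real generator. Under those assumptions the index is simply a mixed-radix number with one digit per group.

```python
from itertools import islice, product


def state_from_index(groups, n):
    """Map candidate index n to one position per group (mixed-radix digits)."""
    total = 1
    for g in groups:
        total *= len(g)
    if not 0 <= n < total:
        raise IndexError("index out of range")
    state = []
    for g in reversed(groups):
        n, digit = divmod(n, len(g))
        state.append(digit)
    state.reverse()
    return state


def generate_from(groups, state):
    """Yield candidates starting at `state`, rightmost group varying fastest."""
    state = list(state)
    while True:
        yield "".join(g[i] for g, i in zip(groups, state))
        # Odometer-style increment with carry to the left.
        pos = len(groups) - 1
        while pos >= 0:
            state[pos] += 1
            if state[pos] < len(groups[pos]):
                break
            state[pos] = 0
            pos -= 1
        if pos < 0:
            return  # carried past the leftmost group: all candidates done


# Hypothetical decomposition of (Sss|Dirk)ing(y|n).
groups = [["Sss", "Dirk"], ["ing"], ["y", "n"]]

# Direct jump: cost depends on the number of groups, not on n.
resumed = list(generate_from(groups, state_from_index(groups, 2)))

# Baseline: generate everything and throw away the first n candidates.
skipped = list(islice(("".join(p) for p in product(*groups)), 2, None))
```

Both lists contain the same candidates, but the baseline has to produce and discard every skipped line, which is the waiting cost mentioned above. Quantifiers, back-references such as `\0`, and stream-backed iterators would need more state than a single index per group.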