Question about PassGAN implementation

Open iBM88 opened this issue 6 years ago • 6 comments

Hi, I have some questions:

  • Is the data encoded as a one-hot representation of each character of the password?
  • How do you make the generator return one-hot encoded characters that can be fed into the discriminator?
  • How do you force the network to produce variable-length passwords?

Thanks

iBM88 avatar Sep 15 '18 12:09 iBM88

Greetings,

  • Why does PassGAN produce only passwords of length <= 9 characters, and how can I adjust this parameter?
  • Why does it generate candidates that are not clean enough to be compatible with hashcat? For example:

Cevess mugan 26341909 ‡hawitrx aulemt28 æa旆1

Maybe you can clarify this for us. Thanks.

prodnet avatar Nov 24 '18 12:11 prodnet

Please reply.

prodnet avatar Nov 28 '18 10:11 prodnet

@iBM88 apologies for the late reply. I haven't looked at this code base in a while, nor have I read the paper in over a year. But here is a quick stab at answering your questions:

  1. Yes
  2. The network outputs a probability distribution over all possible output characters, and the most likely character is selected at each position. This is called greedy argmax sampling, and you can see it here.
  3. Put simply, you don't ;) The generator learns to produce passwords of varying length because the training data has passwords of different lengths. A knowledgeable spectator would recognize that this GAN implementation can't accept variable-length input or produce variable-length output. I get around that by padding the input with the backtick character (`), which is stripped after password generation.

Password data looks like this at input and after output:

password``
hunter2```
god```````

The ` characters are stripped before the passwords are printed to stdout.

password
hunter2
god
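
For readers who want to see the padding round trip concretely, here is a minimal sketch in plain Python. It is not the repo's code: the helper names (`pad`, `one_hot`, `greedy_decode`), the tiny character set, and the choice to truncate over-long passwords are all assumptions made for illustration. It pads each password to a fixed length with backticks, one-hot encodes it, picks the most likely character per position with an argmax, and strips the padding again.

```python
PAD = "`"  # padding character described in the comment above


def pad(password, seq_length):
    """Right-pad a password to exactly seq_length characters.

    Assumption: passwords longer than seq_length are truncated.
    """
    return password[:seq_length].ljust(seq_length, PAD)


def one_hot(padded, charset):
    """Encode each character as a one-hot row over charset."""
    index = {ch: i for i, ch in enumerate(charset)}
    return [
        [1.0 if index[ch] == j else 0.0 for j in range(len(charset))]
        for ch in padded
    ]


def greedy_decode(rows, charset):
    """Take the argmax of each row, join the characters, drop the padding.

    Assumption: every padding character is removed, wherever it appears.
    """
    chars = [charset[max(range(len(row)), key=row.__getitem__)] for row in rows]
    return "".join(chars).replace(PAD, "")


# Hypothetical, tiny character set; a real one would be built from the
# training data.
charset = sorted(set("passwordhunter2god" + PAD))

for pw in ["password", "hunter2", "god"]:
    padded = pad(pw, 10)
    print(padded, "->", greedy_decode(one_hot(padded, charset), charset))
```

In a real generator the rows would be softmax outputs rather than exact one-hot vectors, but the argmax step works the same way on either.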

For some reason, at least one user has experienced unexpected behavior related to this technique. I haven't investigated further to determine if it was user error or an issue with the repo itself.

brannondorsey avatar Nov 29 '18 02:11 brannondorsey

@prodnet The maximum password length can be set using the --seq-length command-line argument in train.py. As for why the generator is producing "not clean" output, my guess is that the training data contains similar "not clean" characters/bytes. It's been a while since I've worked with this codebase, so this is only speculation.
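
If the training data is indeed the cause, one option is to filter it before training. The sketch below is only an illustration under stated assumptions: it treats "clean" as printable ASCII letters, digits, and punctuation, excludes the backtick because it doubles as the padding character, and uses a hypothetical `max_length` of 10. None of these choices come from the repo.

```python
import string

# Assumption: "clean" means ASCII letters, digits, and punctuation.
# The backtick is excluded because it is used as the padding character.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation) - {"`"}


def is_clean(candidate, max_length=10):
    """Return True if the candidate is non-empty, short enough, and clean."""
    return 0 < len(candidate) <= max_length and all(
        ch in ALLOWED for ch in candidate
    )


def clean_lines(lines, max_length=10):
    """Yield only the clean candidates, with line endings removed."""
    for line in lines:
        candidate = line.rstrip("\r\n")
        if is_clean(candidate, max_length):
            yield candidate


# Sample candidates taken from the output quoted earlier in this thread.
samples = ["Cevess", "mugan", "26341909", "‡hawitrx", "aulemt28", "æa旆1"]
print(list(clean_lines(samples)))
```

The same filter could be applied to generated candidates before passing them to hashcat, although that hides the symptom rather than fixing the training data.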

brannondorsey avatar Nov 29 '18 02:11 brannondorsey

I can't reproduce your results with the model I trained; my cover rate is only about 4%. Can you consistently achieve a cover rate of almost 20%?

xiaozhouguo94 avatar Sep 12 '20 14:09 xiaozhouguo94

The code runs, but the output file (generated_pass.txt) is not created. Which file contains the code that writes generated_pass.txt?

Please help me with this problem.

vamsijay11 avatar Jan 05 '22 06:01 vamsijay11