Bloom filter for Test-PasswordQuality

Open PatchRequest opened this issue 3 years ago • 6 comments

For: Test-PasswordQuality

Instead of passing a 30 GB file with all the hashes, a Bloom filter could be created from it and used for the check. That would reduce the file size to around 3 GB and would be much faster and more efficient.

I could implement such a feature. Would you be interested?

PatchRequest avatar Sep 15 '22 17:09 PatchRequest

Hi @PatchRequest, that sounds like a good idea! What would be the expected search time for 1 hash and for 10K hashes compared to the current binary search approach? What would the false positive rate be, and should it be dealt with? What name do you propose for the new parameter of the Test-PasswordQuality cmdlet? -BloomFilterPath? And what would you like to name the cmdlet that does the conversion? ConvertTo-BloomFilter?

MichaelGrafnetter avatar Sep 19 '22 13:09 MichaelGrafnetter

If time were an issue, I could see this being helpful as a sort of pre-filter. A Bloom filter would be much faster at returning "not in set", and if it returned "possibly in set", a follow-up lookup in the larger database would drop the FPR to zero. That's probably how I'd approach it in my use case. I'd be interested in testing how much faster I could run this scenario against my dataset.
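The pre-filter idea can be sketched roughly as follows. This is only an illustration in Python, not DSInternals code: the `BloomFilter` class, the `is_leaked` helper, the filter size, and the stand-in hash data are all hypothetical, and a sorted in-memory list plays the role of the large hash file.

```python
import bisect
import hashlib


class BloomFilter:
    """Tiny in-memory Bloom filter; the sizes used below are illustrative only."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def is_leaked(candidate: bytes, bloom: BloomFilter, sorted_hashes: list) -> bool:
    """Ask the Bloom filter first; do the exact lookup only on a possible hit."""
    if not bloom.might_contain(candidate):
        return False  # definite miss, the big list is never touched
    i = bisect.bisect_left(sorted_hashes, candidate)
    return i < len(sorted_hashes) and sorted_hashes[i] == candidate


def fake_hash(text: str) -> bytes:
    # ASSUMPTION: 16-byte stand-ins for real password hashes.
    return hashlib.sha256(text.encode()).digest()[:16]


leaked = sorted(fake_hash(f"pw{i}") for i in range(1000))
bloom = BloomFilter(num_bits=24_000, num_hashes=17)
for h in leaked:
    bloom.add(h)

print(is_leaked(fake_hash("pw42"), bloom, leaked))
print(is_leaked(fake_hash("not-in-the-list"), bloom, leaked))
```

With this layout, a false positive from the filter only costs one extra binary search, so the combined answer is exact.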

aseigler avatar Sep 19 '22 13:09 aseigler

I would use speed as a secondary argument. I think size is more interesting, because with a Bloom filter the "bad password list" can fit on any USB stick with a false positive rate of 0.001%.
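As a rough sanity check of the size claim, the standard sizing formulas for an optimally configured Bloom filter, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, can be evaluated directly. The entry count below is an assumed round figure for illustration, not a number from this thread, and `bloom_parameters` is a hypothetical helper.

```python
import math


def bloom_parameters(n_items: int, fp_rate: float):
    """Return (bits, hash_count) for an optimally sized Bloom filter."""
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = round(bits / n_items * math.log(2))
    return bits, hashes


# ASSUMPTION: 850 million entries is an illustrative figure, not from the thread.
n = 850_000_000
p = 0.00001  # 0.001 % false positive rate, as quoted above

bits, hashes = bloom_parameters(n, p)
size_gb = bits / 8 / 1e9
print(f"{size_gb:.2f} GB with {hashes} hash functions")
```

Under that assumption the filter needs about 24 bits (3 bytes) per entry, which lands in the same range as the "around 3 GB" estimate.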

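The nice thing when benchmarking Bloom filters is that the lookup time does not depend on the size of the list (O(k) for k hash functions, so effectively O(1)), while binary search is O(log N). Therefore, the bigger the password list is, the more efficient the Bloom filter becomes in comparison, which is a win-win situation.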
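To make the scaling argument concrete, here is a small sketch that counts worst-case steps per lookup. The function names are hypothetical, and the value of 17 hash functions is an assumption that corresponds to a 0.001% false positive rate with optimal sizing.

```python
def binary_search_steps(n_items: int) -> int:
    """Worst-case comparisons for binary search over n_items sorted entries."""
    return n_items.bit_length()  # equals floor(log2(n)) + 1 for n > 0


def bloom_probes(num_hashes: int) -> int:
    """A Bloom filter lookup reads at most one bit per hash function, whatever n is."""
    return num_hashes


# ASSUMPTION: 17 hash functions; list sizes are illustrative.
for n in (10**6, 10**8, 850_000_000):
    print(n, binary_search_steps(n), bloom_probes(17))
```

The step counts alone are in the same ballpark at these sizes, so any real speedup would likely come from the filter fitting in memory while the binary search seeks through a large file. That would need to be measured.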

I think a parameter called -BloomFilterPath is a good idea, and the name for the cmdlet that creates the filter sounds good too. The only thing I would add is to provide a ready-made Bloom filter for Have I Been Pwned via GitHub LFS, so the bad password check is just a git clone -> download 3 GB -> let's go.

PatchRequest avatar Sep 19 '22 15:09 PatchRequest

Sounds great. Regarding Git LFS, I am a newbie here. I am having issues with it and constantly get "download quota exceeded" errors.

I used to store sample databases with Git LFS, which was not a good idea. I am considering doing a cleanup, uploading my test ntds.dit files (several GBs) to Azure Blob Storage, and integrating their download into the unit test runner.

MichaelGrafnetter avatar Sep 19 '22 17:09 MichaelGrafnetter

mmh an alternative could be to get it hosted somewhere else where there is no quota :/

PatchRequest avatar Sep 19 '22 17:09 PatchRequest

But anyway, I will start developing the features.

PatchRequest avatar Sep 19 '22 17:09 PatchRequest