Windows support
The module does not support Windows. It would be great to add this support so that the project becomes OS-independent.
State: Work in progress... ⌛
Hi, the project does not support Windows because it depends on a library (hyperscan) that is essential for the scanner but is not available on Windows.
From the home page of python-hyperscan:
A CPython extension for Hyperscan, Intel's open source, high-performance multiple regex matching library.
Currently only supports manylinux-compatible Linux distributions.
That's why we decided to provide Docker containers: this way, Credential Digger (with the UI) can also be used from Windows.
Hi @marcorosa, you are totally right about the Hyperscan part. I also agree that Docker is a good way around this limitation for users who just want to use the tool as is. However, it leaves Windows users with a tool that cannot interact with anything on the host machine (Windows in this case). For example, we cannot integrate the project into a CI/CD pipeline on Windows, because it can be neither executed nor communicated with directly on that OS.
My solution, which is in progress(+), targets the core of the project (the GitScanner module), with the hope of delivering an OS-independent Python module and taking us one step closer to plug-and-play.
So, if I understand correctly, are you trying to replace these lines?
Hyperscan was the best-performing library we tested. I hope that performance will stay reasonably close with another library as well. In the meantime, I'll stay tuned for your PR ;)
I agree, Hyperscan is the best when it comes to performance for our use case, and I believe that removing it from the project is not a good choice. That said, I am planning on breaking the GitScanner module into three files:
.
└── GitScanner/
    ├── GitScanner_Linux/
    │   └── (uses Hyperscan)
    └── GitScanner_Windows/
        └── (uses regex)
This can be achieved with a platform check that tests whether we are running on Windows. Based on that check, we can decide which class to use and which library to load. As for the dependencies, we will install hyperscan on Linux and regex on Windows.
The following changes will be made to the requirements.txt file:
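A minimal sketch of what such a platform check could look like. The class names and the `select_scanner_class` helper are hypothetical placeholders, not the project's actual code; only `platform.system()` is a real standard-library call.

```python
import platform


class LinuxGitScanner:
    """Hypothetical stand-in for the Hyperscan-backed scanner."""
    backend = "hyperscan"


class WindowsGitScanner:
    """Hypothetical stand-in for the regex-backed scanner."""
    backend = "regex"


def select_scanner_class(system=None):
    """Return the scanner class to use for the given platform name.

    `system` defaults to platform.system(), which returns strings
    such as 'Linux', 'Windows' or 'Darwin'.
    """
    if system is None:
        system = platform.system()
    # Only Windows falls back to regex; every other platform keeps
    # the Hyperscan-backed class.
    if system == "Windows":
        return WindowsGitScanner
    return LinuxGitScanner
```

In a real module, the import of the matching library would presumably happen inside each class's own file, so that hyperscan is never imported on Windows.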
hyperscan; platform_system != "Windows"
regex; platform_system == "Windows"
In other words, Linux users will not experience any performance loss, whereas Windows users will now have access to the tool (with a slight performance loss as a result of not using Hyperscan).
It is somewhat similar to what we've done with the SQLite and Postgres clients.
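To make the Windows fallback more concrete, here is a rough sketch of a regex-based scanner. It is only an illustration: the `RegexScanner` class and the rules in `RULES` are hypothetical, and the standard-library `re` module is used as a stand-in for the third-party `regex` package so that the snippet runs without extra dependencies.

```python
import re

# Hypothetical rules for illustration; the real ones would come from
# the project's own rule set.
RULES = {
    "aws_key": r"AKIA[0-9A-Z]{16}",
    "password": r"password\s*=\s*\S+",
}


class RegexScanner:
    """Assumed regex-based fallback: checks each rule one by one."""

    def __init__(self, rules):
        # Compile every pattern once, up front.
        self._compiled = [(name, re.compile(p)) for name, p in rules.items()]

    def scan_line(self, line):
        """Return (rule_name, matched_text) for every match in the line."""
        hits = []
        for name, pattern in self._compiled:
            for match in pattern.finditer(line):
                hits.append((name, match.group(0)))
        return hits


scanner = RegexScanner(RULES)
print(scanner.scan_line("key = AKIAABCDEFGHIJKLMNOP"))
```

Because each pattern is scanned separately here, the cost grows with the number of rules, which is where the expected performance gap with Hyperscan's multi-pattern matching would come from.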