captcha
Penetration testing
It would be helpful to know how effective machine learning is at solving the captchas this library generates. Applications could then decide how many consecutive attempts to allow, and the library could use real data to improve its security and keep pace with the arms race.
I imagine the testing would go something along these lines:
- A script generates a data set of 1,000,000 captcha images at a specific difficulty, each paired with its solution characters.
- A machine learning model is trained on these images and their solutions.
- A script generates 100 or so new captchas and measures how many tries the model needs to solve each one.
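To make the last step concrete, here is a minimal Python sketch of the measurement loop. Everything in it is an assumption for illustration: `generate_captcha` is a hypothetical stand-in for this library's generator (it returns the solution text in place of a real image), and a solver is any callable that takes the image and the attempt number and returns a guess, for example a trained model returning its n-th most likely answer.

```python
import random
import string
from typing import Callable, List, Optional, Tuple

ALPHABET = string.ascii_lowercase + string.digits

# A solver takes (image, attempt_number) and returns its guess for that attempt.
Solver = Callable[[str, int], str]


def generate_captcha(rng: random.Random, length: int = 5) -> Tuple[str, str]:
    """Hypothetical stand-in for the library's generator.

    Returns (image, solution). The "image" here is just the solution text;
    a real harness would return rendered image data instead.
    """
    solution = "".join(rng.choice(ALPHABET) for _ in range(length))
    return solution, solution


def tries_to_solve(solver: Solver, image: str, solution: str,
                   max_tries: int) -> Optional[int]:
    """Return the 1-based attempt that succeeded, or None if all attempts failed."""
    for attempt in range(1, max_tries + 1):
        if solver(image, attempt) == solution:
            return attempt
    return None


def evaluate(solver: Solver, n_captchas: int = 100, max_tries: int = 3,
             seed: int = 0) -> List[Optional[int]]:
    """Generate fresh captchas and record how many tries each one took."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_captchas):
        image, solution = generate_captcha(rng)
        results.append(tries_to_solve(solver, image, solution, max_tries))
    return results


def success_rate(results: List[Optional[int]]) -> float:
    """Fraction of captchas solved within the allowed number of tries."""
    return sum(r is not None for r in results) / len(results)


if __name__ == "__main__":
    # Placeholder solver that always fails; swap in a trained model here.
    results = evaluate(lambda image, attempt: "")
    print(f"solved within 3 tries: {success_rate(results):.0%}")
```

Keeping the per-captcha attempt counts, rather than only a pass/fail total, would let the same run answer the question for 1, 2, or 3 allowed tries.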
As a rough target, I'd consider security good if a bot has less than a 50% chance of solving a medium-difficulty captcha within 3 attempts.
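To put a number on that target, here is a small sketch that converts between per-attempt accuracy and the chance of success within several attempts. It assumes each attempt is independent with the same success probability, which is only reasonable if a fresh captcha is served after each failure; retries against the same image would be correlated. The function names are made up for illustration.

```python
def success_within(p: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries."""
    return 1 - (1 - p) ** attempts


def max_single_try_accuracy(target: float, attempts: int) -> float:
    """Per-attempt accuracy at which success_within() equals `target`."""
    return 1 - (1 - target) ** (1 / attempts)


if __name__ == "__main__":
    # A solver that is right 20% of the time per attempt, given 3 attempts.
    print(round(success_within(0.20, 3), 3))
    # Per-attempt accuracy at which the 3-attempt chance reaches 50%.
    print(round(max_single_try_accuracy(0.50, 3), 3))
```

Under that assumption, a solver would need to stay below roughly 20.6% accuracy per attempt to keep its 3-attempt success chance under 50%.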
For reference, here is a related article: https://towardsai.net/p/deep-learning/deep-learning-based-automatic-captcha-solver