Need more solved captchas?

Open K1rakishou opened this issue 1 year ago • 17 comments

Hello. I noticed that you are collecting solved captchas, and I'm assuming that's to train a NN model for future use. If I'm correct and you are planning to make it public, then I could contribute solved captchas to make the process faster (the same way you are doing it). I'm currently implementing a captcha solver based on the https://github.com/AUTOMATIC1111/4chan-captcha-solver model, but it's not ideal.

K1rakishou avatar Aug 14 '22 14:08 K1rakishou

Hi @K1rakishou. I tried using a neural network at the beginning: I made a program to generate fake captchas, but even with a data set of 100,000 I couldn't come up with a working model that would solve those generated captchas. So I gave up on neural networks; I'm not really an expert in that technique. I made the solver using classical techniques, so more data doesn't really help that much, and it has kind of reached a plateau at ~65% accuracy.

I just rechecked https://github.com/AUTOMATIC1111/4chan-captcha-solver, and at first glance it seems good and more accurate than my solver. What makes it not ideal? Too big for mobile devices?

By the way, I have ~34,000 captchas at the moment, let me know if it would help you in your effort, and I can provide them.

moffatman avatar Aug 14 '22 19:08 moffatman

Oh I see. Well I'm not an ML expert either, so more captchas won't help me.

This https://github.com/AUTOMATIC1111/4chan-captcha-solver model solves like 80-90% of captchas (I didn't measure it, just a feeling after using it for a couple of weeks), but the slider auto adjustment algorithm sometimes fails as well (which obviously makes the model fail too). You can manually adjust the slider and then solve it again and it works, but that requires user input.

Too big for mobile devices?

The model itself is about 8 megabytes in TensorFlow Lite format (you can also enable optimizations when converting it so that it only takes 2 megabytes, but I don't know how that affects accuracy). However, it uses some custom layers (?) that the default TensorFlow Lite does not support, so I had to include a special dependency with more layer implementations, and its size ended up being ~40 megabytes for just one CPU architecture. I decided to extract that stuff into a separate APK, and I use broadcast receivers to communicate between the main app and the solver app. https://github.com/K1rakishou/4chanCaptchaSolver

K1rakishou avatar Aug 14 '22 21:08 K1rakishou

the slider auto adjustment algorithm sometimes fails as well

You should try my slider algorithm here. I have only ever seen one failure in many months of use, and I think it's more efficient than the one in that repo as well. https://github.com/moffatman/chan/blob/b5ef7ef206584395e1151dee6060e50aa5f60619/lib/widgets/captcha_4chan.dart#L348 Basically, it only compares the pixels of the image on the edge of the background cutouts.
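As a rough illustration of the edge-only idea (this is not the Dart code at the link above, just one possible reading of it), the sketch below assumes the foreground is a grayscale grid in which `None` marks a transparent cutout pixel and the background is a wider grayscale grid that slides horizontally behind it. The names `edge_pairs` and `best_slider_offset` and the absolute-difference cost are hypothetical choices made for the example.

```python
def edge_pairs(fg):
    """Collect (row, opaque_col, hole_col) for every horizontally adjacent
    opaque/transparent pixel pair, i.e. the left and right cutout edges."""
    pairs = []
    for y, row in enumerate(fg):
        for x in range(len(row) - 1):
            a, b = row[x], row[x + 1]
            if (a is None) == (b is None):
                continue  # both opaque or both transparent: not an edge
            if a is None:
                pairs.append((y, x + 1, x))  # hole on the left, opaque on the right
            else:
                pairs.append((y, x, x + 1))  # opaque on the left, hole on the right
    return pairs


def best_slider_offset(fg, bg):
    """Return the horizontal background offset with the lowest edge mismatch.

    Assumption: at the correct offset, the background pixel visible just
    inside a cutout continues the opaque foreground pixel next to it.
    """
    pairs = edge_pairs(fg)
    max_offset = len(bg[0]) - len(fg[0])
    best, best_cost = 0, float("inf")
    for offset in range(max_offset + 1):
        # Only the edge pairs are compared, never the whole image.
        cost = sum(abs(fg[y][xo] - bg[y][xh + offset]) for y, xo, xh in pairs)
        if cost < best_cost:
            best, best_cost = offset, cost
    return best


# Toy one-row example: the foreground is bg[3:9] with a two-pixel cutout.
bg = [[30, 60, 90, 120, 200, 200, 40, 40, 70, 100, 130, 160]]
fg = [[120, 200, None, None, 40, 70]]
print(best_slider_offset(fg, bg))
```

Because only the pixels next to cutout edges are visited, the work per candidate offset scales with the length of the cutout outlines rather than with the image area. A real implementation would probably also look at vertical edges; this sketch checks horizontal neighbours only.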

How long does it take to execute the neural model on mobile?

moffatman avatar Aug 15 '22 01:08 moffatman

You should try my slider algorithm here

I will take a look at it but I think it's going to be very hard because the script is doing some additional captcha image transformations (it flips the pixels horizontally, rotates them and then draws everything in the middle of the canvas which also has specifically calculated size + adds some 0xeeeeee colored border around the image).
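To make those extra steps concrete, here is a minimal sketch of that kind of preprocessing on a plain grid of pixel values. It is not taken from the solver script: the 90° clockwise rotation, the caller-supplied canvas size, and the function names are all assumptions, and only the `0xEEEEEE` fill color comes from the description above.

```python
BORDER = 0xEEEEEE  # fill color mentioned above; everything else is assumed


def flip_horizontal(img):
    """Mirror every row left to right."""
    return [row[::-1] for row in img]


def rotate_cw(img):
    """Rotate 90 degrees clockwise (the real rotation angle is an assumption)."""
    return [list(col) for col in zip(*img[::-1])]


def center_on_canvas(img, canvas_w, canvas_h, fill=BORDER):
    """Draw img in the middle of a canvas pre-filled with the border color."""
    h, w = len(img), len(img[0])
    if canvas_w < w or canvas_h < h:
        raise ValueError("canvas is smaller than the image")
    left = (canvas_w - w) // 2
    top = (canvas_h - h) // 2
    canvas = [[fill] * canvas_w for _ in range(canvas_h)]
    for y, row in enumerate(img):
        canvas[top + y][left:left + w] = row
    return canvas


def preprocess(img, canvas_w, canvas_h):
    """Flip, rotate, then center with a border, in that (assumed) order."""
    return center_on_canvas(rotate_cw(flip_horizontal(img)), canvas_w, canvas_h)


# A 2x3 image becomes 3x2 after rotation and gets a one-pixel border here.
for row in preprocess([[1, 2, 3], [4, 5, 6]], canvas_w=4, canvas_h=5):
    print(row)
```

Any slider alignment done on the raw images would have to be mapped through the same transformations before its result could be fed to the model, which is where the difficulty described above comes from.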

How long does it take to execute the neural model on mobile?

With no GPU or other kind of acceleration (CPU only, 4 threads), on an emulator in a debug build, it takes ~200ms. The slider auto adjustment, however, takes 600-700ms. All together that's still less than a second, which is pretty good. I guess it should be slightly better once the APK is built in release mode.

K1rakishou avatar Aug 15 '22 07:08 K1rakishou