4chanCaptchaSolver

No longer functioning after changes to captcha formatting

Open Alighieri99G opened this issue 1 year ago • 15 comments

Seems like there have been some changes to the way that the captcha displays, now the solver doesn't work properly

Alighieri99G avatar Aug 06 '23 09:08 Alighieri99G

Yes, it indeed doesn't work. Anyone know any workarounds?

KurubaEX avatar Aug 11 '23 05:08 KurubaEX

Sorry, this is way outside of my wheelhouse

Alighieri99G avatar Aug 11 '23 05:08 Alighieri99G

@K1rakishou https://git.coom.tech/araragi/JKCS

aicynide avatar Aug 13 '23 23:08 aicynide

Yeah, I don't know. I tried to use the new model but I can't make it work (I probably fucked up somewhere and I have no idea where).

If you have any idea then you can take a look at this function - https://github.com/K1rakishou/4chanCaptchaSolver/blob/master/app/src/main/java/com/github/k1rakishou/chan4captchasolver/Solver.kt#L35

In the script they do some weird voodoo shit and there are no equivalent functions in Android TFLite.

  const filtered2 = tf.tensor3d(mono, [image.height, image.width, 1]);
  const prediction = model.predict(filtered2.transpose([1, 0, 2]).expandDims(0));

And also this

const greedyCTCDecode = (yPred: tf.Tensor<tf.Rank>) => tf.tidy(() => yPred.argMax(-1).arraySync());

I tried to do those conversions manually with ChatGPT's help but the results are clearly wrong. (It predicts YY for the currently hardcoded JXAPXW captcha).

Other than that I have updated the image sliding algorithm and it works. The only things blocking me right now are those two conversion functions (I think).

Maybe the code is correct but I have fucked up when converting the model from h5 format into tflite format (again, did that with the help of ChatGPT).

K1rakishou avatar Aug 15 '23 15:08 K1rakishou

Saw this on /g/, it might be helpful: https://boards.4chan.org/g/thread/95322117#p95390905

bronkeye avatar Aug 16 '23 11:08 bronkeye

In the script they do some weird voodoo shit

That's a standard HWC to WHC conversion of the input tensor, then it adds a batch dimension, so the final tensor is BWHC.
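For anyone porting this to a runtime without a transpose op, here is a rough sketch of the HWC to WHC step on a flat buffer, assuming a single channel and row-major storage. `hwcToWhc` is a made-up helper name, not something from the script or from TFLite. The batch dimension of 1 does not change the flat data, only the declared shape.

```typescript
// Hypothetical helper (not from the original script): converts a flat,
// row-major grayscale buffer laid out as [height][width][1] into a buffer
// laid out as [width][height][1].
function hwcToWhc(mono: Float32Array, height: number, width: number): Float32Array {
  if (mono.length !== height * width) {
    throw new Error("buffer size does not match height * width");
  }
  const out = new Float32Array(mono.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Source is indexed [y][x]; destination is indexed [x][y].
      out[x * height + y] = mono[y * width + x];
    }
  }
  return out;
}

// Example: 2 rows x 3 columns, stored row by row.
const sample = Float32Array.from([1, 2, 3, 4, 5, 6]);
const swapped = hwcToWhc(sample, 2, 3);
console.log(Array.from(swapped)); // [1, 4, 2, 5, 3, 6]
```

The resulting buffer can then be fed to the model as shape [1, width, height, 1].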

And also this

That's the output label decoder. You need it because CTC loss was used during model training. There's a reference implementation in Java: https://www.tensorflow.org/jvm/api_docs/java/org/tensorflow/op/nn/CtcGreedyDecoder
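A minimal sketch of what greedy CTC decoding does, without any TensorFlow dependency. Two assumptions here that are not confirmed in this thread: the blank label is the last class index, and the character set is the hypothetical `charset` argument. The quoted `greedyCTCDecode` only shows the argmax step; collapsing repeats and dropping blanks is the usual second half.

```typescript
// Greedy CTC decoding over a [timeSteps][numClasses] score matrix.
// ASSUMPTION: the blank label is the last class index (charset.length).
function greedyCtcDecode(scores: number[][], charset: string): string {
  const blank = charset.length; // assumed blank index
  let previous = -1;
  let result = "";
  for (const step of scores) {
    // Argmax over the class axis for this time step.
    let best = 0;
    for (let c = 1; c < step.length; c++) {
      if (step[c] > step[best]) best = c;
    }
    // Emit a character only when the label changes and is not blank.
    if (best !== previous && best !== blank) {
      result += charset[best];
    }
    previous = best;
  }
  return result;
}

// Per-step argmax here is A, A, blank, A, B, B, blank.
const demoScores = [
  [0.9, 0.05, 0.05],
  [0.8, 0.1, 0.1],
  [0.1, 0.1, 0.8],
  [0.7, 0.2, 0.1],
  [0.1, 0.8, 0.1],
  [0.2, 0.7, 0.1],
  [0.1, 0.1, 0.8],
];
console.log(greedyCtcDecode(demoScores, "AB")); // "AAB"
```

The blank between the two runs of A is what lets a repeated character survive the collapse.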

aicynide avatar Aug 18 '23 17:08 aicynide

You can also use this method to embed a CTC decoder in your tflite model: https://stackoverflow.com/questions/74762668

aicynide avatar Aug 18 '23 17:08 aicynide

I have already tried all of the CTC decoder implementations that I could find and none of them helped.

K1rakishou avatar Aug 18 '23 18:08 K1rakishou

voodoo shit

Maybe instead of asking ChatGPT to do the work for you, you could have just asked it how it works.

https://github.com/K1rakishou/4chanCaptchaSolver/blob/master/app/src/main/java/com/github/k1rakishou/chan4captchasolver/Solver.kt#L114

Reshaping isn't a transpose, the input image is supposed to be transposed. Have you tried visualizing what the result looks like? It's probably scrambled. A proper transpose would look like it was rotated 90 degrees then mirrored (vertically if the rotation was CCW, else horizontally).
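To make the difference concrete, here is a tiny sketch on a 2x3 "image" using plain arrays. `toRows` and `transpose` are illustrative helpers only, not code from the repo. Reshaping keeps the flat pixel order and just cuts it into rows of a different length; transposing moves each pixel from [y][x] to [x][y].

```typescript
// Cut a flat buffer into rows of the given length (this is all a reshape does).
function toRows(flat: number[], rowLength: number): number[][] {
  const rows: number[][] = [];
  for (let i = 0; i < flat.length; i += rowLength) {
    rows.push(flat.slice(i, i + rowLength));
  }
  return rows;
}

// True transpose: output[x][y] = input[y][x].
function transpose(image: number[][]): number[][] {
  const height = image.length;
  const width = image[0].length;
  const out: number[][] = [];
  for (let x = 0; x < width; x++) {
    const column: number[] = [];
    for (let y = 0; y < height; y++) {
      column.push(image[y][x]);
    }
    out.push(column);
  }
  return out;
}

const pixels = [
  [1, 2, 3],
  [4, 5, 6],
];
const flatPixels = pixels.reduce((acc, row) => acc.concat(row), [] as number[]);

console.log(toRows(flatPixels, 2)); // [[1, 2], [3, 4], [5, 6]]  (reshape: scrambled)
console.log(transpose(pixels));     // [[1, 4], [2, 5], [3, 6]]  (transpose)
```

Both results have shape 3x2, which is why a shape check alone won't catch the mistake.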

For this input HSXVW

A transpose would look like this image

But reshaping makes it look like this

image

(ignore the labels, it's just chatgpt being retarded)

coomdev avatar Aug 19 '23 21:08 coomdev

@coomdev Ohh, so it had to be rotated and mirrored just like in the previous version. I see. Yeah, this was not obvious to me at all even after reading your code, jupyter notebook and chatgpt's explanations. I just looked at the input of the model in netron and it said that it's 1x300x80 so to me it was obvious that I don't need to rotate it in any way. It works now, thanks!

K1rakishou avatar Aug 20 '23 03:08 K1rakishou

https://github.com/K1rakishou/4chanCaptchaSolver/blob/0240aefe9a60fd4b86644a28389168f3b5252bbd/app/src/main/java/com/github/k1rakishou/chan4captchasolver/Helpers.kt#L210

@K1rakishou Here, if the input image isn't 300px wide, it shouldn't be drawn in the center of the canvas but stretched to take the available space. (the model was trained that way)

coomdev avatar Aug 20 '23 08:08 coomdev

@coomdev I don't get it. If I draw foreground image stretched to 300px then it won't be aligned with the background image anymore. Do you mean that I need to stretch the resulting image after combining bg + fg? If yes, then I'm already doing it here: https://github.com/K1rakishou/4chanCaptchaSolver/blob/0240aefe9a60fd4b86644a28389168f3b5252bbd/app/src/main/java/com/github/k1rakishou/chan4captchasolver/Helpers.kt#L162

The problem right now is that sometimes after combining both images there are some big groups of black pixels left on the sides which are processed by the model and it sometimes sees characters in them. So from my understanding I need to somehow remove them. Here is an example: image

K1rakishou avatar Aug 21 '23 11:08 K1rakishou

Nevermind I figured it out. This https://github.com/K1rakishou/4chanCaptchaSolver/blob/0240aefe9a60fd4b86644a28389168f3b5252bbd/app/src/main/java/com/github/k1rakishou/chan4captchasolver/Helpers.kt#L133 canvas was using an incorrect width (300 instead of width of the smallest of the images). That's why there were garbage pixels drawn on it. Now it works.

K1rakishou avatar Aug 21 '23 11:08 K1rakishou

If yes, then I'm already doing it here

Ah, nevermind then, I misinterpreted the code.

The way we preprocessed the images was to make them display exactly as shown on 4chan's captcha: the foreground image is the canvas, the background image is not visible behind, and that is then stretched to 300.
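A rough sketch of the final stretch step on a grayscale image stored as rows of numbers. Nearest-neighbour sampling is an assumption made for brevity; the thread doesn't say which interpolation was used for training, and `stretchWidth` is a made-up name.

```typescript
// Stretch every row of a grayscale image to targetWidth columns.
// ASSUMPTION: nearest-neighbour sampling; the real pipeline may interpolate.
function stretchWidth(rows: number[][], targetWidth: number): number[][] {
  return rows.map((row) => {
    const out: number[] = [];
    for (let x = 0; x < targetWidth; x++) {
      // Map the destination column back to a source column.
      const srcX = Math.min(row.length - 1, Math.floor((x * row.length) / targetWidth));
      out.push(row[srcX]);
    }
    return out;
  });
}

// A 3-pixel-wide row stretched to 6 pixels.
console.log(stretchWidth([[1, 2, 3]], 6)); // [[1, 1, 2, 2, 3, 3]]
```

For the model discussed here the call would be `stretchWidth(combined, 300)`, where `combined` stands for the already cropped foreground-sized image.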

coomdev avatar Aug 21 '23 11:08 coomdev

Alright, hopefully this is the last set of bugfixes. Thanks for your help.

K1rakishou avatar Aug 21 '23 12:08 K1rakishou