
`guess_assess()` can’t return `false` for second return value

Open Mr0grog opened this issue 2 years ago • 1 comment

Apologies if I am missing something obvious here, but I was doing a bunch of testing with chardetng and noticed that the second return value from `EncodingDetector.guess_assess()` (the boolean indicating whether the guess was any good) never seems to be `false`. Looking at the source, it actually seems like it’s not possible:

  • The max variable that tracks the best score starts at 0: https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3003

  • It only ever gets updated with scores that are greater than the current value (so: always >= 0): https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3043-L3048

  • The final boolean is just whether max is >= 0, which it always is, per the above points: https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3062

    I think this should probably be max > 0, which would mean you’d get a false if there were no better guesses than the default encoding for the TLD (or if the only good guess is ISO-8859-8? That’s a bit odd…). I’m not familiar enough with the internals here to know for sure what the right thing would be.
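
To illustrate the three points above, here is a minimal sketch of the logic as I understand it. It is a simplified model and not the actual chardetng code: the function names and the `Option<i64>` score representation are made up for this example, and the real loop does more (TLD handling, the ISO-8859-8 case, and so on).

```rust
/// Simplified model of the candidate-selection loop (hypothetical, not
/// chardetng's real code). Each entry in `scores` is one candidate's
/// score, or `None` if that candidate was disqualified.
/// Returns the index of the best candidate (if any) and the final `max`.
fn best_score(scores: &[Option<i64>]) -> (Option<usize>, i64) {
    let mut max: i64 = 0; // the best score starts at 0
    let mut best: Option<usize> = None; // None = fall back to the default
    for (i, score) in scores.iter().enumerate() {
        if let Some(s) = *score {
            // Only strictly greater scores replace `max`, so it never
            // drops below its starting value of 0.
            if s > max {
                max = s;
                best = Some(i);
            }
        }
    }
    (best, max)
}

/// Models the current check: `max >= 0` holds for every input.
fn assess_current(scores: &[Option<i64>]) -> (Option<usize>, bool) {
    let (best, max) = best_score(scores);
    (best, max >= 0)
}

/// Models the suggested check: `max > 0` is false when no candidate
/// scored above the starting value.
fn assess_proposed(scores: &[Option<i64>]) -> (Option<usize>, bool) {
    let (best, max) = best_score(scores);
    (best, max > 0)
}
```

In this model, `assess_current(&[Some(-5), None])` gives `(None, true)` while `assess_proposed(&[Some(-5), None])` gives `(None, false)`. With a positive score, e.g. `&[Some(3), Some(7)]`, both return `(Some(1), true)`.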

It looks like max used to start with a negative value, so maybe that’s how this issue came to be? (That said, it changed in 0d26e7e8432b46ad13cefa17e9124f1e61efb91e, which was before guess_assess() existed. 🤷) https://github.com/hsivonen/chardetng/blob/f15d0f84790a6f72316c21dac9a17bf374e13b92/src/lib.rs#L1734

Again, apologies if I’ve missed something obvious here and this isn’t a real issue — I don’t have a lot of experience with Rust. But this does seem to match up with what I’ve seen so far throwing lots of different data at chardetng and never seeing a false result.

Mr0grog, Nov 03 '23 09:11

@P-E-Meunier , you added guess_assess. Does it work for you as intended?

hsivonen, Aug 05 '25 09:08