`guess_assess()` can’t return `false` for second return value
Apologies if I am missing something obvious here, but I was doing a bunch of testing with chardetng and noticed that the second return value from `EncodingDetector.guess_assess()` (the boolean indicating whether the guess was any good) never seems to be `false`. Looking at the source, it actually seems like it’s not possible:
- The `max` variable that tracks the best score starts at 0: https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3003
- It only ever gets updated with scores that are greater than the current value (so: always >= 0): https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3043-L3048
- The final boolean is just whether `max` is >= 0, which it always is, per the above points: https://github.com/hsivonen/chardetng/blob/143dadde20e283a46ef33ba960b517a3283a3d22/src/lib.rs#L3062

I think this should probably be `max > 0`, which would mean you’d get a `false` if there were no better guesses than the default encoding for the TLD (or if the only good guess is ISO-8859-8? That’s a bit odd…). I’m not familiar enough with the internals here to know for sure what the right thing would be.
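To make the three points above concrete, here is a minimal sketch of the logic as I understand it. This is not chardetng's actual code: `pick_best`, the `scores` slice, and the `strict` flag are hypothetical names made up for illustration, and I'm assuming candidate scores can be negative or missing (disqualified). It only models the "start at 0, update when strictly greater, then compare against 0" pattern.

```rust
/// Hypothetical, simplified model of the selection loop (NOT chardetng's real code).
/// `scores` stands in for per-candidate scores; `None` means the candidate
/// was disqualified. Returns (index of the winning candidate, "good guess" flag).
/// A winner of `None` models falling back to the TLD default.
fn pick_best(scores: &[Option<i64>], strict: bool) -> (Option<usize>, bool) {
    let mut max: i64 = 0; // best score so far starts at 0
    let mut best: Option<usize> = None;
    for (i, s) in scores.iter().enumerate() {
        if let Some(score) = *s {
            // Only strictly greater scores replace `max`, so `max` never drops below 0.
            if score > max {
                max = score;
                best = Some(i);
            }
        }
    }
    // `strict == false` models the current check (`max >= 0`, always true);
    // `strict == true` models the suggested check (`max > 0`).
    let good = if strict { max > 0 } else { max >= 0 };
    (best, good)
}
```

In this model, `pick_best(&[Some(-5), None, Some(-1)], false)` still reports the guess as good even though no candidate beat the starting value, while passing `true` for `strict` reports it as not good.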
It looks like max used to start with a negative value, so maybe that’s how this issue came to be? (That said, it changed in 0d26e7e8432b46ad13cefa17e9124f1e61efb91e, which was before guess_assess() existed. 🤷)
https://github.com/hsivonen/chardetng/blob/f15d0f84790a6f72316c21dac9a17bf374e13b92/src/lib.rs#L1734
Again, apologies if I’ve missed something obvious here and this isn’t a real issue — I don’t have a lot of experience with Rust. But this does seem to match up with what I’ve seen so far throwing lots of different data at chardetng and never seeing a false result.
@P-E-Meunier, you added `guess_assess`. Does it work for you as intended?