UTF-8 file is detected as Windows-1252 (western) (SBCSCodePageEncoding)

Open michaeleohou opened this issue 1 year ago • 2 comments

Input: Text.txt
Text: "ND Driver’s License DOE111111"

Output encoding: System.Text.SBCSCodePageEncoding
Expected encoding: UTF8
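For anyone trying to reproduce this, here is a rough Python sketch of why the input is ambiguous at the byte level. It assumes the file holds exactly the quoted text and that the apostrophe is U+2019 (RIGHT SINGLE QUOTATION MARK) saved as UTF-8; the report does not state either explicitly. It does not call UTF-unknown at all, it only shows that the same bytes decode without error under both encodings.

```python
# Assumption: the sample text from the report, with the apostrophe as U+2019.
text = "ND Driver\u2019s License DOE111111"

# The file contents as they would be stored in UTF-8.
utf8_bytes = text.encode("utf-8")

# U+2019 is the only non-ASCII character; in UTF-8 it is the bytes E2 80 99.
quote_bytes = "\u2019".encode("utf-8")

# Each of those three bytes is also a defined Windows-1252 character,
# so a Windows-1252 decode succeeds but produces mojibake.
as_cp1252 = utf8_bytes.decode("cp1252")

# A strict UTF-8 decode round-trips the original text.
as_utf8 = utf8_bytes.decode("utf-8", errors="strict")

print(quote_bytes.hex())
print(as_cp1252)
print(as_utf8 == text)
```

Since only three bytes in the whole input are non-ASCII, a detector has very little evidence to separate the two candidates, even though the sequence is well-formed UTF-8.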

michaeleohou avatar Jul 24 '24 04:07 michaeleohou

related https://github.com/CharsetDetector/UTF-unknown/issues/168

304NotModified avatar Aug 06 '25 22:08 304NotModified

details

Image

Tested it; Windows-1252 is indeed wrong.

304NotModified avatar Aug 08 '25 22:08 304NotModified