jschardet

Character encoding auto-detection in JavaScript (a port of Python's chardet)

Results 40 jschardet issues

Detection does not work with Romanian subtitle files. OpenSubtitles detects these files as "cp1250", but `jschardet` detects the encoding as "windows-1252". Wrong characters: ã þ º. Correct Romanian special characters: ă...
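To illustrate why the wrong code page produces exactly those characters, here is a small sketch that decodes the same three bytes under both encodings named in the report. The byte values are taken from the published windows-1250 and windows-1252 tables, not from the issue itself, and the sketch assumes a Node.js build with full ICU (the default in current official releases) so that `TextDecoder` knows both labels.

```javascript
// The same three bytes, decoded under each of the two code pages.
const bytes = Uint8Array.from([0xe3, 0xfe, 0xba]);

// windows-1252 (what jschardet reportedly guessed)
const asCp1252 = new TextDecoder("windows-1252").decode(bytes);

// windows-1250, also known as cp1250 (what OpenSubtitles reports)
const asCp1250 = new TextDecoder("windows-1250").decode(bytes);

console.log(asCp1252); // ãþº
console.log(asCp1250); // ăţş
```

Because the two code pages differ in only a handful of byte positions, a statistical detector has little evidence to separate them on short inputs.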

The following file is detected as EUC-JP even though it is not. This seems to be caused by a single `ü` inside the file. File: [QuietLight.tmTheme.txt](https://github.com/aadsm/jschardet/files/879178/QuietLight.tmTheme.txt)

Does it have to read the whole buffer? https://github.com/aadsm/jschardet/blob/master/src/init.js#L72-L85 If it has to read the whole buffer, can it return a string as a result? I want to check if...

Detecting the encoding of this RSS feed, http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss, fails with: ``` { encoding: null, confidence: 0 } ```

It would be great if encoding detection could accept a stream as input, trying to detect the encoding block by block and returning a result as soon as a minimum confidence level...
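The request above can be approximated in user code today. The following is a minimal sketch, not a jschardet feature: `detectIncrementally` is a hypothetical helper, and `detect` is any function with the same shape as `jschardet.detect`, taking a `Buffer` and returning `{ encoding, confidence }`. The 0.9 default threshold is an arbitrary assumption.

```javascript
// Hypothetical helper (not part of jschardet): run a detector over growing
// prefixes of the input and stop once the confidence is high enough.
// `chunks` is any iterable of Buffers, e.g. blocks read from a file.
function detectIncrementally(chunks, detect, minConfidence = 0.9) {
  const seen = [];
  let bytesRead = 0;
  let result = { encoding: null, confidence: 0 };

  for (const chunk of chunks) {
    seen.push(chunk);
    bytesRead += chunk.length;

    // Re-run detection on everything read so far.
    result = detect(Buffer.concat(seen));

    if (result.encoding !== null && result.confidence >= minConfidence) {
      break; // confident enough, skip the remaining chunks
    }
  }

  // Report how much input was consumed alongside the detector's answer.
  return { ...result, bytesRead };
}
```

With jschardet installed, the detector argument could be `(buf) => jschardet.detect(buf)`. Note that this sketch re-scans the whole prefix on every chunk, so it trades extra CPU work for reading less input. A true streaming implementation would keep the prober state between blocks.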

Hi, what does it take to create the statistical model needed to support the win-1256 code page? Thanks

`const fs = require("fs");` `const jschardet = require("jschardet");` `const buffer = fs.readFileSync(path);` `const encodingResult = jschardet.detect(buffer);` `console.log(encodingResult.encoding);`

The string `~{` in a UTF-8/ASCII document causes detection to fail with `{ encoding: null, confidence: 0 }`.
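A possible explanation, not confirmed in the issue: `~{` is the sequence that switches into GB2312 mode in the HZ-GB-2312 encoding (RFC 1843). A chardet-style detector may therefore treat the input as escape-sequence encoded and report nothing when no such encoding matches. The sketch below is a caller-side workaround under that assumption. `detectWithAsciiFallback` is a hypothetical helper, `detect` stands in for `jschardet.detect`, and the fallback confidence of 1 is an arbitrary choice.

```javascript
// Hypothetical wrapper (not part of jschardet): if the detector gives up but
// every byte is 7-bit, report the input as ASCII instead of null.
function detectWithAsciiFallback(buffer, detect) {
  const result = detect(buffer);
  if (result.encoding !== null) {
    return result; // the detector produced an answer, keep it
  }

  // Only fall back when the input contains no bytes >= 0x80.
  const isSevenBit = buffer.every((byte) => byte < 0x80);
  return isSevenBit ? { encoding: "ascii", confidence: 1 } : result;
}
```

This only helps for pure 7-bit input. A UTF-8 document that contains both `~{` and non-ASCII bytes would still come back as `null` and needs a fix in the detector itself.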

Got an undefined variable error for `denormalizedEncodings`

Is this related to the webpack configuration? How should it be configured to support this? ![image](https://github.com/aadsm/jschardet/assets/46514506/5349ac53-2900-450e-96e2-b6a20e31fa5c)