Rustam comments

Results: 103 comments of Rustam

notepad-plus-plus revert "Update uchardet to 0.0.6 ..."

> Reverting the uchardet changes (#52) sounds like a bad idea anyway.

I didn’t mean to revert them, but to try to improve things based on the knowledge gained :)...

notepad-plus-plus revert "Update uchardet to 0.0.6 ..."

Maybe this resolves issue #76 (see https://github.com/alberto-dev/notepad-plus-plus/commit/a504ebba54c41309f42006f8d82ecea435085731#diff-18d581d96114cd69e207975bf1c4fa43L249)

notepad-plus-plus revert "Update uchardet to 0.0.6 ..."

Take a look: the new encoding detections were removed (https://github.com/notepad-plus-plus/notepad-plus-plus/pull/5414/commits/9a39fafd335f2e1e5af4b5a3251c7cd961ee5fe9#diff-7c6715d4fafa723d6682f3b295c32875L82). This made it possible to rule out the cases where identical metrics arise (https://github.com/CharsetDetector/UTF-unknown/issues/77#issuecomment-573397518).

Add detection for encoding 'x-mac-romanian'

Before adding it, we need to make sure that it will not cause problems; see https://github.com/CharsetDetector/UTF-unknown/issues/77#issuecomment-573397518

Add Detectors and Probers for target languages

Hello! I created PR #63 for ease of understanding. To detect the encoding, prober objects are created. They are defined for multiple languages. With a small sample...

Add Detectors and Probers for target languages

It seems to me that we should first try to separate out the single-byte probers by language, as with the models.

Add Detectors and Probers for target languages

Hello, @304NotModified! Can we make breaking changes and mark everything in `src/Core` as `internal`? This would make it easier to change the code.

Add Detectors and Probers for target languages

I think it would be nice if we could just change the source in `src/core` without thinking about breaking changes. That is, change the modifier from `public` to `internal`. I...

Instead of the encoding 'iso-8859-1' is detected 'iso-8859-15'

In the _Status Log_, the following metrics are the same:

> SBCS 0.8360017: [iso-8859-15]
> SBCS: 0.8360017 [iso-8859-15]
>
> SBCS 0.8360017: [iso-8859-1]
> SBCS: 0.8360017 [iso-8859-1]
>
> SBCS...

Instead of the encoding 'iso-8859-1' is detected 'iso-8859-15'

As I understand it, identical statistics are more likely in this case because the two character sets are so similar: https://en.wikipedia.org/wiki/ISO-8859-1#Similar_character_sets
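
To illustrate how close the two character sets are, here is a minimal Python sketch (not code from UTF-unknown or uchardet) that lists the byte values which decode differently under `iso-8859-1` and `iso-8859-15`. The helper name `differing_bytes` is hypothetical and exists only for this example; it relies on Python's built-in codecs for both encodings.

```python
def differing_bytes(enc_a: str, enc_b: str) -> list[int]:
    """Return the byte values (0-255) that decode to different characters."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        if raw.decode(enc_a) != raw.decode(enc_b):
            diffs.append(value)
    return diffs


diffs = differing_bytes("iso-8859-1", "iso-8859-15")

# Print each differing position with the character it maps to in each encoding.
for value in diffs:
    raw = bytes([value])
    print(f"0x{value:02X}: {raw.decode('iso-8859-1')!r} vs {raw.decode('iso-8859-15')!r}")
```

Only a handful of positions differ, so a sample that contains none of those bytes has exactly the same byte content under either encoding, and a purely statistical prober could plausibly score both the same.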