Add Detectors and Probers for target languages

Open rstm-sf opened this issue 6 years ago • 7 comments

Hello!

Would it be worth adding the ability to detect the encoding when the target language is already known?

rstm-sf avatar Feb 23 '19 22:02 rstm-sf

Hi,

Sorry for the late response.

What do you mean by this?

304NotModified avatar Aug 24 '19 12:08 304NotModified

Hello!

I created PR #63 to make this easier to understand.

To detect the encoding, prober objects are created, and they are defined for multiple languages. With only a small sample of characters, conflicts can arise between encodings, because the same byte value can be a valid character code in several languages.

But what if we need to detect an encoding, knowing that the text can only be in one language? Then we can restrict detection to the probers for that language, which reduces the likelihood of an incorrect detection.
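To illustrate the idea, here is a minimal toy sketch in Python. It is not the UTF-unknown API and not the code from the PR: `ToyProber`, `PROBERS`, and `best_candidates` are hypothetical names, and the scoring (the share of decoded characters that fall into the expected Unicode script block) is a deliberately naive stand-in for a real prober model. It only shows how a short sample can tie between two languages, and how a language hint removes the tie.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ToyProber:
    """Hypothetical single-byte prober: one language, one encoding."""
    language: str
    encoding: str
    first: int  # first code point of the expected Unicode script block
    last: int   # last code point of that block

    def score(self, data: bytes) -> float:
        """Share of decoded characters that fall into the expected block."""
        try:
            text = data.decode(self.encoding)
        except UnicodeDecodeError:
            return 0.0
        if not text:
            return 0.0
        hits = sum(self.first <= ord(ch) <= self.last for ch in text)
        return hits / len(text)


# Assumed registry, only two entries to keep the example small.
PROBERS = [
    ToyProber("russian", "cp1251", 0x0400, 0x04FF),  # Cyrillic block
    ToyProber("greek", "cp1253", 0x0370, 0x03FF),    # Greek block
]


def best_candidates(data: bytes, language: Optional[str] = None) -> List[str]:
    """Return the top-scoring encodings, optionally limited to one language."""
    probers = [p for p in PROBERS if language is None or p.language == language]
    scores = {p.encoding: p.score(data) for p in probers}
    top = max(scores.values(), default=0.0)
    return [enc for enc, s in scores.items() if s == top and s > 0.0]


# "привет" in windows-1251 is also six valid Greek letters in windows-1253.
sample = "привет".encode("cp1251")
print(best_candidates(sample))                      # ['cp1251', 'cp1253']
print(best_candidates(sample, language="russian"))  # ['cp1251']
```

Without a hint, both probers score the six-byte sample equally, so the result is ambiguous. With `language="russian"` only the Russian prober runs, so the conflict cannot occur.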

P.S. Sorry for my English.

rstm-sf avatar Aug 31 '19 19:08 rstm-sf

Sounds good, but I'm not sure how easy it is to change that in this code base.

304NotModified avatar Sep 21 '19 22:09 304NotModified

It seems to me that we should first try to separate the single-byte probers by language, as models.

rstm-sf avatar Nov 09 '19 11:11 rstm-sf

Hello, @304NotModified !

Can we make breaking changes and mark everything in src/Core as internal? This would make it easier to change the code.

rstm-sf avatar Feb 24 '20 07:02 rstm-sf

Do you mean whether making breaking changes in src/core is OK? I think it is. We should also make them internal.

304NotModified avatar Feb 24 '20 16:02 304NotModified

I think it would be nice if we could just change the source in src/core without thinking about breaking changes. That is, change the modifier from public to internal.

My idea is to separate the probers, as models, by language (however, this will take a lot of time, since there are about 100 of them). It would then also be nice to change the namespace.

rstm-sf avatar Feb 24 '20 17:02 rstm-sf