Improve support for non-UTF encodings

Open funkrusher opened this issue 6 years ago • 2 comments

I tried to import a TXT file with "ANSI" encoding, and the importer does not import it correctly.

I have attached an example file that can be used for testing. It comes in two versions:

  1. before-import.txt
  2. after-import.txt

The importer changes the encoding from "ANSI" to UTF-8 but does not convert the special characters along the way. By comparison, Windows Notepad can convert this ANSI file to UTF-8 without losing any characters.
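
To illustrate the symptom, here is a minimal sketch. It assumes that "ANSI" means Windows-1252 (the usual case on Western European Windows systems) and uses made-up sample bytes rather than the contents of the attached files. It relies only on the standard `TextDecoder` API, not on the importer's own code.

```typescript
// Assumption: the "ANSI" file is Windows-1252. The bytes are illustrative only.
const ansiBytes = new Uint8Array([0x47, 0x72, 0xfc, 0xdf, 0x65]); // "Grüße" in Windows-1252

// Reading the bytes as if they were UTF-8: 0xFC and 0xDF are not valid
// UTF-8 sequences here, so each one becomes U+FFFD (the replacement character).
const misread: string = new TextDecoder("utf-8").decode(ansiBytes);

// Decoding with the actual source encoding preserves the characters.
const decoded: string = new TextDecoder("windows-1252").decode(ansiBytes);

console.log(misread);
console.log(decoded);
```

Once the text has been decoded with the right source encoding, writing it back out as UTF-8 keeps the special characters intact.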

before-import.txt after-import.txt

funkrusher, Dec 09 '19 20:12

I think eventually we should use an encoding detector (maybe this) and then an encoding converter (maybe this, but it's a native dependency, and using it in the browser may be a problem).

fabiospampinato, Jan 15 '20 20:01

For future reference:

  • jschardet (detector): https://github.com/aadsm/jschardet
  • iconv-lite (encoder/decoder): https://www.npmjs.com/package/iconv-lite
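
The detect-then-convert idea can be sketched without any dependency, as a rough stand-in for the two libraries above. This is not how jschardet or iconv-lite work internally: the "detection" is only a strict UTF-8 validity check, and the fallback encoding is an assumption (Windows-1252). `decodeImportedText` is a hypothetical helper name, not something that exists in the importer.

```typescript
// Hypothetical helper: decode file bytes to a string, trying UTF-8 first.
// Assumption: anything that is not valid UTF-8 is in the fallback encoding.
function decodeImportedText(
  bytes: Uint8Array,
  fallbackEncoding: string = "windows-1252",
): string {
  try {
    // fatal: true makes the decoder throw on byte sequences that are
    // not valid UTF-8 instead of silently inserting U+FFFD.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    // Not valid UTF-8: decode with the assumed legacy encoding instead.
    return new TextDecoder(fallbackEncoding).decode(bytes);
  }
}

// Usage with illustrative data: the same text in both encodings.
const utf8Input = new TextEncoder().encode("Grüße");
const ansiInput = new Uint8Array([0x47, 0x72, 0xfc, 0xdf, 0x65]);

console.log(decodeImportedText(utf8Input));
console.log(decodeImportedText(ansiInput));
```

A real detector would still be needed to tell legacy encodings apart from each other, since this check can only distinguish "valid UTF-8" from "something else".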

fabiospampinato, Jan 30 '20 19:01