Improve support for non-UTF encodings
I tried to import a TXT file with "ANSI" encoding, and the importer does not read the file correctly.
I have an example file that can be used for testing. It comes in two versions:
- before-import.txt
- after-import.txt
The importer changes the encoding from "ANSI" to UTF-8 but does not convert the special characters along the way. For comparison, Windows Notepad can convert this ANSI file to UTF-8 without losing any characters.
I think eventually we should use an encoding detector (maybe this) and then an encoding converter (maybe this, but it's a native dependency, and using it in the browser may be a problem).
For future reference:
- jschardet (detector): https://github.com/aadsm/jschardet
- iconv-lite (encoder/decoder): https://www.npmjs.com/package/iconv-lite
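
As a rough illustration of the detect-then-convert idea, here is a minimal sketch that avoids both libraries and relies only on the platform's built-in `TextDecoder` (available in browsers and in Node.js builds with full ICU). It is not the importer's actual code: `decodeText` is a hypothetical helper, and it assumes that any input that is not valid UTF-8 is Windows-1252, which is what "ANSI" usually means on Western European Windows systems. A real detector such as jschardet would replace that guess.

```javascript
// Hypothetical helper: turn raw file bytes into a JavaScript string.
// Assumption: input that is not valid UTF-8 is Windows-1252 ("ANSI").
function decodeText(bytes) {
  try {
    // fatal: true makes the decoder throw on invalid UTF-8 instead of
    // silently inserting U+FFFD replacement characters.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    // Fallback guess; an encoding detector would go here instead.
    return new TextDecoder("windows-1252").decode(bytes);
  }
}

// "äöü" encoded as Windows-1252 (one byte per character).
const ansiBytes = new Uint8Array([0xe4, 0xf6, 0xfc]);
// The same text encoded as UTF-8 (two bytes per character).
const utf8Bytes = new Uint8Array([0xc3, 0xa4, 0xc3, 0xb6, 0xc3, 0xbc]);

console.log(decodeText(ansiBytes));
console.log(decodeText(utf8Bytes));
```

The strict UTF-8 pass comes first because valid UTF-8 is rarely produced by accident from single-byte text, so a successful strict decode is a fairly strong signal. The fallback is only a heuristic and will mis-decode files in other legacy encodings.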