UTF8 issue on Windows 10

Open j2l opened this issue 7 years ago • 2 comments

Full error is: 'charmap' codec can't encode character '\u0107' in position 86: character maps to <undefined>

j2l avatar Mar 21 '18 12:03 j2l

Also tested with test_utf8.csv file:

> csvcut -c baz .\test_utf8.csv > few.csv
'charmap' codec can't encode character '\u02a4' in position 0: character maps to <undefined>
> csvcut -e utf-8-sig -c baz .\test_utf8.csv > few.csv
'charmap' codec can't encode character '\u02a4' in position 0: character maps to <undefined>
> csvcut -e utf-8 -c baz .\test_utf8.csv > few.csv
'charmap' codec can't encode character '\u02a4' in position 0: character maps to <undefined>
> csvcut -e utf8 -c baz .\test_utf8.csv > few.csv
'charmap' codec can't encode character '\u02a4' in position 0: character maps to <undefined>

j2l avatar Mar 22 '18 11:03 j2l
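
A note on the messages above: "can't encode" points at the output side, whereas csvkit's -e option sets the input encoding, which would explain why changing it makes no difference. The sketch below is an assumption about the cause, not a confirmed diagnosis of this issue: it supposes that the redirected stdout uses a Windows code page such as cp1252. The helper name write_text is hypothetical and only simulates such a stream in memory.

```python
import io


def write_text(text, encoding):
    """Encode text the way a redirected text stream would; return the bytes written."""
    buffer = io.BytesIO()
    # newline="" disables newline translation so the bytes are the same on every OS.
    stream = io.TextIOWrapper(buffer, encoding=encoding, newline="")
    stream.write(text)
    stream.flush()
    return buffer.getvalue()


sample = "\u02a4\n"  # the character named in the error messages above

# Assumed situation: stdout encodes with a legacy code page (cp1252 here).
try:
    write_text(sample, "cp1252")
except UnicodeEncodeError as error:
    print("cp1252 failed:", error)

# The same text encodes without error when the stream uses UTF-8.
print("utf-8 bytes:", write_text(sample, "utf-8"))
```

If this is indeed the cause, setting Python's PYTHONIOENCODING environment variable to utf-8 before running csvcut may be worth trying, since it overrides the encoding Python uses for stdout.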

test_utf8.csv, like my file, is missing a BOM, so CMD or PowerShell interprets it as ASCII. I thought that opening the file in a hex editor and pasting the BOM at the beginning would do the trick, but no, it is still interpreted as ASCII. May be related to https://github.com/ipython/ipython/issues/10011

j2l avatar Mar 22 '18 12:03 j2l

Closing as some Windows issues have been fixed in csvkit and in newer Python versions, and the Python and csvkit versions are not reported in the issue, so there is no reproducible example.

jpmckinney avatar Oct 17 '23 19:10 jpmckinney