getting started error

Open darenr opened this issue 6 years ago • 5 comments

The Getting Started section of the README says csvtohtml; I think you meant csvtotable.

On a separate issue: with the first CSV I tried it on, I get a char encoding error. The CSV is not public so I can't attach it, but I'm sure you can find some that contain utf8 chars.

Traceback (most recent call last):
  File "/usr/local/bin/csvtotable", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/cli.py", line 30, in cli
    delimiter=delimiter, quotechar=quotechar)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/convert.py", line 48, in convert
    for row in reader:
  File "/usr/local/lib/python2.7/dist-packages/backports/csv.py", line 394, in __next__
    lineobj = next(self.input_iter)
  File "/usr/lib/python2.7/codecs.py", line 314, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf4 in position 818: invalid continuation byte

darenr, Jul 06 '17 01:07

Thanks for reporting the issue. Can you please give more details, like which Python version and OS you are using? I have tested CSV data with utf8 chars on Python 2.7.10, 3.4 and 3.5, and it seems fine.

vividvilla, Jul 06 '17 03:07

Python 2.7.13 on Ubuntu 16.10. I will try to extract a few lines of the CSV that reproduce the problem. In my experience character encoding issues are 49% of all Python bugs, 49% are off-by-one and the remaining 2% are all the others :)

darenr, Jul 06 '17 06:07

Indeed :) character encoding issues are a pain in the ass. I have replaced the backports.csv dependency with unicodecsv, which handles UTF-8 encoded CSV data better. Try upgrading the package to v1.1.0 and check if the issue is fixed.

vividvilla, Jul 06 '17 18:07

Successfully installed csvtotable-1.1.0

Unfortunately:

drace@drace:~/Desktop$ csvtotable Worker_Salary_0224.csv worker.html
File (worker.html) already exists. Do you want to overwrite? (y/n): 
Traceback (most recent call last):
  File "/usr/local/bin/csvtotable", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/cli.py", line 30, in cli
    delimiter=delimiter, quotechar=quotechar)
  File "/usr/local/lib/python2.7/dist-packages/csvtotable/convert.py", line 54, in convert
    for row in reader:
  File "/usr/local/lib/python2.7/dist-packages/unicodecsv/py2.py", line 128, in next
    for value in row]
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf4 in position 6: invalid continuation byte

The reason might be this:

drace@drace:~/Desktop$ file Worker_Salary_0224.csv 
Worker_Salary_0224.csv: ISO-8859 text, with very long lines, with CRLF line terminators

The file command reports that the file is in an ISO-8859 encoding (not UTF-8).

I can anonymize the data if that's useful, but you should be able to take a CSV file you have and use iconv to convert its encoding.

darenr, Jul 06 '17 18:07
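
For anyone who wants to reproduce this without the original file, here is a minimal sketch. It is written for Python 3, while the tracebacks above come from Python 2.7, so the exact error text may differ slightly. The sample rows and the helper name make_latin1_csv are made up for illustration. The only detail taken from the report is byte 0xf4, which is "ô" in ISO-8859-1 and is not valid UTF-8 when an ASCII letter follows it.

```python
import csv
import io


def make_latin1_csv():
    """Return CSV bytes encoded as ISO-8859-1 (hypothetical sample data)."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # default dialect writes CRLF line terminators
    writer.writerow(["name", "salary"])
    writer.writerow(["Côme", "1000"])  # "ô" becomes the single byte 0xf4
    return buf.getvalue().encode("iso-8859-1")


data = make_latin1_csv()

# Decoding the bytes as UTF-8 fails, much like in the tracebacks above.
try:
    data.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print(exc)

# Decoding with an ISO-8859 encoding, as reported by `file`, works.
text = data.decode("iso-8859-1")
rows = list(csv.reader(io.StringIO(text, newline="")))
print(rows)
```

Writing `data` to disk in binary mode should give a file that triggers the same kind of UnicodeDecodeError in a reader that assumes UTF-8.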

try this:

pyexcel transcode --csv-source-encoding iso-8859-1 Worker_Salary_0224.csv Worker_Salary.sortable.html

and you will need:

$ pip install pyexcel pyexcel-cli pyexcel-sortable

And pyexcel-sortable wraps csvtotable.

chfw, Jul 14 '17 19:07
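
If installing extra packages is not an option, another route is to re-encode the file to UTF-8 first and then run csvtotable on the result. The sketch below uses only the Python 3 standard library. The function name transcode_to_utf8 and the demo file names are hypothetical, and iso-8859-1 is an assumption: `file` only reported "ISO-8859 text", so the real source encoding could be another ISO-8859 variant or Windows-1252.

```python
import os
import tempfile


def transcode_to_utf8(src_path, dst_path, source_encoding="iso-8859-1"):
    """Rewrite src_path as UTF-8, assuming it is stored in source_encoding."""
    # newline="" leaves the original line terminators (CRLF here) untouched.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Small demo on a throwaway file (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "latin1.csv")
    dst = os.path.join(tmp, "utf8.csv")
    with open(src, "wb") as f:
        f.write(b"name,salary\r\nC\xf4me,1000\r\n")
    transcode_to_utf8(src, dst)
    with open(dst, "rb") as f:
        converted = f.read()
    print(converted.decode("utf-8"))
```

Note that ISO-8859-1 maps every byte to some character, so this never raises a decode error even when the guess is wrong. It is worth eyeballing a few accented values in the output before converting it to HTML.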