specify encoding explicitly in case it isn't utf-8 (e.g. on windows)

Open blushingpenguin opened this issue 4 years ago • 7 comments

on windows this currently fails with a trace like:

  File "c:\utils\Python\Python37\lib\site-packages\reverse_geocode\__init__.py", line 121, in search
    gd = GeocodeData()
  File "c:\utils\Python\Python37\lib\site-packages\reverse_geocode\__init__.py", line 25, in getinstance
    instances[cls] = cls()
  File "c:\utils\Python\Python37\lib\site-packages\reverse_geocode\__init__.py", line 34, in __init__
    coordinates, self.locations = self.extract(rel_path(geocode_filename))
  File "c:\utils\Python\Python37\lib\site-packages\reverse_geocode\__init__.py", line 100, in extract
    for latitude, longitude, country_code, city in rows:
  File "c:\utils\Python\Python37\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 464: character maps to <undefined>

because the default text encoding on Windows is cp1252 rather than utf-8 (open() falls back to the locale's preferred encoding when none is given), and geocode.csv is in utf-8. Specifying the encoding works around this (and obscure cases on linux where somebody has set LC_CTYPE to something other than utf-8).
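
The failure can be reproduced on any platform with a minimal sketch like the one below. The sample row and the `read_rows` helper are made up for illustration and are not the package's code. The only assumption is that the data contains a UTF-8 character whose bytes include 0x81, such as "Á" (0xC3 0x81), which cp1252 leaves undefined.

```python
import csv
import os
import tempfile


def read_rows(path, encoding=None):
    """Read all CSV rows from path, decoding the file with the given encoding."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# Hypothetical sample row; "\u00c1" is "Á", encoded in UTF-8 as 0xC3 0x81.
sample = "40.65,-4.70,ES,\u00c1vila\n"

fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))

try:
    # An explicit encoding decodes the file the same way on every platform.
    rows_utf8 = read_rows(path, encoding="utf-8")

    # Forcing cp1252 mimics what happens when it is the platform default.
    try:
        read_rows(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True
finally:
    os.remove(path)
```

Passing `encoding="utf-8"` makes the result independent of the machine's locale, which is the point of the proposed change.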

blushingpenguin avatar Jan 28 '21 16:01 blushingpenguin

encoding is not a supported argument of the built-in open() in python2 and I want to maintain backwards compatibility for now - are you able to provide a version that works for both?

richardpenman avatar Feb 08 '21 02:02 richardpenman

from io import open at the top of the file would allow this package to still be backwards compatible with python2

Source: https://stackoverflow.com/questions/491921/unicode-utf-8-reading-and-writing-to-files-in-python
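
A minimal sketch of that suggestion, assuming nothing about the package itself (the temporary file and its contents are illustrative only): on Python 3 `io.open` is the same function as the built-in `open`, and on Python 2 it is the variant that accepts `encoding`.

```python
from io import open  # Python 3: the builtin itself; Python 2: adds encoding=

import os
import tempfile

# Create an empty temporary file to write to and read back.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# The same call works on both major versions once open comes from io.
with open(path, "w", encoding="utf-8") as f:
    f.write(u"\u00c1vila\n")

with open(path, "r", encoding="utf-8") as f:
    text = f.read()

os.remove(path)
```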

Any chance for this pr getting merged @richardpenman?

/cc @blushingpenguin

xnetcat avatar Jun 26 '21 22:06 xnetcat

good find @xnetcat - I tested adding this import but found that the csv module did not play well with unicode. Do you have a branch that got this working?
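
One possible way to reconcile the two versions is sketched below. `read_utf8_csv` is a hypothetical helper, not the package's code, and the Python 2 branch is an untested assumption based on the fact that Python 2's csv module expects byte strings while Python 3's expects text.

```python
import csv
import io
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2


def read_utf8_csv(path):
    """Return the rows of a UTF-8 CSV file as lists of text strings."""
    if PY2:
        # Python 2's csv module works on byte strings, so read bytes
        # and decode each cell afterwards.
        with io.open(path, "rb") as f:
            return [[cell.decode("utf-8") for cell in row]
                    for row in csv.reader(f)]
    # Python 3's csv module works on text, so decode while reading.
    with io.open(path, "r", encoding="utf-8", newline="") as f:
        return list(csv.reader(f))


# Illustrative usage with a made-up row written as UTF-8 bytes.
fd, demo_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(u"40.65,-4.70,ES,\u00c1vila\n".encode("utf-8"))

demo_rows = read_utf8_csv(demo_path)
os.remove(demo_path)
```

Branching on the interpreter version keeps the csv module fed with the string type it expects on each side, at the cost of two code paths to maintain.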

richardpenman avatar Jun 28 '21 22:06 richardpenman

Nope, I've used blushingpenguin's branch and it worked just fine - python3.8 windows10

xnetcat avatar Jun 28 '21 23:06 xnetcat

Was hitting the same error on Ubuntu with python 3.6.9. Setting the encoding to be explicit fixed the issue.
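
To check whether a given machine is affected, a quick diagnostic along these lines may help. It only reports the encodings Python would use by default; the variable names are illustrative.

```python
import locale
import sys

# Encoding that open() falls back to in text mode when no encoding= is passed.
default_text_encoding = locale.getpreferredencoding(False)

# Encoding used for file names, which is a separate setting.
filesystem_encoding = sys.getfilesystemencoding()

print("open() default:", default_text_encoding)
print("filesystem:", filesystem_encoding)
```

If the first value is anything other than a UTF-8 variant, reading a UTF-8 file without an explicit encoding can fail on non-ASCII data.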

Given that almost half of the open issues are due to the same error, it seems like this issue affects more people than dropping python 2 compatibility would.

gergelycsegzi avatar Feb 08 '22 15:02 gergelycsegzi

I'll merge a PR if it maintains support for Python 2

richardpenman avatar Feb 09 '22 01:02 richardpenman

I'm facing this problem as well in windows CI: https://github.com/pypsa-meets-africa/pypsa-africa/runs/6512487126?check_suite_focus=true

  File "C:\Miniconda3\envs\pypsa-africa\lib\site-packages\reverse_geocode\__init__.py", line 121, in search
    gd = GeocodeData()
  File "C:\Miniconda3\envs\pypsa-africa\lib\site-packages\reverse_geocode\__init__.py", line 25, in getinstance
    instances[cls] = cls()
  File "C:\Miniconda3\envs\pypsa-africa\lib\site-packages\reverse_geocode\__init__.py", line 34, in __init__
    coordinates, self.locations = self.extract(rel_path(geocode_filename))
  File "C:\Miniconda3\envs\pypsa-africa\lib\site-packages\reverse_geocode\__init__.py", line 100, in extract
    for latitude, longitude, country_code, city in rows:
  File "C:\Miniconda3\envs\pypsa-africa\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 464: character maps to <undefined>

davide-f avatar May 20 '22 13:05 davide-f

Have added UTF8 encoding

richardpenman avatar Jun 06 '24 22:06 richardpenman