UnicodeDecodeError with Python 3 on Windows

Open shinstar123 opened this issue 8 years ago • 21 comments

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 165: character maps to <undefined>

shinstar123 avatar Jan 15 '17 03:01 shinstar123

What was the text fragment that triggered the error?

elyase avatar Jan 15 '17 19:01 elyase

I just ran this example:

from geotext import GeoText

places = GeoText("London is a great city")
places.cities

GeoText('New York, Texas, and also China').country_mentions

and that error came up.

shinstar123 avatar Jan 18 '17 14:01 shinstar123

Unfortunately I can't reproduce the issue. Can you try installing in a fresh environment?

elyase avatar Jan 24 '17 17:01 elyase

My fix for someone else who had this problem. Run with error: http://pastebin.com/d0N7Q9cZ

Fix, change:

with open(filename, 'r') as f:

To:

with open(filename, 'r', encoding='utf-8') as f:

Test:

(geo_test) C:\Python36\geo_test
λ python
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 07:18:10) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from geotext import GeoText
>>> places = GeoText("London is a great city")
>>> places.cities
['London']
 
>>> GeoText('New York, Texas, and also China').country_mentions
OrderedDict([('US', 2), ('CN', 1)])
 
>>> places = GeoText("Oslo is a great city")
>>> places.cities
['Oslo']
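
For anyone who wants to see why adding encoding='utf-8' helps, here is a minimal sketch that is independent of geotext. It assumes the data file is UTF-8 and that the Windows default encoding is cp1252, as the tracebacks in this thread suggest. The helper read_lines and the sample text are made up for illustration; cp1252 is passed explicitly so the failure reproduces on any platform.

```python
import os
import tempfile


def read_lines(path, encoding=None):
    """Read all lines from path; encoding=None means the platform default."""
    with open(path, 'r', encoding=encoding) as f:
        return f.read().splitlines()


# Hypothetical sample row: U+00C1 is stored in UTF-8 as the bytes 0xC3 0x81,
# and 0x81 has no mapping in cp1252.
sample = "\u00c1guas Lindas\tBR\n"

fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))

try:
    # Simulates open(filename, 'r') on a Windows machine defaulting to cp1252
    read_lines(path, encoding="cp1252")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print("cp1252 failed:", exc)

# The fix: state the file's real encoding explicitly
lines = read_lines(path, encoding="utf-8")
print(lines)

os.remove(path)
```

The same file reads cleanly once the encoding is stated, which matches the behaviour reported above.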

snippsat avatar Mar 21 '17 22:03 snippsat

Error still seen when installing and running on Python 3.4

import geotext
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2017.2.4\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Python34\lib\site-packages\geotext\__init__.py", line 7, in <module>
    from .geotext import GeoText
  File "C:\Program Files\JetBrains\PyCharm 2017.2.4\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Python34\lib\site-packages\geotext\geotext.py", line 87, in <module>
    class GeoText(object):
  File "C:\Python34\lib\site-packages\geotext\geotext.py", line 103, in GeoText
    index = build_index()
  File "C:\Python34\lib\site-packages\geotext\geotext.py", line 77, in build_index
    cities = read_table(get_data_path('cities15000.txt'), usecols=[1, 8])
  File "C:\Python34\lib\site-packages\geotext\geotext.py", line 54, in read_table
    for line in lines:
  File "C:\Python34\lib\site-packages\geotext\geotext.py", line 51, in <genexpr>
    lines = (line for line in f if not line.startswith(comment))
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 165: character maps to <undefined>

iShekhar avatar Nov 27 '17 14:11 iShekhar

I'm getting the same error just trying to import geotext, Python 3.6 on Windows 10 with Anaconda. Specifically this error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 165: character maps to <undefined> when the IncrementalDecoder tries to open the cities csv.
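
A quick way to check which encoding Python picks when open() is called without an encoding argument is sketched below. It uses only the standard library. The variable names are illustrative, and the output depends on the machine, so none is shown here.

```python
import codecs
import locale

# The encoding text-mode open() falls back to when encoding= is omitted
default_encoding = locale.getpreferredencoding(False)

# Normalise aliases (for example "UTF-8" vs "utf8") to Python's canonical codec name
canonical = codecs.lookup(default_encoding).name

# If this is False, a UTF-8 data file may fail to decode unless encoding is given
is_utf8 = canonical == "utf-8"

print(default_encoding, canonical, is_utf8)
```

If this does not report UTF-8, that would be consistent with the cp1252.py frame in the tracebacks posted in this thread.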

dovinmu avatar Apr 19 '18 17:04 dovinmu

Fixed the problem by using Linux instead of Windows.

dovinmu avatar Apr 19 '18 17:04 dovinmu

The question is to solve the issue on WINDOWS!!

iShekhar avatar Apr 20 '18 08:04 iShekhar

I may have been being slightly snarky.

dovinmu avatar Apr 20 '18 16:04 dovinmu

Can someone on Windows try again on master:

pip install https://github.com/elyase/geotext/archive/master.zip

?

elyase avatar Jun 13 '18 20:06 elyase

Tried it, but it didn't solve the problem. snippsat's solution worked for me.

Ala1s avatar Jul 08 '18 17:07 Ala1s

Having the same problem during the module import

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 165: character maps to <undefined>

tschlach avatar Jul 16 '18 16:07 tschlach

@tschlach And snippsat's solution did not work?

iwpnd avatar Jul 16 '18 16:07 iwpnd

@iwpnd It seems like snippsat's solution suggests that the UnicodeDecodeError results from reading a text file without specifying the encoding.

I think the error that most of us are encountering occurs when importing the library.

Here's my complete traceback of the error:

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import geotext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\__init__.py", line 7, in <module>
    from .geotext import GeoText
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\geotext.py", line 87, in <module>
    class GeoText(object):
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\geotext.py", line 103, in GeoText
    index = build_index()
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\geotext.py", line 77, in build_index
    cities = read_table(get_data_path('cities15000.txt'), usecols=[1, 8])
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\geotext.py", line 54, in read_table
    for line in lines:
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\site-packages\geotext\geotext.py", line 51, in <genexpr>
    lines = (line for line in f if not line.startswith(comment))
  File "C:\Users\tschlachter\AppData\Local\Continuum\Miniconda3\envs\language\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 165: character maps to <undefined>

tschlach avatar Jul 17 '18 16:07 tschlach

@tschlach Yes, and if you look closely at the traceback you can see that read_table() is executed during the import, and it is reading a text file.
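
To illustrate: the tracebacks show index = build_index() sitting directly in the body of class GeoText, and Python runs a class body as soon as the class statement executes, which for a module-level class means at import time. A minimal sketch of that behaviour follows. GeoTextSketch and this build_index are hypothetical stand-ins, not the library's code.

```python
calls = []  # records each time the stand-in loader runs


def build_index():
    """Hypothetical stand-in for a loader that would read data files."""
    calls.append("build_index")
    return {"London": "GB"}


class GeoTextSketch(object):
    # The class body executes when the class statement runs, so this call
    # happens at definition (import) time, before any instance exists.
    index = build_index()


# No GeoTextSketch() instance was created, yet the loader already ran once
print(calls)
```

So a file read inside such a loader can raise during import, which is why the error appears without calling anything.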

@elyase I just created a clean venv and installed geotext via pip install https://github.com/elyase/geotext/archive/master.zip

on a Windows machine with Python 3.6, and everything works as expected.

iwpnd avatar Jul 20 '18 10:07 iwpnd

@iwpnd

Ah - works perfectly with a fresh install, thanks for the help.

tschlach avatar Jul 20 '18 12:07 tschlach

I'm trying to build this package for conda-forge, but the build is failing on Windows for the same reason mentioned here.

CurtLH avatar Jul 21 '18 02:07 CurtLH

@CurtLH have you done anything mentioned in this issue to fix the problem?

iwpnd avatar Jul 23 '18 12:07 iwpnd

@iwpnd -- It seems that the issue has been fixed with this PR but a new version has not yet been uploaded to PyPI or tagged on GitHub.

If you're not familiar with the process at Conda-Forge, recipes should be built from tarballs, not repos. So for now, I've added the Linux and OSX versions of the packages, and as soon as a new release is created, I will add Windows to the Conda-Forge recipe.

CurtLH avatar Jul 23 '18 16:07 CurtLH

@CurtLH I see, thanks for the enlightenment :)

iwpnd avatar Jul 23 '18 17:07 iwpnd

pip install wasn't working for me so I had to do easy_install https://github.com/elyase/geotext/archive/master.zip (or you could do pip install git+https://github.com/elyase/geotext.git)

Everything is working now. Windows, Python 3.6

VanessaVanG avatar Jul 29 '18 18:07 VanessaVanG