Invalid unicode string handling

Open tartley opened this issue 10 years ago • 2 comments

Migrated from https://code.google.com/p/colorama/issues/detail?id=21 Reported by av1024, Feb 23, 2011

What steps will reproduce the problem?

  1. import and initialize colorama with defaults
  2. print u"Some non-ASCII text ТЕСТ Русского"

What is the expected output? Some non-ASCII text ТЕСТ Русского

What do you see instead? UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

What version of the product are you using? On what operating system? Python 2.7 (x32) on Windows 7 x64 Ultimate (with Eng/Rus locales)

Please provide any additional information below. It looks like the wrapped write method does not inherit or use the original stdout encoding. There are two possible fixes:

A. Use sys.setdefaultencoding(Your-Console-OEM-Encoding). This is the wrong way, IMHO. I don't know of a simple method to determine the right console mode (ANSI/OEM) and OEM encoding other than reading the 'stdout.encoding' property.

B. Patch ansitowin32 to force-encode unicode output before .write:

--- D:\lg\py\colorama-0.1.18\colorama\ansitowin32.py    Tue May 18 14:43:54 2010
+++ ansitowin32.py  Wed Feb 23 19:10:40 2011
@@ -144,7 +144,10 @@

 def write_plain_text(self, text, start, end):
     if start < end:
-        self.wrapped.write(text[start:end])
+        if isinstance(text, unicode):
+            self.wrapped.write(text[start:end].encode(self.wrapped.encoding))
+        else:
+            self.wrapped.write(text[start:end])
         self.wrapped.flush()
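
One caveat with fix B: `self.wrapped.encoding` can be `None` (for example when output is piped on Python 2), and `.encode(None)` raises a `TypeError`. Below is a minimal sketch of the same idea with a fallback. It is not colorama's actual code: the helper name `encode_for_stream`, the `FakeConsole` stand-in, the `cp866` code page, and the fallback to `locale.getpreferredencoding()` are all assumptions made for illustration.

```python
import io
import locale


def encode_for_stream(text, stream, fallback=None):
    """Return `text` as bytes for a byte-oriented stream.

    Hypothetical helper, not part of colorama.
    """
    if isinstance(text, bytes):
        # Already encoded (a plain `str` on Python 2): pass through untouched.
        return text
    # Prefer the stream's own encoding; it may be missing or None.
    encoding = (
        getattr(stream, "encoding", None)
        or fallback
        or locale.getpreferredencoding()
    )
    # "replace" substitutes "?" for characters the encoding cannot represent,
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, "replace")


class FakeConsole(io.BytesIO):
    """Stand-in for a byte-oriented stdout with a known encoding."""
    encoding = "cp866"  # assumed OEM code page for this demo


console = FakeConsole()
console.write(encode_for_stream(u"ТЕСТ", console))
```

Whether replacing unencodable characters with "?" is acceptable depends on the application; passing "strict" instead would keep the original exception.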

tartley, Feb 16 '15 11:02

A possibly-easier-to-reproduce test case:

A better string is a single i with acute accent like 'í'

>>> import colorama
>>> colorama.init()
>>> s = u'í'
>>> print repr(s)
u'\xed'
>>> print s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 35, in write
    self.__convertor.write(text)
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 116, in write
    self.write_and_convert(text)
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 143, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 148, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 0: ordinal not in range(128)
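
The message above is the one Python produces whenever a non-ASCII character is encoded with the 'ascii' codec, so the failing step can be mirrored without colorama or a Windows console. This is only a sketch of that step, not a trace of what colorama does internally; `cp1252` is an assumed example of an encoding that does contain the character.

```python
text = u"\xed"  # 'í', the character from the traceback

try:
    # Encoding with 'ascii' fails for any code point above 127.
    text.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding explicitly with a code page that contains the character succeeds.
encoded = text.encode("cp1252")
```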

We found a similar colorama traceback in Nikola, the static blog generator ( https://github.com/getnikola/nikola/issues/1288 )

tartley, Feb 16 '15 11:02

How to solve this issue?

rshopnil, Oct 17 '19 20:10