practical-pytorch

Unicode Problem

Open herleeyandi opened this issue 7 years ago • 4 comments

Hi, I am really new to PyTorch. The first time I ran this step, I got this error. Does anybody have a suggestion for how I can fix it? Thank you. [image]

herleeyandi avatar Jun 29 '17 19:06 herleeyandi

Got it. I changed the function to this:

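import unicodedata

def unicode_to_ascii(s):
    # Python 2: decode the byte string to unicode first.
    # Python 3: `unicode` does not exist, so the NameError is ignored
    # and s (already a str) is used as-is.
    try:
        s = unicode(s, 'utf-8')
    except NameError:
        pass
    # Decompose accented characters (NFD), drop the combining marks ('Mn'),
    # and keep only characters in all_letters (defined earlier in the tutorial).
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_letters
    )

print(unicode_to_ascii('Ślusàrski'))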

herleeyandi avatar Jun 29 '17 19:06 herleeyandi
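
For anyone wondering why the function above filters on the 'Mn' category, here is a minimal sketch of what NFD normalization does to the sample name. It only uses the standard unicodedata module; the variable names are made up for illustration and it leaves out the all_letters filter from the tutorial.

```python
import unicodedata

# 'Ślusàrski' written with escapes so the source file encoding does not matter
name = u'\u015alus\u00e0rski'

# NFD splits each accented letter into a base letter plus a combining mark
decomposed = unicodedata.normalize('NFD', name)

# Combining marks have the Unicode category 'Mn' (Mark, nonspacing)
marks = [c for c in decomposed if unicodedata.category(c) == 'Mn']

# Dropping the marks leaves only the base letters
stripped = ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')

print(len(name), len(decomposed), len(marks))
print(stripped)
```

The decomposed string is two code points longer than the original, one extra combining mark for each of the two accented letters, and removing those marks gives plain ASCII letters.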

But now I have an issue where every element of the array begins with u''. Does anybody know how to resolve this? Thank you. [image]

herleeyandi avatar Jun 29 '17 20:06 herleeyandi

The u'' prefix is just how Python 2 displays Unicode strings. It is not part of the string's contents.

spro avatar Jul 01 '17 16:07 spro
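
To illustrate the point about the u'' prefix, here is a small sketch with a hypothetical list of names (not the tutorial's data). Printing a whole list shows the repr() of each element, which under Python 2 includes the u prefix; printing the elements one by one shows only the text. Under Python 3 the prefix does not appear in either case.

```python
# Hypothetical sample data, standing in for the names read by the tutorial
names = [u'Slusarski', u'Abbas']

# Printing the list shows repr() of each element
# (Python 2 displays these with a leading u, Python 3 does not)
print(names)

# Printing each element shows only the characters themselves
for name in names:
    print(name)

# The prefix is display only: it does not change the length or the content
first_length = len(names[0])
same_as_plain = (names[0] == 'Slusarski')
print(first_length, same_as_plain)
```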

You could do it by importing the open function from the io module and passing encoding='utf-8':

from io import open

def read_langs(lang1, lang2, reverse=False):
    print("Reading lines...")

    # Read the file and split into lines
    filename = '../data/%s-%s.txt' % (lang1, lang2)
    # filename = '../%s-%s.txt' % (lang1, lang2)
    lines = open(filename, encoding='utf-8').read().strip().split('\n')
    # ......

congruili avatar Dec 09 '17 20:12 congruili
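
A self-contained way to try the io.open suggestion above, as a rough sketch: it writes a small UTF-8 file to a temporary directory and reads it back with an explicit encoding. The file name and sample lines are made up for illustration and are not the tutorial's data set.

```python
import os
import tempfile
from io import open  # on Python 3 this is the same as the built-in open

# Hypothetical sample content: tab-separated pairs, one per line
sample = u"Go.\tVa !\nWow!\t\u00c7a alors !\n"

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "eng-fra.txt")

# Write the file as UTF-8 so it contains non-ASCII bytes
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# Read it back with an explicit encoding, then split into lines
with open(path, encoding="utf-8") as f:
    lines = f.read().strip().split("\n")

print(lines)

# Clean up the temporary file and directory
os.remove(path)
os.rmdir(tmp_dir)
```

With io.open the lines come back as Unicode text on both Python 2 and Python 3, so no separate decode step is needed after reading.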