gpt-2-simple
can't encode character
[898 | 64206.12] loss=0.76 avg=0.84
[899 | 64275.59] loss=0.40 avg=0.84
[900 | 64345.04] loss=0.53 avg=0.83
======== SAMPLE 1 ========
Traceback (most recent call last):
  File "C:\tmp\btc_all\btc_LUT\generators\gen_infintext_gpt2.py", line 27, in <module>
  File "C:\Python37\lib\site-packages\gpt_2_simple\gpt_2.py", line 334, in finetune
    generate_samples()
  File "C:\Python37\lib\site-packages\gpt_2_simple\gpt_2.py", line 309, in generate_samples
    fp.write('\n'.join(all_text))
  File "C:\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0e4e' in position 1908: character maps to <undefined>
I could solve it by encoding the text explicitly, e.g. "test".encode("utf-8", "ignore").
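For anyone landing here, a minimal sketch of why the write fails and how an explicit encoding avoids it. The traceback suggests the samples file was written with the Windows default codec (cp1252), which has no mapping for '\u0e4e'. The helper names `can_encode`, `write_samples`, and `read_samples` are hypothetical and only for illustration; they are not part of gpt-2-simple.

```python
# The sample string contains the character reported in the traceback.
sample = "generated text \u0e4e"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` without errors."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_samples(path, all_text):
    """Hypothetical helper: write samples with an explicit UTF-8 encoding
    instead of relying on the platform default (cp1252 on many Windows setups)."""
    with open(path, "w", encoding="utf-8") as fp:
        fp.write("\n".join(all_text))


def read_samples(path):
    """Read the samples back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as fp:
        return fp.read()


# errors="ignore" silently drops characters the target codec cannot represent,
# so it avoids the exception at the cost of losing data.
lossy = sample.encode("cp1252", "ignore")
```

Note that `errors="ignore"` discards the offending characters, while writing with `encoding="utf-8"` keeps them, so the second option is usually preferable if the output is read back later.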
I fought with this myself. It comes down to Python's default encoding on Windows. As detailed here, you can fix it by setting PYTHONUTF8=1 under System Properties > Advanced > Environment Variables.
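A quick way to check whether the setting took effect is to inspect the interpreter from a fresh console, sketched below. It assumes Python 3.7 or later, where `sys.flags.utf8_mode` is available; the function name `describe_default_encoding` is made up for this example.

```python
import locale
import sys


def describe_default_encoding():
    """Report whether UTF-8 mode is active and which encoding open() uses
    by default when no `encoding` argument is passed."""
    return {
        # True when the interpreter runs in UTF-8 mode (e.g. PYTHONUTF8=1 or -X utf8)
        "utf8_mode": bool(sys.flags.utf8_mode),
        # The encoding used for text files opened without an explicit encoding
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    print(describe_default_encoding())
```

If `preferred_encoding` still reports cp1252 after setting the variable, the process was probably started before the change; environment variables are only picked up by newly launched consoles and programs.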
I think you need to put these two lines at the beginning of the file and save it as UTF-8:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-