makemore
Added --input-file-encoding as a command line argument
I wanted to train the program to generate more Swedish names. They contain special characters like Å and Ö, so the input file needs to be read as UTF-8. On Windows (at least on my machine) this is a problem: the default encoding is cp1252, so those characters are decoded incorrectly. I added a command-line argument so that the encoding can be specified.
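A minimal sketch of the idea, assuming the script parses its options with argparse and reads the input file with a plain open() call. The helper names build_parser and read_words, the default input file name, and the default of None (which keeps the platform's default encoding) are illustrative assumptions, not necessarily what this change does.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the two options relevant here are shown.
    parser = argparse.ArgumentParser(description="makemore (sketch)")
    parser.add_argument("--input-file", "-i", type=str, default="names.txt",
                        help="input file with one name per line")
    parser.add_argument("--input-file-encoding", type=str, default=None,
                        help="encoding of the input file, e.g. utf-8 "
                             "(default: platform default)")
    return parser


def read_words(path, encoding=None):
    # encoding=None lets Python fall back to the locale's preferred
    # encoding, which is cp1252 on many Windows machines.
    with open(path, "r", encoding=encoding) as f:
        data = f.read()
    words = [w.strip() for w in data.splitlines()]
    return [w for w in words if w]  # drop empty lines


if __name__ == "__main__":
    args = build_parser().parse_args()
    words = read_words(args.input_file, args.input_file_encoding)
    chars = sorted(set("".join(words)))
    print(f"number of unique characters in the vocabulary: {len(chars)}")
```

Passing the value straight through to open() means any codec name Python knows can be used, and omitting the flag keeps the old behaviour.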
Wrong
python .\makemore.py -i .\swe_names.txt -o swe_names
number of unique characters in the vocabulary: 55
vocabulary:
-ABCDEFGHIJKLMNOPRSTUVWYabcdefghijklmnopqrstuvxy¥©¶Ã–…
Correct
python .\makemore.py -i .\swe_names.txt -o swe_names --input-file-encoding utf-8
number of unique characters in the vocabulary: 54
vocabulary:
-ABCDEFGHIJKLMNOPRSTUVWYabcdefghijklmnopqrstuvxyÅÖåéö
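The extra characters in the "Wrong" run can be reproduced in isolation. The short sketch below (not part of makemore itself) encodes the five non-ASCII characters from the correct vocabulary as UTF-8 and then decodes the bytes as cp1252, which is what happens when the file is read with the Windows default:

```python
# The five non-ASCII characters from the correct vocabulary.
special = "ÅÖåéö"

# Each one is two bytes in UTF-8; cp1252 decodes every byte as its own
# character, so each original character turns into a two-character pair.
misread = special.encode("utf-8").decode("cp1252")
print(misread)

# All pairs share the same first character, so only six distinct
# characters appear in the vocabulary.
print("".join(sorted(set(misread))))  # → ¥©¶Ã–…
```

Five real characters turn into six spurious ones, which accounts for the vocabulary size going from 54 to 55.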
Btw, I'm watching all of your videos on YT, and they are great!
I agree. The first thing I did when experimenting with makemore was adding that option to let it generate French words.