evals
Windows path and unicode decoding
Hi, I am trying to contribute and get access to GPT-4 by creating my own evals, but I thought I should be able to run evals before starting. So I was trying to figure out how to run an eval by following one of your examples, "lafand-mt.ipynb", when I ran into two problems that resulted in errors for me.
- I am using Windows, and the first problem is caused by my OS using "\" instead of "/" as the directory delimiter. I believe there should be an OS-independent way to handle both. In code block 3, line 13, the code `langs = input_path.split('/')[-1]` does not split a Windows path, so `langs` ends up holding the whole path. The following line, `input_lang, output_lang = langs.split('-')`, then also finds the '-' in the path "...\lafand-mt" and returns three elements, for instance ["...\data\lafand", "mt\en", "amh"], instead of the expected two, so the unpacking fails. I was able to bodge it by changing '/' to '\', but this should not be the community-standard solution. Furthermore, I would not want Windows users who do not know about this to get lost while following your example.
- When running the 6th code block, I got a `UnicodeDecodeError`. I do not know if this happens to other users, but I suggest adding `encoding='utf-8'` as another parameter to `.open()` in line 6 on the main branch, as it seems to get rid of the error.

Keep up the good work!
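For anyone hitting the same two errors, here is a minimal sketch of how both points could be handled. It is not the notebook's actual code: the helper names `parse_langs` and `read_lines` and the example paths are hypothetical, and it assumes the last path component has the form `<input_lang>-<output_lang>`. It uses `pathlib.PureWindowsPath` only because that class accepts both "\" and "/" as separators on any OS.

```python
import os
import tempfile
from pathlib import PureWindowsPath


def parse_langs(input_path):
    """Return (input_lang, output_lang) from a path whose last part is e.g. 'en-amh'."""
    # PureWindowsPath treats both '\' and '/' as separators, regardless of the host OS,
    # so only the final component is split on '-'.
    langs = PureWindowsPath(input_path).name
    input_lang, output_lang = langs.split("-")
    return input_lang, output_lang


def read_lines(path):
    """Read a text file as UTF-8 instead of relying on the platform default encoding."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


# Hypothetical paths, one per separator style.
windows_style = parse_langs(r"C:\data\lafand-mt\en-amh")
posix_style = parse_langs("data/lafand-mt/en-amh")

# Round-trip a non-ASCII line through a temporary file to exercise the UTF-8 read.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("ሰላም\nhello\n")
    lines = read_lines(sample)
```

On Windows the default text encoding is often a legacy code page rather than UTF-8, which is the likely reason passing `encoding="utf-8"` explicitly makes the `UnicodeDecodeError` go away.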
This is partly related to #209.
@ulasdilek I'm trying to make a PR for this.
The PR will address the separator issue by using `os.path.sep` instead.
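A rough sketch of what the `os.path.sep` approach could look like, assuming the path is written with the native separator of the machine running the notebook. The example path is hypothetical and is built with `os.path.join` so the snippet behaves the same on Windows and POSIX systems; the PR's actual change may differ.

```python
import os

# Hypothetical input path using the host OS's native separator.
input_path = os.path.join("data", "lafand-mt", "en-amh")

# Split on os.path.sep instead of a hard-coded '/'.
langs = input_path.split(os.path.sep)[-1]
input_lang, output_lang = langs.split("-")

# os.path.basename returns the same last component without manual splitting.
same_as_basename = os.path.basename(input_path) == langs
```

One caveat: a path typed with "/" on Windows would still not be split by `os.path.sep`, so normalizing it first with `os.path.normpath` or using `os.path.basename` may be more robust.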