evals
Problem of oaieval.py
After I updated my jsonl file, oaieval.py still ran the very first version of the dataset: the program executes only one item, although my dataset now has three. It looks like the first version was cached, because the log file created by record.py doesn't match the current dataset either; it contains the content of the first version instead. How can I fix this?
I'm currently having the same problem. I'm unsure how to clear the cache to get it to "see" my new jsonl file. I've tried changing the name of the test, to no avail.
EDIT: After running with --debug, I found it was loading a pkl file from my /tmp folder. After deleting it I got it to run all my samples.
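For anyone who wants to check for such files before deleting anything, here is a minimal sketch that lists pickle files under the temp directory. The function name `find_pickle_files` and the `*.pkl` pattern are assumptions for illustration only; evals may name or place its cache files differently, so treat the `--debug` output as the real source of truth.

```python
import tempfile
from pathlib import Path


def find_pickle_files(root=None):
    """Return .pkl files under root (default: the system temp dir), newest first.

    Hypothetical helper, not part of evals.
    """
    root_path = Path(root) if root is not None else Path(tempfile.gettempdir())
    found = []
    for path in root_path.rglob("*.pkl"):
        try:
            if path.is_file():
                found.append((path.stat().st_mtime, path))
        except OSError:
            # Skip entries that vanish or cannot be read while scanning.
            continue
    # Sort by modification time so the most recently written file comes first.
    found.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in found]


if __name__ == "__main__":
    for pkl in find_pickle_files():
        print(pkl)
```

Listing first and deleting by hand afterwards avoids removing pickle files that belong to other programs sharing the same temp folder.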
The behaviour is explained in custom-eval.md:
If you notice evals has cached your data and you need to clear that cache, you can do so with
rm -rf /tmp/filecache.
I think this will be an easy spot to miss for people reading through custom-eval. I've also noticed that a --no-cache flag exists in the list of oaieval options. I'm not sure whether it is intended to prevent .jsonl files from being cached (which would be a nice feature), but as far as I can tell the code associated with the arg isn't doing anything:
https://github.com/openai/evals/blob/19bfdba1fd9027f6818672e35615f45584de8b02/evals/cli/oaieval.py#L173-L175
Though of course @andrew-openai will know a lot more than I do!
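If you'd rather clear the cache from a Python script than from the shell, the `rm -rf /tmp/filecache` command quoted above can be approximated as below. This is only a sketch: `clear_cache` is a hypothetical helper, not an evals API, and the default path is simply the one mentioned in custom-eval.md.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir="/tmp/filecache"):
    """Delete cache_dir and its contents; return True if anything was removed.

    Hypothetical helper, not part of evals. The default path is the one
    quoted from custom-eval.md.
    """
    path = Path(cache_dir)
    if not path.exists():
        # Nothing cached, so there is nothing to do.
        return False
    if path.is_dir():
        shutil.rmtree(path)
    else:
        # Handle the case where the path is a single file rather than a folder.
        path.unlink()
    return True


if __name__ == "__main__":
    print("cache removed" if clear_cache() else "no cache found")
```

Running this before oaieval should force the dataset to be read again, assuming the stale data really lives in that directory.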
Thanks, it helps.