Usama

82 comments by Usama

GPT-3.5 doesn't perform well as a grader for this eval, so GPT-4 should be used as the grader model.

You should see GPT-4 API access enabled in your account in the next few days.

> Thank you for your feedback @usama-openai. I have reverted the changes in the evals/cli/oaievalset.py file as requested. I would appreciate if you could review this and let me know...

> Those are all unique plays - without any permutations, which will make this a really long "`Includes`" list. You don't need to add all the moves to the list....

Thanks for implementing the requested changes. This PR is almost good. If you have no issues, can you place the generation script in the `evals/registry/data/backgammon/` directory? That'll make it easy...

Thanks for implementing the requested changes. I'm getting the following error while evaluating this PR.
```
b'File "/content/evals/evals/eval.py", line 149, in get_samples'
b'return get_jsonl(self.samples_jsonl)'
b'File "/content/evals/evals/data.py", line 114, in get_jsonl'...
```

I'm getting the following error now while evaluating this PR:
```
b'File "/usr/lib/python3.10/json/__init__.py", line 346, in loads'
b'return _default_decoder.decode(s)'
b'File "/usr/lib/python3.10/json/decoder.py", line 337, in decode'
b'obj, end = self.raw_decode(s, idx=_w(s,...
```
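
The traceback above passes through `json.loads`, which suggests that some line in the samples file may not be valid JSON. Below is a minimal sketch for locating such lines before running the eval, assuming the file holds one JSON object per line. The helper name `find_bad_jsonl_lines` is hypothetical and not part of the evals codebase, and skipping blank lines is an assumption; the real loader may be stricter.

```python
import json
from pathlib import Path


def find_bad_jsonl_lines(path):
    """Return (line_number, error_message) for each line that fails to parse as JSON.

    Hypothetical helper for illustration; it is not part of the evals repo.
    """
    problems = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                # Assumption: blank lines are ignored here; the real loader may reject them.
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                # exc.msg is the parser's short description of what went wrong.
                problems.append((line_number, exc.msg))
    return problems
```

Calling `find_bad_jsonl_lines("samples.jsonl")` (a placeholder path) returns an empty list when every non-blank line parses, so it could be run on the data file before opening or updating a PR.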