evals
oaieval hangs a lot
Describe the bug
oaieval frequently hangs near the end of a run, before it reports results.
To Reproduce
✗ EVALS_THREADS=12 EVALS_THREAD_TIMEOUT=10 oaieval gpt-3.5-turbo myeval
[2023-06-28 18:09:56,280] [registry.py:266] Loading registry from /Users/username/development/evals/evals/registry/evals
[2023-06-28 18:09:56,615] [registry.py:266] Loading registry from /Users/username/.evals/evals
[2023-06-28 18:09:56,617] [oaieval.py:138] Run started: runid
[2023-06-28 18:09:56,618] [data.py:83] Fetching myeval/samples.jsonl
[2023-06-28 18:09:56,619] [eval.py:33] Evaluating 69 samples
[2023-06-28 18:09:56,627] [eval.py:139] Running in threaded mode with 10 threads!
99%|█████████████████████████████████████████████████████████████████████████ | 68/69 [00:20<00:00, 7.93it/s]
This style of call hangs at this point for many minutes, even though EVALS_THREAD_TIMEOUT is set to 10 seconds. This is devastating to eval turnaround time.
Code snippets
No response
OS
macOS Ventura (13.4)
Python version
3.11.3
Library version
1.0.3
I have run into a similar issue. It looks like a completion can take a very long time to return: in my case the model only returned after generating all 4097 tokens. Raising EVALS_THREAD_TIMEOUT to 500 fixed it for me; the previous setting of 100 was too low.
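For anyone debugging this, here is a minimal sketch of how a per-call timeout on a thread pool generally behaves in Python. It is not the evals implementation; `slow_call` and `run_with_timeout` are hypothetical names, and `time.sleep` stands in for a slow model request. The point it illustrates is that a timeout on the waiting side only stops the wait; the worker thread itself keeps running until the underlying call returns. Whether that is what causes the hang in oaieval is an assumption, not something confirmed in this thread.

```python
import concurrent.futures
import time


def slow_call(delay: float) -> str:
    """Hypothetical stand-in for a model request that takes `delay` seconds."""
    time.sleep(delay)
    return f"done after {delay}s"


def run_with_timeout(delays, timeout):
    """Submit one task per delay and wait at most `timeout` seconds for each result."""
    results = []
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
    futures = [executor.submit(slow_call, d) for d in delays]
    for fut in futures:
        try:
            # The timeout bounds only this wait; it does not cancel the task.
            results.append(fut.result(timeout=timeout))
        except concurrent.futures.TimeoutError:
            # The worker thread is still busy inside slow_call at this point.
            results.append("timed out")
    # Do not block here on unfinished workers. The interpreter still joins
    # executor threads at exit, so a stuck call can delay process shutdown.
    executor.shutdown(wait=False)
    return results


if __name__ == "__main__":
    print(run_with_timeout([0.01, 0.5], timeout=0.2))
```

In this sketch the second task is reported as timed out after 0.2 seconds, but its thread keeps sleeping in the background until the full 0.5 seconds have passed. If a real request behaves the same way, a long-running completion could keep a run from finishing even after the configured timeout has elapsed.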