
avoid timeout errors with high concurrency in api_model

Open · dtrawins opened this pull request 1 year ago · 2 comments

A longer timeout will avoid errors when running tests with high concurrency.
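
For context, a minimal sketch of the kind of change involved, assuming the API model dispatches its concurrent requests through an aiohttp.ClientSession; the function name, argument handling, and the 14400-second value are illustrative, not the actual diff:

import asyncio
import aiohttp

# Illustrative sketch: create the session with an explicit, generous timeout
# instead of aiohttp's default ClientTimeout(total=300), so slow responses
# under high concurrency do not raise asyncio.TimeoutError.
async def post_all(url, payloads, total_timeout_s=14400):
    timeout = aiohttp.ClientTimeout(total=total_timeout_s)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async def post_one(payload):
            async with session.post(url, json=payload) as resp:
                resp.raise_for_status()
                return await resp.json()
        return await asyncio.gather(*(post_one(p) for p in payloads))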

dtrawins · Sep 16 '24

CLA assistant check
All committers have signed the CLA.

CLAassistant · Sep 16 '24

Thanks very much for the PR. I left a comment. Also cc: @artemorloff, as he had also made this change in #2249.

baberabb · Sep 17 '24

Sorry @dtrawins, I forgot to submit the review message.

baberabb · Sep 26 '24

Are there any blockers here? I also had to apply this fix, and it seems to make sense to merge something like this, no?

jmkuebler · Oct 17 '24

When running long evals on large models such as Llama 70B, it fails even with low concurrency (e.g. 4). I'm using 8xA100-80GB and it still times out.

I had to fix it this way:

echo "import aiohttp.client
aiohttp.client.DEFAULT_TIMEOUT = aiohttp.ClientTimeout(total=14400)" >> ~/lm-evaluation-harness/lm_eval/__init__.py
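
For reference, aiohttp's built-in default is ClientTimeout(total=300), i.e. five minutes, so the two appended lines monkey-patch that module-level default to four hours (14400 seconds) for any session created without an explicit timeout.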

aldopareja · Dec 03 '24