
avoid timeout errors with high concurrency in api_model

Open · dtrawins opened this pull request 1 year ago · 2 comments

A longer timeout will avoid errors when running tests with high concurrency.
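
For context, a minimal sketch of the kind of change involved, assuming the API model dispatches its concurrent requests through an aiohttp.ClientSession; the function name, argument handling, and the 14400-second value are illustrative, not the actual diff:

import asyncio
import aiohttp

# Illustrative sketch: create the session with an explicit, generous timeout
# instead of aiohttp's default ClientTimeout(total=300), so slow responses
# under high concurrency do not raise asyncio.TimeoutError.
async def post_all(url, payloads, total_timeout_s=14400):
    timeout = aiohttp.ClientTimeout(total=total_timeout_s)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async def post_one(payload):
            async with session.post(url, json=payload) as resp:
                resp.raise_for_status()
                return await resp.json()
        return await asyncio.gather(*(post_one(p) for p in payloads))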

dtrawins · Sep 16 '24

CLA assistant check
All committers have signed the CLA.

CLAassistant · Sep 16 '24

Thanks very much for the PR. I left a comment. Also cc: @artemorloff, as he had also made this change in #2249.

baberabb · Sep 17 '24

Sorry @dtrawins, I forgot to submit the review message.

baberabb · Sep 26 '24

Are there any blockers here? I also had to apply this fix, and it seems to make sense to merge something like this, no?

jmkuebler · Oct 17 '24

When running long evals on large models such as Llama 70B, it fails even with low concurrency (e.g. 4). I'm using 8xA100-80GB and it still times out.

I had to fix it this way:

echo "import aiohttp.client
aiohttp.client.DEFAULT_TIMEOUT = aiohttp.ClientTimeout(total=14400)" >> ~/lm-evaluation-harness/lm_eval/__init__.py
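
For reference, aiohttp's built-in default is ClientTimeout(total=300), i.e. five minutes, so the two appended lines monkey-patch that module-level default to four hours (14400 seconds) for any session created without an explicit timeout.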

aldopareja · Dec 03 '24