Add more benchmarks
Add benchmarks for DeepSeek V3.1 and Qwen3-Coder
Can I take this task?
I have not seen gpt5 on the leaderboard either?
@jakob1379 👉 https://github.com/Aider-AI/aider/pull/4475
GPT-5 is there now, but not gpt-5-mini
Would love to see the gpt oss models as well
Other new DeepSeek models are missing now too.
- DeepSeek-3.1-Terminus
- DeepSeek-3.2

As well as the latest Claude:
- Claude Sonnet 4.5
I know that model vendors report Aider Polyglot benchmark results themselves, but since those benchmarks are missing from the official benchmarks, it's not clear whether users can trust them.
GLM-4.6 and Claude Sonnet 4.5 benchmarks?
deepseek benchies were merged