
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

428 evals issues, sorted by recently updated

### Discussed in https://github.com/openai/evals/discussions/621 Originally posted by **55255ru** April 10, 2023 Hello. I suggest using what is written on my website (which is here http://www.55255.ru) to improve GPT-4 because I...

It could be interesting to explore whether we could use [MusPy](https://salu133445.github.io/muspy/) to add some text/symbolic music evals. /cc @salu133445
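A minimal sketch of what data preparation for such an eval could look like, assuming MusPy's `read_midi` and `to_note_representation` helpers; the file paths, prompt/ideal framing, and sample split are placeholders, not a proposed design:

```python
import json
import muspy  # pip install muspy

# Illustrative only: turn a MIDI file into a text representation of note events
# that could serve as an eval sample. Paths and prompt wording are hypothetical.
music = muspy.read_midi("example.mid")        # parse MIDI into a muspy.Music object
notes = muspy.to_note_representation(music)   # per-note event array (onset time, pitch, duration, ...)

sample = {
    "input": [{
        "role": "user",
        "content": "Continue this note-event sequence: " + json.dumps(notes[:16].tolist()),
    }],
    "ideal": json.dumps(notes[16:32].tolist()),
}

with open("samples.jsonl", "w") as f:
    f.write(json.dumps(sample) + "\n")
```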

Idea for Eval

Hi, it would be cool to evaluate all OpenAI models on the Beyond the Imitation Game Benchmark (BIG-bench), which is a collaborative benchmark intended to probe large language models and extrapolate their...
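If a BIG-bench task were ported over, registering it would presumably follow the same pattern as the existing registry YAML files; the eval name, samples path, and use of the basic `Match` class below are assumptions for illustration only:

```yaml
# Hypothetical registry entry (e.g. evals/registry/evals/bigbench-example.yaml);
# the task name and samples path are placeholders.
bigbench-example:
  id: bigbench-example.dev.v0
  description: Example BIG-bench task ported to the evals samples format.
  metrics: [accuracy]
bigbench-example.dev.v0:
  class: evals.elsuite.basic.match:Match
  args:
    samples_jsonl: bigbench_example/samples.jsonl
```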

Idea for Eval

Hello everyone, thank you for your contributions so far. I've been working through them, and these tasks are forming a challenging and comprehensive benchmark for modern LLMs and LLM programs. We...

Idea for Eval

Noticed a few actions used in the workflows here are outdated; proposing a Dependabot configuration to keep them updated - reference https://docs.github.com/en/actions/security-guides/using-githubs-security-features-to-secure-your-use-of-github-actions#keeping-the-actions-in-your-workflows-secure-and-up-to-date Current workflow executions have a deprecation notice, e.g. https://github.com/openai/evals/actions/runs/8903656117 >...
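A sketch of such a configuration, assuming the standard `.github/dependabot.yml` location; the weekly interval is an arbitrary choice:

```yaml
# .github/dependabot.yml - keep GitHub Actions versions in workflows up to date
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```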

### Describe the feature or improvement you're requesting Please add support for GPT-4o for evaluation. ### Additional context _No response_
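Once the model name is recognized, running an existing eval against it should just be a matter of passing it to `oaieval`; `test-match` below is only an example eval, and whether `gpt-4o` is accepted depends on the installed evals version:

```bash
# Illustrative: run the basic test-match eval against GPT-4o
oaieval gpt-4o test-match
```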

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines; **failure to follow the guidelines below will result in the PR being closed...