
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

428 evals issues, sorted by recently updated

I am trying to execute the Building an MMLU Eval Jupyter notebook. All of the cells execute correctly until I run the following command: `!oaieval gpt-3.5-turbo match_mmlu_anatomy`. I receive the...
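For reference, a minimal sketch of that same invocation outside the notebook, assuming the evals package is installed and an API key is exported (the actual error in the issue is truncated above, so this only illustrates the setup the command expects):

```python
import os
import subprocess

# The oaieval CLI needs an OpenAI API key; a missing key is a common setup issue.
assert os.environ.get("OPENAI_API_KEY"), "export OPENAI_API_KEY before running oaieval"

# Same command the notebook runs via the `!` shell escape:
# oaieval <completion_fn> <eval_name>
subprocess.run(["oaieval", "gpt-3.5-turbo", "match_mmlu_anatomy"], check=True)
```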

The set of pull requests seems to be growing pretty fast, so I'm not confident I should add another for such a small thing. The question is: "The day before yesterday, Chris was...

# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines; failure to follow the guidelines below will result in the PR being closed...

## Eval details 📑

### Eval name
binary_count

### Eval description
Makes the model count the 1s in a binary string 10-100 characters long.

### What makes this a useful eval?...
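For context, registry samples for a match-style eval like this are JSONL lines with an `input` chat prompt and an `ideal` answer. A hedged sketch of a generator for such samples follows; the prompt wording and file name are assumptions, not taken from the PR:

```python
import json
import random

def make_sample(rng: random.Random) -> dict:
    # Binary string of length 10-100, per the eval description (assumed inclusive).
    length = rng.randint(10, 100)
    bits = "".join(rng.choice("01") for _ in range(length))
    return {
        "input": [
            {"role": "system", "content": "Count the number of 1s in the binary string. Answer with only the count."},
            {"role": "user", "content": bits},
        ],
        "ideal": str(bits.count("1")),
    }

rng = random.Random(0)  # seeded for reproducible samples
with open("samples.jsonl", "w") as f:
    for _ in range(100):
        f.write(json.dumps(make_sample(rng)) + "\n")
```

Counting tasks like this are attractive as evals precisely because the ideal answer is mechanically derivable, so a simple exact-match check suffices.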

Evaluates GPT's ability to generate SVG code for shapes from text inputs.
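The PR text is truncated, but exact string matching is a poor fit for generated SVG; one hypothetical programmatic check (not from the PR) would parse the model's output and look for the requested shape element:

```python
import xml.etree.ElementTree as ET

def contains_shape(svg_text: str, shape_tag: str) -> bool:
    # Hypothetical grader: parse the model's SVG output and check whether
    # the requested shape element (e.g. "circle", "rect") appears anywhere.
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False  # malformed SVG fails the check outright
    ns = "{http://www.w3.org/2000/svg}"
    return any(el.tag in (shape_tag, ns + shape_tag) for el in root.iter())

# Example: grade a completion for the prompt "draw a red circle".
print(contains_shape('<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>', "circle"))  # True
```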
