evals issues

Results 428 evals issues

Add index list eval

Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines, failure to follow the guidelines below will result in the PR being closed...

wiskojo

Historical events in the right order

Tor1mo

Kaomoji Recognition Eval (53% accuracy)

erssss

[bugfix] fix includes eval

Bug fix to the basic.includes eval: If a ref in sample["ideal"] is a single character, `evals.elsuite.utils.get_answer` can return an empty string if the ref is found in the last character...

niklasnolte
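
The failure mode described in the `[bugfix] fix includes eval` entry can be illustrated with a small sketch. This is not the library's code: `text_after_last_match`, `includes_buggy`, and `includes_fixed` are hypothetical names, and the sketch assumes the helper returns whatever follows the last occurrence of the reference, so a match at the very end of the text yields an empty string.

```python
from typing import Optional


def text_after_last_match(text: str, ref: str) -> Optional[str]:
    """Hypothetical helper: return whatever follows the last occurrence of ref."""
    idx = text.rfind(ref)
    if idx == -1:
        return None  # ref is not present at all
    return text[idx + len(ref):]


def includes_buggy(text: str, ref: str) -> bool:
    # Truthiness check: an empty string is falsy, so a match that ends
    # exactly at the end of the text is treated as "not found".
    return bool(text_after_last_match(text, ref))


def includes_fixed(text: str, ref: str) -> bool:
    # Distinguish "not found" (None) from "found with nothing after it" ("").
    return text_after_last_match(text, ref) is not None
```

Under these assumptions, `includes_buggy("The answer is B", "B")` misses the match while `includes_fixed` reports it, which is the kind of behavior the summary attributes to single-character refs.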

Eval: Banking77

Ouassimf

Eval: Added Repeating Consonants Eval

Eval details 📑 Eval name: `repeat_consonants`. Eval description: Tests the model's ability to repeat consonants x number of times in a given text. What makes this...

MikelCalvo

Rhyming words in a different language (Hebrew)

ytsaig

Add Poker heads-up pre-flop eval

CoryLegend

ASCII Word Art to Text

DereWah

Add last-word-nth test

Eval details 📑 Eval name: last-word-nth. Eval description: Test the model's ability to tell what the last word of a sentence is, but by asking it indirectly...

PopFlamingo
