evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
oaieval.py still ran against the very first version of the dataset after I updated the JSONL file, and the program only executed one item even though I have three in my...
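For context on the issue above: eval datasets are JSONL files, one JSON sample per line, and a correct loader should return every sample rather than only the first. The sketch below is a hypothetical helper (`load_jsonl` is not evals' actual loader) showing the expected behavior for a three-sample file.

```python
import json
import tempfile

def load_jsonl(path):
    """Read a JSONL dataset: parse one JSON sample per non-empty line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# Demo: a three-sample dataset; all three samples should be loaded,
# not just the first one.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write('{"input": "2+2", "ideal": "4"}\n')
    f.write('{"input": "3+3", "ideal": "6"}\n')
    f.write('{"input": "4+4", "ideal": "8"}\n')
    path = f.name

samples = load_jsonl(path)
```

If only one item runs, the usual culprits are a stale cached copy of the dataset or a registry YAML still pointing at the old file.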
# Thank you for contributing an eval! ♥️ 🚨 Please make sure your PR follows these guidelines; failure to follow the guidelines below will result in the PR being closed...
Update PULL_REQUEST_TEMPLATE.md and add eval categories
Changes:
- Refactor the existing caching mechanism in `evals/utils.py` to use a more efficient and flexible data structure, such as an LRU cache, to store prompt and evaluation results.
- ...
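The PR description above does not include code, but the idea of an LRU cache for prompt and evaluation results can be sketched as follows. This is a minimal illustration, assuming a plain `OrderedDict`-backed class (the `LRUCache` name and API are hypothetical, not taken from `evals/utils.py`): the least-recently-used entry is evicted once capacity is exceeded, so frequently reused prompts stay cached.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least-recently-used entry when full."""

    def __init__(self, capacity: int = 128):
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

# Usage: cache prompt -> completion results with capacity 2.
cache = LRUCache(capacity=2)
cache.set("prompt-a", "result-a")
cache.set("prompt-b", "result-b")
cache.get("prompt-a")              # touch A, so B becomes least recent
cache.set("prompt-c", "result-c")  # exceeds capacity; evicts "prompt-b"
```

For a function-level cache the stdlib `functools.lru_cache` decorator offers the same policy without a custom class; a dedicated class like this is mainly useful when the cache must be shared, inspected, or sized at runtime.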