agenta
The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.
**Describe the bug** Currently, there is no error message when serving the app with incorrect types for the configuration parameters. Although the app is served, it does not display the...
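One way to surface this class of bug is to validate parameter types explicitly before serving and report readable errors instead of failing silently. A minimal sketch, assuming a flat name-to-type schema (the parameter names here are illustrative, not Agenta's actual configuration):

```python
# Hypothetical parameter schema: name -> expected type.
CONFIG_SCHEMA = {"temperature": float, "max_tokens": int, "prompt_template": str}

def validate_config(raw):
    """Return a list of human-readable type errors instead of failing silently."""
    errors = []
    for name, expected in CONFIG_SCHEMA.items():
        if name not in raw:
            errors.append(f"missing parameter '{name}'")
        elif not isinstance(raw[name], expected):
            errors.append(
                f"parameter '{name}' expects {expected.__name__}, "
                f"got {type(raw[name]).__name__}"
            )
    return errors
```

Serving could then abort with the collected messages whenever the list is non-empty, so the user sees exactly which parameter has the wrong type.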
The current human evaluation and annotation view has several limitations that need to be addressed to improve user experience and functionality: 1. **Feedback Types:** Currently, only one feedback type (a...
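Supporting more than one feedback type suggests a small, extensible data model. A sketch under assumed names (the kinds and fields below are illustrative, not Agenta's actual schema):

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical feedback kinds; adding a new kind is one enum member.
class FeedbackKind(Enum):
    THUMBS = "thumbs"    # binary up/down
    RATING = "rating"    # numeric score
    COMMENT = "comment"  # free-text annotation

@dataclass
class Feedback:
    kind: FeedbackKind
    value: object        # bool for THUMBS, int for RATING, str for COMMENT
    scenario_id: str     # which evaluation scenario this annotates
```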
When errors happen in the evaluation view, there are a couple of problems: - The errors in the LLM app are shown in black and not in red as expected...
**Is your feature request related to a problem? Please describe.** We want to be able to cancel a currently running evaluation job by specifying the evaluation ID and job ID....
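A cooperative-cancellation sketch for this, keyed by `(evaluation_id, job_id)`. The registry and function names are assumptions for illustration, not Agenta's actual backend API; the worker would periodically check its flag and stop when it is set:

```python
import threading

# In-memory registry of cancellation flags, keyed by (evaluation_id, job_id).
_cancel_flags: dict = {}

def register_job(evaluation_id: str, job_id: str) -> threading.Event:
    """Called when a job starts; the worker polls the returned flag."""
    flag = threading.Event()
    _cancel_flags[(evaluation_id, job_id)] = flag
    return flag

def cancel_job(evaluation_id: str, job_id: str) -> bool:
    """Signal a running job to stop; returns False if the job is unknown."""
    flag = _cancel_flags.get((evaluation_id, job_id))
    if flag is None:
        return False
    flag.set()
    return True
```

An HTTP endpoint would translate a cancel request into a `cancel_job(evaluation_id, job_id)` call and return 404 when it yields `False`.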
#### Current Workflow Issue: At present, our workflow for evaluating datasets using the LLM (Large Language Model) application is sequential and less efficient than it could be. The process follows...
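The sequential loop can be parallelized with a worker pool. A minimal sketch, where `run_llm_app` stands in for the real per-row LLM call (an assumption here, replaced by a trivial transform so the shape is visible):

```python
from concurrent.futures import ThreadPoolExecutor

def run_llm_app(row):
    # Placeholder for the real LLM app invocation on one dataset row.
    return {"input": row, "output": row.upper()}

def evaluate_dataset(rows, max_workers=8):
    """Evaluate rows concurrently instead of one after another; order is preserved."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_llm_app, rows))
```

Threads fit here because each row is I/O-bound (waiting on an LLM API); `max_workers` caps concurrent requests so rate limits are respected.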
The CLI is a crucial component of Agenta. We plan to conduct unit and integration tests to guarantee the code quality of the CLI and its compatibility with any backend...
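One pattern for unit-testing a CLI without spawning processes is to expose the argument parser as a factory and assert on parsed results. A sketch assuming an argparse-style structure; the `variant serve` subcommand and `--app-name` flag are illustrative stand-ins, not a claim about the real CLI's options:

```python
import argparse

def build_parser():
    """Parser factory mirroring a hypothetical CLI layout, testable in-process."""
    parser = argparse.ArgumentParser(prog="agenta")
    sub = parser.add_subparsers(dest="command", required=True)
    variant = sub.add_parser("variant")
    variant.add_argument("action", choices=["serve", "list"])
    variant.add_argument("--app-name", default=None)
    return parser

def test_variant_serve():
    args = build_parser().parse_args(["variant", "serve", "--app-name", "demo"])
    assert args.command == "variant"
    assert args.action == "serve"
    assert args.app_name == "demo"
```

Integration tests against a live backend would sit on top of this, but parser-level tests already catch flag and subcommand regressions cheaply.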
We need to ensure that the CLI tests start running when a PR is raised. To accomplish this, implement an action workflow to run the CLI tests when a PR...
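A workflow along these lines would trigger the suite on pull requests. This is a sketch only: the directory layout, Python version, and test command are assumptions, not the repository's actual configuration:

```yaml
# .github/workflows/cli-tests.yml (illustrative paths and commands)
name: CLI tests
on:
  pull_request:
    paths:
      - "agenta-cli/**"
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e agenta-cli && pip install pytest
      - run: pytest agenta-cli/tests
```

The `paths` filter keeps unrelated PRs from paying for the CLI suite.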
RestrictedPython comes with many limitations that make many use cases infeasible. One solution is to use https://github.com/glotcode/docker-run/ which provides a quick way to create Docker containers to run...
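The container-based approach boils down to wrapping the untrusted code in a locked-down `docker run` invocation. A sketch of building such a command (image name and resource limits are illustrative defaults, not settings from the linked project):

```python
def build_docker_cmd(code: str, image: str = "python:3.11-slim") -> list:
    """Build a `docker run` argv that executes untrusted code in a throwaway container."""
    return [
        "docker", "run",
        "--rm",               # delete the container afterwards
        "--network", "none",  # no network access for untrusted code
        "--memory", "256m",   # cap memory
        "--cpus", "0.5",      # cap CPU
        image,
        "python", "-c", code,
    ]
```

The command would then be executed with something like `subprocess.run(cmd, capture_output=True, timeout=30)`, giving full-language Python inside the sandbox instead of RestrictedPython's restricted subset.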
[Edge case] After editing an evaluator config, new evaluations are not comparable to old evaluations
- We have an evaluator with setting x
- We run the evaluation
- We edit the evaluator setting to y

The results displayed are not very accurate if we display the evaluator...
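One way to handle this edge case is to fingerprint the evaluator settings at run time and store the hash with each evaluation, so results produced under different settings are never treated as comparable. A minimal sketch with illustrative field names:

```python
import hashlib
import json

def config_fingerprint(settings: dict) -> str:
    """Stable short hash of the evaluator settings at the moment of the run."""
    canonical = json.dumps(settings, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

def comparable(eval_a: dict, eval_b: dict) -> bool:
    """Two evaluations are comparable only if they ran under identical settings."""
    return eval_a["config_hash"] == eval_b["config_hash"]
```

Editing the evaluator from x to y then produces a new fingerprint, and the UI can warn (or refuse to aggregate) when evaluations with mismatched hashes are shown side by side.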