Mahmoud Mabrouk

Results 116 issues of Mahmoud Mabrouk

Currently only the outputs for spans are shown. From [SyncLinear.com](https://synclinear.com) | [AGE-154](https://linear.app/agenta/issue/AGE-154/[fea]-show-the-inputs-for-spans)

enhancement
Task

Creating an app from code: if we use the line `model_1 = ag.GroupedMultipleChoiceParam(default="gpt-3.5-turbo", choices=get_all_supported_llm_models())`, we get the warning when running the app from the shell (or in the Lambda/Docker logs). From [SyncLinear.com](https://synclinear.com) |...

bug
Low priority
2 points

To reproduce: - Create a test set with a `correct_answer` - Run an evaluation. The evaluation will fail and the result will not be seen. Expected behavior: The evaluation will...

bug
stale

I think we should disable this for cloud and keep it only for OSS. @aakrem Tracking issue for: - [x] https://github.com/Agenta-AI/agenta/security/code-scanning/26

stale

In the playground, each time we add a datapoint to a test set, the default selected option in the multi-select below is "+Add new". For users who are...

enhancement
good first issue
low complexity
Task
Low priority
Subtask
2 points

In entity recognition tasks, users often need to evaluate multiple outputs. For example, if the task is to extract the author and date from a PDF, the user might create...

enhancement
dev experience

**Is your feature request related to a problem? Please describe.** Using LLMs without streaming is slow. **Describe the solution you'd like** Add a feature to stream outputs to the SDK...

dev experience
SDK
linear

We would like to enable the users to run the evaluation workflow from the CLI (without going through the UI). The UI in this case would be mostly used to...

Roadmap
December 2023

We would like the table in the evaluation comparison view to show, at the header of each output, the data about the configuration that is used for that evaluation. Tasks: - [...

Frontend
incomplete description
evaluation
design needed
linear

- [ ] #1045 (requires #1044) - [x] #1066 - [ ] #941 - [x] #940 - [ ] Create template with images - [ ] Create link to code...

Roadmap