Stefano Amorelli issues

Results: 9 issues of Stefano Amorelli

Specify destination folder

Hey, your package rocks! :100: Just wondering if you would accept a PR that implements a destination folder option for the generated .vue? I would use that feature since I...

enhancement

Commits are not colored when using terminal color scheme

I'm using WezTerm with the `rebecca` color scheme. If I don't have `set termguicolors` in my `init.vim`, all the commits and commit dates are white. Is there any way to support the...

[Docs] Add Dart client

Add the [Dart bindings client](https://github.com/stefanoamorelli/kafka_dart) to the `README.md`.

The REST API already has [eval-related endpoints](https://github.com/google/adk-go/blob/main/server/restapi/routers/eval.go) (`/apps/{app_name}/eval_sets`, `/apps/{app_name}/eval_results`) but they're all stubbed with `Unimplemented` handlers. There's currently no way to systematically evaluate agent performance. Without this, we're stuck manually...

feat(evaluation): add core evaluation framework

This PR introduces an evaluation framework for testing and measuring AI agent performance. It supports both algorithmic and LLM-as-Judge evaluation methods, with built-in support for response quality, tool usage, safety,...

Add nucleus sampling (`top-p`) method

Implement nucleus sampling (top-p sampling) as a new sampling method in the Gemma text generation toolkit. This addresses a gap in `gemma/gm/text/__init__.py:34` and provides the missing sampling strategy. ##...
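
For readers unfamiliar with the technique named above, here is a minimal, library-free sketch of nucleus (top-p) sampling in general. It is not the code from the PR and does not use the Gemma API; the function names `top_p_filter` and `sample_top_p` are hypothetical and chosen only for illustration.

```python
import random
from typing import Optional, Sequence


def top_p_filter(probs: Sequence[float], top_p: float) -> list[float]:
    """Keep the smallest set of most-likely tokens whose total
    probability reaches top_p, zero out the rest, and renormalize."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: list[int] = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


def sample_top_p(
    probs: Sequence[float], top_p: float, rng: Optional[random.Random] = None
) -> int:
    """Draw one token index from the renormalized nucleus."""
    rng = rng or random.Random()
    filtered = top_p_filter(probs, top_p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]


if __name__ == "__main__":
    toy_probs = [0.5, 0.3, 0.15, 0.05]  # made-up distribution over 4 tokens
    print(top_p_filter(toy_probs, top_p=0.7))
    print(sample_top_p(toy_probs, top_p=0.7, rng=random.Random(0)))
```

With the toy distribution and `top_p=0.7`, only the two most likely tokens stay in the nucleus, so the sampled index is always 0 or 1.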

[Frontend] Linting

I've noticed that with a simple `eslint` configuration, `frontend/` returns `✖ 248 problems (70 errors, 178 warnings)`. Especially as the codebase grows, I would suggest tackling this issue sooner...

awaiting-op-response

fix: display eval status per metric type

When viewing eval results, if `response_match_score` failed but `tool_trajectory_avg_score` passed, all messages in the invocation (including tool calls) incorrectly showed ❌. This can be confusing because the tool trajectory is...

[FEATURE] Multi-agent pattern: `Arena`

### Problem Statement AFAIK we don't have a pattern for: "I'm not sure which approach is best at performing a task beforehand, **so try several at the same time** and...

enhancement