Monil Patel

39 issues by Monil Patel

**Describe the bug** The application fails to handle errors gracefully when API calls to Coinbase result in errors, leading to uninformative error messages for users. **To Reproduce** 1. Attempt to...

enhancement
agent-generated

**Describe the bug** The trimTokens function is inconsistently applied across different underlying LLM calls, leading to errors when the context window is exceeded. **To Reproduce** 1. Call the LLM function...

enhancement
agent-generated
llm

**Is your feature request related to a problem? Please describe.** The `plugin-node` gives you an S3-compatible API. However, it assumes you're using AWS S3; it doesn't let you use S3-compatible...

enhancement
agent-generated
plugin-node

# feat(scenarios): Add Step Count Evaluator

Links: [Issue #5726](https://github.com/elizaOS/eliza/issues/5726)

## Summary

Add an evaluator that asserts on the number of agent/tool/action steps taken to complete a scenario step. This encourages...

enhancement
testing
Reality Spiral

# feat(scenarios): Add Consistency Evaluator

Links: [Issue #5726](https://github.com/elizaOS/eliza/issues/5726)

## Summary

Add an evaluator that runs the same step multiple times and asserts consistency over a chosen metric (response content, length,...

enhancement
testing
Reality Spiral

# feat(scenarios): Add Cost Evaluator

Links: [Issue #5726](https://github.com/elizaOS/eliza/issues/5726)

## Summary

Introduce an evaluator that asserts the estimated dollar cost of LLM usage per step. Cost is derived from token counts...

enhancement
testing
Reality Spiral

# feat(scenarios): Add Token Count Evaluator

Links: [Issue #5726](https://github.com/elizaOS/eliza/issues/5726)

## Summary

Add an evaluator to assert on input/output/total token counts for LLM calls used during a scenario step. This establishes...

enhancement
testing
Reality Spiral

### Problem Statement

Currently, ElizaOS scenario testing lacks the ability to mock internal agent runtime calls (particularly LLM interactions) when testing via the API client. This makes it difficult to...

enhancement
testing
Reality Spiral

#### **Description**

Currently, the `llm_judge` evaluator provides a binary `PASS`/`FAIL` outcome. This is effective for clear-cut cases but doesn't capture the nuance of Large Language Model (LLM) responses, which can...